| 1 | +# WeNet running on KUNLUNXIN XPU device |
| 2 | +## Introduction |
| 3 | +The example below shows how to deploy WeNet offline and online ASR models on XPUs. |
| 4 | +XPU is a core architecture developed entirely in-house by KUNLUNXIN for general-purpose artificial intelligence computing. |
| 5 | + |
| 6 | +## Set up the environment for XPU devices |
| 7 | + |
| 8 | +Before you start, make sure the following components are installed: |
| 9 | + |
| 10 | + XRE (XPU Runtime Environment): the basic operating environment of the XPUs, |
| 11 | + which includes functional modules such as chip drivers, the runtime API library, and firmware tools. |
| 12 | + |
| 13 | + XDNN (XPU Deep Neural Network Library): the XPU library for accelerating deep neural networks, providing a high-performance DNN function library for use in applications. |
| 14 | + |
| 15 | +If you would like to know more about XPUs or need any help, please contact us through the official website: |
| 16 | + |
| 17 | +https://www.kunlunxin.com.cn/ |
| 18 | + |
| 19 | +## Instructions |
| 20 | +- Step 1. Build. The build requires CMake 3.14 or above. |
| 21 | + |
| 22 | +``` sh |
| 23 | +export CXX=${your_gxx_path} |
| 24 | +export CC=${your_gcc_path} |
| 25 | +export XPU_API_PATH=${your_api_path} |
| 26 | + |
| 27 | +# -r : release version; -d : debug version |
| 28 | +bash ./compile.sh -r |
| 29 | +``` |
| 30 | + |
| 31 | +- Step 2. Test. The result is printed to the console. |
| 32 | + |
| 33 | +``` sh |
| 34 | +## set KUNLUN XPU visible device |
| 35 | +export XPU_VISIBLE_DEVICES=0 |
| 36 | +export XPUSIM_DEVICE_MODEL=KUNLUN2 |
| 37 | +## set logging level |
| 38 | +export GLOG_logtostderr=1 |
| 39 | +export GLOG_v=3 |
| 40 | +## set speech wav and model/weight/units path |
| 41 | +wav_path=${your_test_wav_path} |
| 42 | +xpu_model_dir=${your_xpu_weight_dir} |
| 43 | +units=${your_units_txt_path} |
| 44 | +## run the decoder |
| 45 | +./build/bin/decoder_main \ |
| 46 | + --chunk_size -1 \ |
| 47 | + --wav_path "$wav_path" \ |
| 48 | + --xpu_model_dir "$xpu_model_dir" \ |
| 49 | + --unit_path "$units" \ |
| 50 | + --device_id 0 \ |
| 51 | + --nbest 3 2>&1 | tee log.txt |
| 52 | +``` |
| 53 | + |
| 54 | +A typical output looks like this: |
| 55 | + |
| 56 | +``` sh |
| 57 | +XPURT /docker_workspace/icode-api/baidu/xpu/api/../runtime/output/so/libxpurt.so loaded |
| 58 | +I1027 06:06:21.933722 111767 params.h:152] Reading XPU WeNet model weight from /docker_workspace/icode-api/baidu/xpu/api/example/wenet-conformer/all_data/ |
| 59 | +I1027 06:06:21.934103 111767 xpu_asr_model.cc:46] XPU weight_dir is: /docker_workspace/icode-api/baidu/xpu/api/example/wenet-conformer/all_data//model_weights/ |
| 60 | +I1027 06:06:23.832731 111767 xpu_asr_model.cc:65] ======= XPU Kunlun Model Info: ======= |
| 61 | +I1027 06:06:23.832749 111767 xpu_asr_model.cc:66] subsampling_rate 4 |
| 62 | +I1027 06:06:23.832777 111767 xpu_asr_model.cc:67] right_context 6 |
| 63 | +I1027 06:06:23.832789 111767 xpu_asr_model.cc:68] sos 5538 |
| 64 | +I1027 06:06:23.832795 111767 xpu_asr_model.cc:69] eos 5538 |
| 65 | +I1027 06:06:23.832799 111767 xpu_asr_model.cc:70] is bidirectional decoder 1 |
| 66 | +I1027 06:06:23.832804 111767 params.h:165] Reading unit table /docker_workspace/icode-api/baidu/xpu/api/example/wenet-conformer/all_data/dict |
| 67 | +I1027 06:06:23.843475 111776 decoder_main.cc:54] num frames 418 |
| 68 | +I1027 06:06:23.843521 111776 asr_decoder.cc:104] Required 2147483647 get 418 |
| 69 | +I1027 06:06:23.843528 111776 xpu_asr_model.cc:116] Now Use XPU:0! |
| 70 | +I1027 06:06:23.843616 111776 xpu_asr_model.cc:173] max_seqlen is 418 |
| 71 | +I1027 06:06:23.843619 111776 xpu_asr_model.cc:174] q_seqlen is 103 |
| 72 | +I1027 06:06:23.843623 111776 xpu_asr_model.cc:175] att_dim is 512 |
| 73 | +I1027 06:06:23.843626 111776 xpu_asr_model.cc:176] ctc_dim is 5538 |
| 74 | +I1027 06:06:23.852284 111776 asr_decoder.cc:113] forward takes 7 ms, search takes 1 ms |
| 75 | +I1027 06:06:23.852383 111776 asr_decoder.cc:194] Partial CTC result 甚至出现交易几乎停滞的情况 |
| 76 | +I1027 06:06:23.852530 111776 asr_decoder.cc:194] Partial CTC result 甚至出现交易几乎停滞的情况 |
| 77 | +I1027 06:06:23.852537 111776 xpu_asr_model.cc:248] num_hyps is 3 |
| 78 | +I1027 06:06:23.852541 111776 xpu_asr_model.cc:249] beam_size is 3 |
| 79 | +I1027 06:06:23.852545 111776 xpu_asr_model.cc:250] new_bs is 3 |
| 80 | +I1027 06:06:23.852545 111776 xpu_asr_model.cc:251] max_hyps_len is 14 |
| 81 | +I1027 06:06:23.853902 111776 asr_decoder.cc:84] Rescoring cost latency: 1ms. |
| 82 | +I1027 06:06:23.853911 111776 decoder_main.cc:72] Partial result: 甚至出现交易几乎停滞的情况 |
| 83 | +I1027 06:06:23.853914 111776 decoder_main.cc:104] test Final result: 甚至出现交易几乎停滞的情况 |
| 84 | +I1027 06:06:23.853924 111776 decoder_main.cc:105] Decoded 4203ms audio taken 10ms. |
| 85 | +test 甚至出现交易几乎停滞的情况 |
| 86 | +I1027 06:06:23.853984 111767 decoder_main.cc:180] Total: decoded 4203ms audio taken 10ms. |
| 87 | +``` |
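
The final `Total: decoded ... audio taken ...` line in the log can be turned into a real-time factor (RTF = decoding time / audio duration), a common way to summarize ASR throughput. The snippet below is a minimal sketch, not part of WeNet or the XPU tooling. The function name `rtf_from_log` is hypothetical. It assumes the log line has exactly the format shown in the sample output above and that the log was saved to `log.txt` by the `tee` in Step 2.

```sh
# Hypothetical helper: compute RTF from the "Total:" summary line of a decoder log.
# Assumes the line looks like: "... Total: decoded 4203ms audio taken 10ms."
rtf_from_log() {
  awk '/Total: decoded [0-9]+ms audio taken [0-9]+ms/ {
         for (i = 1; i <= NF; i++) {
           # Adding 0 keeps the leading number: "4203ms" -> 4203, "10ms." -> 10
           if ($i == "decoded") audio = $(i + 1) + 0
           if ($i == "taken")   cost  = $(i + 1) + 0
         }
       }
       END {
         if (audio > 0) {
           printf "%.4f\n", cost / audio
         } else {
           print "no Total line found" > "/dev/stderr"
           exit 1
         }
       }' "$1"
}
```

Calling `rtf_from_log log.txt` on the sample log above prints `0.0024`, that is, 10 ms of decoding for 4203 ms of audio. If the log contains several `Total:` lines, only the last one is used.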