Whisper

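Whisper is an open-source, general-purpose speech recognition model released by OpenAI. It was trained on 680,000 hours of large-scale multilingual data, which makes it robust to complex background noise and a wide range of accents.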

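Key features: high-accuracy multilingual speech-to-text, automatic language detection, and speech translation.
Version note: this example uses the Whisper Medium Multilingual model. As the mid-sized member of the family, it keeps recognition accuracy high for Chinese and other languages while remaining efficient at inference time, which makes it a common choice when balancing accuracy against speed.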

Environment Setup

Set up the required environment before you begin.

Quick Start

Download the Model

O6 / O6N

cd ai_model_hub_25_Q3/models/Audio/Speech_Recognotion/onnx_whisper_medium_multilingual
wget -O whisper_medium_multilingual_decoder.cix https://www.modelscope.cn/models/cix/ai_model_hub_25_Q3/resolve/master/models/Audio/Speech_Recognotion/onnx_whisper_medium_multilingual/whisper_medium_multilingual_decoder.cix
wget -O whisper_medium_multilingual_encoder.cix https://www.modelscope.cn/models/cix/ai_model_hub_25_Q3/resolve/master/models/Audio/Speech_Recognotion/onnx_whisper_medium_multilingual/whisper_medium_multilingual_encoder.cix
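
The two download commands above differ only in the file name. As a minimal sketch, the same URLs can be assembled in Python. `BASE_URL` is copied from the commands above; the helper names `model_url` and `download` are hypothetical and introduced here only for illustration.

```python
import urllib.request
from pathlib import Path

# Base path copied verbatim from the wget commands above (spelling included).
BASE_URL = (
    "https://www.modelscope.cn/models/cix/ai_model_hub_25_Q3/resolve/master/"
    "models/Audio/Speech_Recognotion/onnx_whisper_medium_multilingual"
)
CIX_FILES = [
    "whisper_medium_multilingual_decoder.cix",
    "whisper_medium_multilingual_encoder.cix",
]


def model_url(filename: str) -> str:
    """Return the download URL for one model file (hypothetical helper)."""
    return f"{BASE_URL}/{filename}"


def download(filename: str, dest_dir: str = ".") -> Path:
    """Fetch one file into dest_dir, mirroring `wget -O <filename> <url>`."""
    target = Path(dest_dir) / filename
    urllib.request.urlretrieve(model_url(filename), target)
    return target


# Print the URLs only; call download(name) to actually fetch a file.
for name in CIX_FILES:
    print(model_url(name))
```

Calling `download(name)` for each entry in `CIX_FILES` from the model directory should have the same effect as the two `wget` commands.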

Install Dependencies

O6 / O6N

sudo apt update
sudo apt install ffmpeg

Model Test

Info

Activate the virtual environment before running the script!

O6 / O6N

python3 inference_npu.py
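
Two prerequisites from the steps above, an active virtual environment and an installed ffmpeg, can be checked before launching the script. The following is a minimal sketch: it assumes a venv-style environment (a conda environment would not be detected this way), and the function names are hypothetical.

```python
import shutil
import sys
from typing import List


def in_virtualenv() -> bool:
    """True when the interpreter runs inside a venv/virtualenv."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def has_ffmpeg() -> bool:
    """True when an `ffmpeg` executable is found on PATH."""
    return shutil.which("ffmpeg") is not None


def preflight() -> List[str]:
    """Return human-readable problems; an empty list means ready to run."""
    problems = []
    if not in_virtualenv():
        problems.append("virtual environment is not active")
    if not has_ffmpeg():
        problems.append("ffmpeg not found on PATH (sudo apt install ffmpeg)")
    return problems


for problem in preflight():
    print("warning:", problem)
```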

Full Conversion Workflow

Download the Model Files

Linux PC

cd ai_model_hub_25_Q3/models/Audio/Speech_Recognotion/onnx_whisper_medium_multilingual/model
wget -O whisper_medium_multilingual_decoder.onnx https://www.modelscope.cn/models/cix/ai_model_hub_25_Q3/resolve/master/models/Audio/Speech_Recognotion/onnx_whisper_medium_multilingual/model/whisper_medium_multilingual_decoder.onnx
wget -O whisper_medium_multilingual_encoder.onnx https://www.modelscope.cn/models/cix/ai_model_hub_25_Q3/resolve/master/models/Audio/Speech_Recognotion/onnx_whisper_medium_multilingual/model/whisper_medium_multilingual_encoder.onnx

Project Structure

.
├── cfg
├── datasets
├── inference_npu.py
├── inference_onnx.py
├── model
├── ReadMe.md
├── test_data
├── whisper
├── whisper-medium
├── whisper_medium_multilingual_decoder.cix
└── whisper_medium_multilingual_encoder.cix
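
To confirm that a working directory matches the tree above, the entry names can be checked programmatically. This is a minimal sketch; `EXPECTED` is copied from the tree, and `missing_entries` is a hypothetical helper name.

```python
from pathlib import Path
from typing import List

# Entry names copied from the project tree above.
EXPECTED = [
    "cfg",
    "datasets",
    "inference_npu.py",
    "inference_onnx.py",
    "model",
    "ReadMe.md",
    "test_data",
    "whisper",
    "whisper-medium",
    "whisper_medium_multilingual_decoder.cix",
    "whisper_medium_multilingual_encoder.cix",
]


def missing_entries(root: str = ".") -> List[str]:
    """Return the expected files or directories that are absent under root."""
    base = Path(root)
    return [name for name in EXPECTED if not (base / name).exists()]


print("missing:", missing_entries() or "nothing")
```

The two `.cix` files are only present once they have been downloaded or produced by the conversion step, so seeing them reported as missing before that point is expected.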

Quantize and Convert the Model

Convert the encoder

Linux PC

cd ..
cixbuild cfg/whisper_medium_multilingual_encoder/whisper_medium_multilingual_encoder_build.cfg

Convert the decoder

Linux PC

cixbuild cfg/whisper_medium_multilingual_decoder/whisper_medium_multilingual_decoder_build.cfg
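
The encoder and decoder conversions can also be run back to back from a small script. The sketch below assumes it is started from the project root and that `cixbuild` is on PATH; the config paths are copied from the commands above, and `build_all` is a hypothetical helper name.

```python
import subprocess
from pathlib import Path
from typing import List

# Config paths copied from the cixbuild commands above.
CFGS = [
    "cfg/whisper_medium_multilingual_encoder/whisper_medium_multilingual_encoder_build.cfg",
    "cfg/whisper_medium_multilingual_decoder/whisper_medium_multilingual_decoder_build.cfg",
]


def build_all(cfgs: List[str]) -> None:
    """Run `cixbuild <cfg>` for each config, stopping at the first failure."""
    for cfg in cfgs:
        if not Path(cfg).is_file():
            raise FileNotFoundError(f"build config not found: {cfg}")
        # check=True raises CalledProcessError if cixbuild exits non-zero.
        subprocess.run(["cixbuild", cfg], check=True)


# Usage (not executed here): build_all(CFGS)
```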

Push to the Board

After the conversion finishes, push the generated .cix model files to the board.
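
One common way to transfer the files is `scp` over SSH. The sketch below only builds and prints the command; the user name, board address, and remote directory are placeholders rather than values from this guide, and it assumes the .cix files sit in the current directory.

```python
import shlex
import subprocess
from typing import List

CIX_FILES = [
    "whisper_medium_multilingual_encoder.cix",
    "whisper_medium_multilingual_decoder.cix",
]


def scp_command(files: List[str], user: str, host: str, remote_dir: str) -> List[str]:
    """Build an scp argument list that copies files to user@host:remote_dir."""
    return ["scp", *files, f"{user}@{host}:{remote_dir}"]


def push(cmd: List[str]) -> None:
    """Execute the copy; raises CalledProcessError if scp fails."""
    subprocess.run(cmd, check=True)


# Placeholder values: replace with your board's login, address, and target path.
cmd = scp_command(CIX_FILES, user="USER", host="BOARD_IP", remote_dir="/path/on/board")
print(shlex.join(cmd))
```

Calling `push(cmd)` after filling in real values performs the transfer.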

Test Inference on the Host

Install ffmpeg

Linux PC

sudo apt update
sudo apt install ffmpeg

Run the Inference Script

Linux PC

python3 inference_onnx.py

Inference Result

A test_audio_npu.txt file is generated in the output directory.

They regain their apartment, apparently without disturbing the household of Gainwell.

Deploy on the NPU

Install ffmpeg

O6 / O6N

sudo apt update
sudo apt install ffmpeg

Run the Inference Script

O6 / O6N

python3 inference_npu.py --backend npu --encoder_model_path whisper_medium_multilingual_encoder.cix --decoder_model_path whisper_medium_multilingual_decoder.cix

Inference Result

O6 / O6N

$ python3 inference_npu.py --backend npu --encoder_model_path whisper_medium_multilingual_encoder.cix --decoder_model_path whisper_medium_multilingual_decoder.cix
2025-12-29 10:55:26.758036920 [W:onnxruntime:Default, device_discovery.cc:164 DiscoverDevicesForPlatform] GPU device discovery failed: device_discovery.cc:89 ReadFileContents Failed to open file: "/sys/class/drm/card3/device/vendor"
npu: noe_init_context success
npu: noe_load_graph success
Input tensor count is 1.
Output tensor count is 1.
npu: noe_create_job success
npu: noe_clean_job success
npu: noe_unload_graph success
npu: noe_deinit_context success
npu: noe_init_context success
npu: noe_load_graph success
Input tensor count is 5.
Output tensor count is 2.
npu: noe_create_job success
npu: noe_clean_job success
npu: noe_unload_graph success
npu: noe_deinit_context success
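
The log above shows the same six `noe_*` stages reported twice, presumably once per loaded model. As a rough sketch, a captured log can be checked for that pattern automatically. The stage names are taken from the log; the function names and the expected cycle count of two are assumptions.

```python
import re
from collections import Counter

# Stage names as they appear in the log above.
STAGES = [
    "init_context",
    "load_graph",
    "create_job",
    "clean_job",
    "unload_graph",
    "deinit_context",
]
LINE = re.compile(r"^npu: noe_(\w+) success$")

# One full cycle, copied from the log above.
SAMPLE = """npu: noe_init_context success
npu: noe_load_graph success
Input tensor count is 1.
Output tensor count is 1.
npu: noe_create_job success
npu: noe_clean_job success
npu: noe_unload_graph success
npu: noe_deinit_context success
"""


def stage_counts(log_text: str) -> Counter:
    """Count how often each noe_* stage reported success."""
    counts = Counter()
    for line in log_text.splitlines():
        match = LINE.match(line.strip())
        if match:
            counts[match.group(1)] += 1
    return counts


def all_stages_ok(log_text: str, cycles: int = 2) -> bool:
    """True when every stage succeeded exactly `cycles` times."""
    counts = stage_counts(log_text)
    return all(counts[stage] == cycles for stage in STAGES)


print(all_stages_ok(SAMPLE, cycles=1))
```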

A test_audio_npu.txt file is generated in the output directory.

They regain their apartment, apparently without disturbing the household of Gainwell.
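
The host and NPU runs shown in this guide produce the same transcript. For longer audio, a word error rate (WER) gives a quick measure of how far two transcripts diverge. The following is a minimal sketch of a word-level edit-distance WER; it is not part of the provided scripts, and `word_error_rate` is a hypothetical helper name.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript is empty")
    # prev[j] holds the edit distance between the processed ref prefix and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


# Transcript copied from the results above, used for both sides of the comparison.
transcript = ("They regain their apartment, apparently without "
              "disturbing the household of Gainwell.")
print(word_error_rate(transcript, transcript))
```

In practice, the two arguments would be the contents of the text files written by the host and NPU runs.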
