Wav2Vec 2.0

Environment Setup

Follow RKNN Installation to set up the environment.

Follow RKNN Model Zoo to download the example files.

Model Download

Download the ONNX model file.

X64 Linux PC
cd rknn_model_zoo/examples/wav2vec2/model/
bash download_model.sh

Model Conversion

Select the target platform.

X64 Linux PC
export TARGET_PLATFORM=rk3588

Convert the ONNX model to an RKNN model.

X64 Linux PC
cd ../python/
python convert.py ../model/wav2vec2_base_960h_20s.onnx ${TARGET_PLATFORM}
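The model file name (wav2vec2_base_960h_20s) suggests that the exported model takes a fixed 20-second input window, so clips of any other length would need to be padded or cut to size before inference. The following is a minimal sketch of that idea, not the preprocessing code shipped with the example. The 16 kHz sample rate, the zero padding, and the helper name pad_or_truncate are assumptions made for illustration.

```python
# Assumptions (not taken from the example source): 16 kHz sample rate,
# a fixed 20-second window, and zero padding for short clips.
SAMPLE_RATE = 16_000
CLIP_SECONDS = 20
TARGET_LEN = SAMPLE_RATE * CLIP_SECONDS  # 320000 samples


def pad_or_truncate(samples, target_len=TARGET_LEN):
    """Return a list of exactly target_len samples (hypothetical helper)."""
    if len(samples) >= target_len:
        # Longer clips are cut; anything past the window is ignored.
        return list(samples[:target_len])
    # Shorter clips are extended with silence (zeros).
    return list(samples) + [0.0] * (target_len - len(samples))


short_clip = [0.1] * 5
fixed = pad_or_truncate(short_clip, target_len=8)
print(len(fixed), fixed[-1])  # prints "8 0.0"
```

Under these assumptions the target length works out to 320000 samples per clip.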

C API

Build the Example

Return to the rknn_model_zoo root directory and run build-linux.sh to cross-compile the example for the target platform.

X64 Linux PC
cd ../../..
bash build-linux.sh -t ${TARGET_PLATFORM} -a aarch64 -d wav2vec2

Sync Files to the Device

Copy the built rknn_wav2vec2_demo directory from the install folder to the device.

X64 Linux PC
cd install/${TARGET_PLATFORM}_linux_aarch64/
scp -r rknn_wav2vec2_demo/ user@your_device_ip:target_directory

Run the Example

Add the bundled runtime library directory to the LD_LIBRARY_PATH environment variable so the demo can find its shared libraries.

Device
cd rknn_wav2vec2_demo/
export LD_LIBRARY_PATH=./lib

Run the example.

Device
./rknn_wav2vec2_demo ./model/wav2vec2_base_960h_20s.rknn ./model/test.wav
$ ./rknn_wav2vec2_demo ./model/wav2vec2_base_960h_20s.rknn ./model/test.wav
-- read_audio & convert_channels & resample_audio use: 0.616000 ms
-- audio_preprocess use: 0.464000 ms
model input num: 1, output num: 1
input tensors:
index=0, name=input, n_dims=2, dims=[1, 320000], n_elems=320000, size=640000, fmt=UNDEFINED, type=FP16, qnt_type=AFFINE, zp=0, scale=1.000000
output tensors:
index=0, name=output, n_dims=3, dims=[1, 999, 32], n_elems=31968, size=63936, fmt=UNDEFINED, type=FP16, qnt_type=AFFINE, zp=0, scale=1.000000
-- init_wav2vec2_model use: 705.586975 ms
-- inference_wav2vec2_model use: 3297.358887 ms

Wav2vec2 output: MISTER QUILTER IS THE APOSTLE OF THE MIDDLE CLASSES AND WE ARE GLAD TO WELCOME HIS GOSPEL

Real Time Factor (RTF): 3.297 / 20.000 = 0.165
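The Real Time Factor in the log above is the inference time divided by the duration of the audio, so a value below 1 means the model runs faster than real time. The sketch below reproduces the figure from the logged numbers. The 16 kHz sample rate is an assumption (it is the rate wav2vec2-base-960h is trained on, and it matches 320000 input samples for a 20-second clip), and the function names are illustrative only.

```python
SAMPLE_RATE = 16_000  # assumption: the model expects 16 kHz audio


def audio_seconds(num_samples, sample_rate=SAMPLE_RATE):
    """Duration of a clip in seconds, given its sample count."""
    return num_samples / sample_rate


def real_time_factor(inference_ms, audio_s):
    """Inference time (converted to seconds) divided by audio duration."""
    return (inference_ms / 1000.0) / audio_s


# Values copied from the run log: input dims [1, 320000],
# inference_wav2vec2_model use: 3297.358887 ms
duration = audio_seconds(320000)
rtf = real_time_factor(3297.358887, duration)
print(f"{duration:.3f} s, RTF = {rtf:.3f}")  # prints "20.000 s, RTF = 0.165"
```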

Python API

Activate the Virtual Environment

Device
conda activate rknn

Run the Example

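Copy the Python example script and the converted RKNN model to the device, then run the following command. If TARGET_PLATFORM is not set in the device shell, export it first or pass the platform name (for example rk3588) directly.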

Device
python wav2vec2.py --model_path ../model/wav2vec2_base_960h_20s.rknn --target ${TARGET_PLATFORM}
$ python wav2vec2.py --model_path ../model/wav2vec2_base_960h_20s.rknn --target rk3588
2026-01-16 09:12:33.885150713 [W:onnxruntime:Default, device_discovery.cc:164 DiscoverDevicesForPlatform] GPU device discovery failed: device_discovery.cc:89 ReadFileContents Failed to open file: "/sys/class/drm/card1/device/vendor"
/home/radxa/miniforge3/envs/rknn/lib/python3.12/site-packages/rknn/api/rknn.py:51: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81.
self.rknn_base = RKNNBase(cur_path, verbose)
I rknn-toolkit2 version: 2.3.2
--> Loading model
done
--> Init runtime environment
I target set by user is: rk3588
done
W inference: Inputs should be placed in a list, like [img1, img2], both the img1 and img2 are ndarray.
W inference: The 'data_format' is not set, and its default value is 'nhwc'!

Wav2vec2 output: MISTER QUILTER IS THE APOSTLE OF THE MIDDLE CLASSES AND WE ARE GLAD TO WELCOME HIS GOSPEL
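The model emits one score vector per audio frame (the C API log reports an output of shape [1, 999, 32]), and the transcript is obtained by decoding those frames. A common way to do this for CTC-trained models such as Wav2Vec 2.0 is greedy decoding: take the best token per frame, collapse consecutive repeats, then drop the blank token. The sketch below illustrates the technique on a toy input. It is not the decoding code from wav2vec2.py, and the 5-token vocabulary, the blank at index 0, and the "|" word separator are assumptions chosen for the example.

```python
BLANK = "<pad>"   # assumption: index 0 acts as the CTC blank token
WORD_SEP = "|"    # assumption: "|" marks a word boundary
TOY_VOCAB = [BLANK, WORD_SEP, "A", "B", "C"]  # hypothetical vocabulary


def greedy_ctc_decode(logits, vocab=TOY_VOCAB):
    """Decode a list of per-frame score vectors into text."""
    # 1. Pick the highest-scoring token index for each frame.
    best = [max(range(len(frame)), key=frame.__getitem__) for frame in logits]
    # 2. Collapse runs of the same index into a single occurrence.
    collapsed = [idx for i, idx in enumerate(best) if i == 0 or idx != best[i - 1]]
    # 3. Drop blanks, then turn word separators into spaces.
    chars = [vocab[idx] for idx in collapsed if vocab[idx] != BLANK]
    return "".join(chars).replace(WORD_SEP, " ").strip()


def one_hot(index, size=len(TOY_VOCAB)):
    """Build a toy score vector that favors a single token."""
    return [1.0 if i == index else 0.0 for i in range(size)]


# Frames: C C <pad> A A <pad> B | | A B
frames = [one_hot(i) for i in [4, 4, 0, 2, 2, 0, 3, 1, 1, 2, 3]]
print(greedy_ctc_decode(frames))  # prints "CAB AB"
```

The blank token is what lets CTC output a doubled letter: two identical tokens separated by a blank survive the collapse step, while two adjacent identical tokens merge into one.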

    Radxa-docs © 2026 by Radxa Computer (Shenzhen) Co.,Ltd. is licensed under CC BY 4.0