
Object Detection Model: YOLOv5

tip

This document demonstrates how to run the YOLOv5 object detection model on Allwinner T527/A733 series chips.

This example uses the pre-trained ONNX model from ultralytics/yolov5 to demonstrate the complete process of converting the model and running inference on the board.

Deploying YOLOv5 on the board requires two steps:

  • Use the ACUITY Toolkit on the PC to convert models from different frameworks into NBG format.
  • Use the awnn API on the board to perform inference with the model.

Download the ai-sdk Example Repository

X86 PC / Device
git clone https://github.com/ZIFENG278/ai-sdk.git

Model Conversion on PC

tip

Radxa provides a pre-converted yolov5.nb model. Users can directly refer to YOLOv5 Inference on the Board and skip the PC model conversion section.

tip

The files used in the YOLOv5 example are already included in the ai-sdk example repository under models/yolov5s-sim.

  • Enter the ACUITY Toolkit Docker container.

    For ACUITY Toolkit Docker environment preparation, please refer to ACUITY Toolkit Environment Configuration.

    Configure environment variables:

    X86 Linux PC
    cd ai-sdk/models
    source env.sh v3 # NPU_VERSION

    For A733, choose v3; for T527, choose v2.

    tip

    Refer to the NPU Version Comparison Table for NPU version selection.

  • Download the YOLOv5s ONNX model.

    X86 Linux PC
    mkdir yolov5s-sim && cd yolov5s-sim
    wget https://github.com/ultralytics/yolov5/releases/download/v6.0/yolov5s.onnx
  • Fix the input size.

    NPU inference only accepts fixed input sizes. Use onnxsim to fix the input size.

    X86 Linux PC
    pip3 install onnxsim onnxruntime
    onnxsim yolov5s.onnx yolov5s-sim.onnx --overwrite-input-shape 1,3,640,640
  • Create a quantization calibration dataset.

    Use a set of images for quantization calibration. Save the image paths in dataset.txt.

    X86 Linux PC
    vim dataset.txt
    images/COCO_train2014_000000000529.jpg
    images/COCO_train2014_000000001183.jpg
    images/COCO_train2014_000000002349.jpg
    images/COCO_train2014_000000003685.jpg
    images/COCO_train2014_000000004463.jpg
    images/dog.jpg
  • Create model input/output files.

    Use Netron to confirm the input/output names of the ONNX model.

    X86 Linux PC
    vim inputs_outputs.txt
    --inputs images --input-size-list '3,640,640' --outputs '350 498 646'

    YOLOv5s input/output names (Netron)

  • Directory structure:

    .
    |-- dataset.txt
    |-- images
    |   |-- COCO_train2014_000000000529.jpg
    |   |-- COCO_train2014_000000001183.jpg
    |   |-- COCO_train2014_000000002349.jpg
    |   |-- COCO_train2014_000000003685.jpg
    |   |-- COCO_train2014_000000004463.jpg
    |   `-- dog.jpg
    |-- inputs_outputs.txt
    |-- yolov5s-sim.onnx
  • Parse the model.

    tip

    The pegasus scripts are located in ai-sdk/scripts and can be copied to the models directory.

    Use pegasus_import.sh to parse the model into an intermediate representation (IR). This generates yolov5s-sim.json (model structure) and yolov5s-sim.data (model weights).

    X86 Linux PC
    ./pegasus_import.sh yolov5s-sim/
  • Modify the yolov5s-sim_inputmeta.yml file.

    Update the scale value according to the formula scale = 1 / std. YOLOv5 expects input pixels normalized from [0, 255] to [0, 1], so mean = 0 and std = 255:

    scale = 1 / 255
    scale = 0.00392157

    The relevant part of yolov5s-sim_inputmeta.yml should then look like this (an equivalent Python preprocessing sketch is shown after the conversion steps below):
    input_meta:
      databases:
      - path: dataset.txt
        type: TEXT
        ports:
        - lid: images_208
          category: image
          dtype: float32
          sparse: false
          tensor_name:
          layout: nchw
          shape:
          - 1
          - 3
          - 640
          - 640
          fitting: scale
          preprocess:
            reverse_channel: true
            mean:
            - 0
            - 0
            - 0
            scale:
            - 0.00392157
            - 0.00392157
            - 0.00392157
            preproc_node_params:
              add_preproc_node: false
              preproc_type: IMAGE_RGB
              # preproc_dtype_converter:
              #   quantizer: asymmetric_affine
              #   qtype: uint8
              #   scale: 1.0
              #   zero_point: 0
              preproc_image_size:
              - 640
              - 640
              preproc_crop:
                enable_preproc_crop: false
                crop_rect:
                - 0
                - 0
                - 640
                - 640
              preproc_perm:
              - 0
              - 1
              - 2
              - 3
          redirect_to_output: false
  • Quantize the model.

    Use pegasus_quantize.sh to quantize the model into uint8 format.

    X86 Linux PC
    ./pegasus_quantize.sh yolov5s-sim/ uint8 10
  • Compile the model.

    Use pegasus_export_ovx.sh to compile the model into NBG format.

    X86 Linux PC
    ./pegasus_export_ovx.sh yolov5s-sim/ uint8

    The NBG model is saved in yolov5s-sim/wksp/yolov5s-sim_uint8_nbg_unify/network_binary.nb.
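
Optionally, you can sanity-check the simplified model on the PC before deploying the NBG file. The snippet below is a minimal sketch (the script name is arbitrary, and it assumes onnxruntime, opencv-python, and numpy are installed and that images/dog.jpg from the calibration set is available): it confirms the fixed 1x3x640x640 input shape and runs one CPU forward pass using YOLOv5's expected preprocessing (RGB input, resize to 640x640, scale = 1/255, NCHW layout), which is what the inputmeta scale of 0.00392157 corresponds to.

X86 Linux PC
# sanity_check.py -- illustrative sketch, not part of the ai-sdk repository
import cv2                     # pip3 install opencv-python
import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession("yolov5s-sim.onnx")

# Confirm the fixed input shape and list the graph inputs/outputs.
for i in sess.get_inputs():
    print("input :", i.name, i.shape)
for o in sess.get_outputs():
    print("output:", o.name, o.shape)

# Preprocess one calibration image: RGB channel order, 640x640,
# scale = 1/255, NCHW layout with a batch dimension of 1.
img = cv2.imread("images/dog.jpg")
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
img = cv2.resize(img, (640, 640))
blob = (img.astype(np.float32) / 255.0).transpose(2, 0, 1)[None, ...]

# One forward pass on the CPU as a reference before NPU conversion.
outputs = sess.run(None, {"images": blob})
for o in outputs:
    print("output shape:", o.shape)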

YOLOv5 Inference on the Board

Navigate to the YOLOv5 example code directory.

Device
cd ai-sdk/examples/yolov5

Compile the Example

Device
make AI_SDK_PLATFORM=a733
make install AI_SDK_PLATFORM=a733 INSTALL_PREFIX=./

Parameter explanation:

  • AI_SDK_PLATFORM: specifies the target SoC; valid values are a733 and t527.
  • INSTALL_PREFIX: specifies the installation path.

Run the Example

Set environment variables.

Device
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/rock/ai-sdk/viplite-tina/lib/aarch64-none-linux-gnu/NPU_VERSION # NPU_SW_VERSION
tip

Specify NPU_SW_VERSION. For A733, choose v2.0; for T527, choose v1.13. Refer to the NPU Version Comparison Table for details.

Navigate to the example installation directory.

Device
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/rock/ai-sdk/viplite-tina/lib/aarch64-none-linux-gnu/NPU_VERSION # NPU_SW_VERSION
cd INSTALL_PREFIX/etc/npu/yolov5
# ./yolov5 nbg_model input_picture
./yolov5 ./model/yolov5.nb ./input_data/dog.jpg
tip

The example automatically installs the yolov5.nb model provided by Radxa. You can manually specify the path to your converted NBG model.

(.venv) rock@radxa-cubie-a7a:~/ai-sdk/examples/yolov5/etc/npu/yolov5$ ./yolov5 ./model/network_binary.nb ./input_data/dog.jpg
./yolov5 nbg input
VIPLite driver software version 2.0.3.2-AW-2024-08-30
viplite init OK.
VIPLite driver version=0x00020003...
VIP cid=0x1000003b, device_count=1
* device[0] core_count=1
awnn_init total: 5.49 ms.
vip_create_network ./model/network_binary.nb: 3.96 ms.
input 0 dim 640 640 3 1, data_format=2, name=input/output[0], elements=1833508979, scale=0.003922, zero_point=0
create input buffer 0: 1228800
output 0 dim 85 80 80 3 1, data_format=2, name=uid_5_out_0, elements=1632000, scale=0.085919, zero_point=211
create output buffer 0: 1632000
output 1 dim 85 40 40 3 1, data_format=2, name=uid_4_out_0, elements=408000, scale=0.071616, zero_point=204
create output buffer 1: 408000
output 2 dim 85 20 20 3 1, data_format=2, name=uid_3_out_0, elements=102000, scale=0.072006, zero_point=196
create output buffer 2: 102000
memory pool size=3892224 bytes
load_param ./model/network_binary.nb: 0.97 ms.
prepare network ./model/network_binary.nb: 2.56 ms.
set network io ./model/network_binary.nb: 0.01 ms.
awnn_create total: 7.55 ms.
yolov5_preprocess.cpp run.
memcpy(0xffff89621000, 0xffff886f8010, 1228800) load_input_data: 0.33 ms.
vip_flush_buffer input: 0.02 ms.
awnn_set_input_buffers total: 0.38 ms.
vip_run_network: 17.07 ms.
vip_flush_buffer output: 0.01 ms.
int8/uint8 1632000 memcpy: 2.72 ms.
int8/uint8 408000 memcpy: 0.46 ms.
int8/uint8 102000 memcpy: 0.11 ms.
tensor to fp: 28.64 ms.
awnn_run total: 45.75 ms.
yolov5_postprocess.cpp run.
detection num: 3
16: 86%, [ 130, 222, 312, 546], dog
7: 59%, [ 469, 78, 692, 171], truck
1: 53%, [ 158, 133, 560, 424], bicycle
awnn_destroy total: 1.95 ms.
awnn_uninit total: 0.66 ms.
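
The three output tensors in the log are the YOLOv5 detection heads at strides 8, 16, and 32 (80x80, 40x40, and 20x20 grids), each holding 3 anchors x 85 channels, where the 85 channels are [x, y, w, h, objectness, 80 class scores]. The "tensor to fp" step converts the quantized uint8 values back to float using each tensor's scale and zero_point. Below is a minimal sketch of that conversion, assuming the asymmetric-affine formula float = (q - zero_point) * scale; the full box decoding and NMS are implemented in yolov5_postprocess.cpp.

X86 PC / Device
# dequantize.py -- illustrative sketch; the scale/zero_point values are
# copied from the run log above and will differ for other models.
import numpy as np

def dequantize(q, scale, zero_point):
    # Asymmetric-affine dequantization: f = (q - zero_point) * scale
    return (q.astype(np.float32) - zero_point) * scale

# Output 0 from the log: 85 x 80 x 80 x 3 uint8 values,
# scale = 0.085919, zero_point = 211 (dummy data used here).
q_out0 = np.random.randint(0, 256, size=(85, 80, 80, 3), dtype=np.uint8)
f_out0 = dequantize(q_out0, scale=0.085919, zero_point=211)
print(f_out0.shape, f_out0.dtype)   # (85, 80, 80, 3) float32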

The inference result is saved in result.png.

YOLOv5s demo output