YOLO11 Seg
This document describes how to convert, build, and run the YOLO11 segmentation model (YOLO11 Seg) on the NPU of the Allwinner A733 and T527.
To obtain the example, refer to Model Zoo Download.
YOLO11 Seg Example Directory Structure:
$ tree ./
./
├── CMakeLists.txt
├── convert_model
│ ├── config_yml.py
│ ├── convert_model_env.sh
│ ├── python
│ │ ├── onnx_extract.py
│ │ └── yolo11s-seg_640.txt
│ └── yolo11s-seg_10.txt
├── figures
│ ├── diff_img.png
│ └── out_yolo11_seg_pcq.png
├── main.cpp
├── model
│ └── dog.jpg
├── model_config.h
├── README.md
├── yolo11_seg_10_post.cpp
└── yolo11_seg_10_pre.cpp
Model Conversion
Configure Virtual Environment
python -m venv .venv && source .venv/bin/activate
pip install ultralytics
Export ONNX Model
cd convert_model/python/
yolo export model=yolo11s-seg.pt format=onnx imgsz=640 dynamic=False simplify=True opset=11 nms=False batch=1 device=cpu
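The same export can also be driven from Python. The sketch below assumes the `ultralytics` package installed in the previous step and its `YOLO(...).export(...)` method. The names `EXPORT_ARGS`, `as_cli`, and `export_onnx` are illustrative and are not part of the example repository.

```python
"""Sketch: the ONNX export above, expressed through the ultralytics Python API."""

# Same settings as the `yolo export` command line above.
EXPORT_ARGS = {
    "format": "onnx",
    "imgsz": 640,
    "dynamic": False,
    "simplify": True,
    "opset": 11,
    "nms": False,
    "batch": 1,
    "device": "cpu",
}


def as_cli(weights: str = "yolo11s-seg.pt") -> str:
    """Render the settings as the equivalent `yolo export` command."""
    args = " ".join(f"{key}={value}" for key, value in EXPORT_ARGS.items())
    return f"yolo export model={weights} {args}"


def export_onnx(weights: str = "yolo11s-seg.pt") -> str:
    """Run the export and return the output path reported by ultralytics."""
    # Imported lazily so the settings can be inspected without ultralytics installed.
    from ultralytics import YOLO

    return YOLO(weights).export(**EXPORT_ARGS)


if __name__ == "__main__":
    print(as_cli())
```

Keeping the settings in one dictionary makes it easy to confirm that a scripted export uses the same static shape, opset, and batch size as the command line.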
Prune Model
python onnx_extract.py
mv yolo11s-seg_10.onnx ../
cd ..
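The contents of onnx_extract.py are not reproduced in this document. One common way to cut a sub-graph out of an ONNX model is `onnx.utils.extract_model`, so the following is only a sketch of that general technique, not the script shipped with the example. It assumes the `onnx` package is installed. The function name `prune_onnx` is hypothetical, and the tensor names must be supplied by the caller because they are not listed here.

```python
"""Sketch: extracting a sub-graph from an ONNX file with onnx.utils.extract_model."""
from typing import Sequence


def prune_onnx(src: str, dst: str,
               input_names: Sequence[str],
               output_names: Sequence[str]) -> str:
    """Write the sub-graph between input_names and output_names to dst."""
    if not input_names or not output_names:
        raise ValueError("both input_names and output_names must be non-empty")

    # Imported lazily so the argument check works without onnx installed.
    import onnx.utils

    onnx.utils.extract_model(src, dst, list(input_names), list(output_names))
    return dst
```

A call would pass the exported ONNX file, the target file name (for example yolo11s-seg_10.onnx), and the tensor names at which the graph should be cut.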
Create Symlink for Conversion Script
./convert_model_env.sh
Model Import/Quantization/Conversion
Enter the container development environment first; see the Create Container section in Model Zoo Download.
Each platform uses its own Docker image:
- A733: ubuntu-npu:v2.0.10.1
- T527: ubuntu-npu:v1.8.11
docker exec -it model-zoo /bin/bash
After entering the container, navigate to the corresponding directory and run the script.
cd /workspace/examples/yolo11_seg/convert_model/
./pegasus_import.sh yolo11s-seg_10
./pegasus_quantize.sh yolo11s-seg_10 uint8 12
- A733
./pegasus_export_ovx_nbg.sh yolo11s-seg_10 uint8 a733
- T527
./pegasus_export_ovx_nbg.sh yolo11s-seg_10 uint8 t527
The exported model files are stored in the ../model directory.
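The conversion steps differ between platforms only in the Docker image and in the last argument of the export script. The sketch below collects them per platform. The names `DOCKER_IMAGES` and `conversion_commands` are illustrative. The value `12` passed to pegasus_quantize.sh is copied verbatim from the command above, and its meaning is not documented here.

```python
"""Sketch: the per-platform conversion commands from this section."""

# Docker image per platform, as listed above.
DOCKER_IMAGES = {
    "a733": "ubuntu-npu:v2.0.10.1",
    "t527": "ubuntu-npu:v1.8.11",
}


def conversion_commands(platform: str,
                        model: str = "yolo11s-seg_10",
                        dtype: str = "uint8") -> list:
    """Return the import, quantize, and export commands for one platform."""
    if platform not in DOCKER_IMAGES:
        raise ValueError(f"unknown platform: {platform!r}")
    return [
        f"./pegasus_import.sh {model}",
        # The trailing 12 is taken as-is from the documented command.
        f"./pegasus_quantize.sh {model} {dtype} 12",
        f"./pegasus_export_ovx_nbg.sh {model} {dtype} {platform}",
    ]


if __name__ == "__main__":
    for name in DOCKER_IMAGES:
        print(f"# {name} (image {DOCKER_IMAGES[name]})")
        print("\n".join(conversion_commands(name)))
```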
Compile Example
You can now compile the example. Exit the container before running the commands below.
First configure the third-party libraries and the cross-compilation toolchain. You can skip this step if you have already configured them for another example.
cd ../../../3rdparty/opencv/
unzip opencv-4.9.0-aarch64-linux-sunxi-glibc.zip
cd ../../0-toolchains/
Download the toolchain archive manually via this link and place it in 0-toolchains/ before running the following command:
tar -xvf gcc-arm-10.2-2020.11-x86_64-aarch64-none-linux-gnu.tar.xz
cd ../examples/yolo11_seg/
- A733
../build_linux.sh -t a733 -s debian11
- T527
../build_linux.sh -t t527 -s debian11
Model Deployment
After compilation, the example will be installed in the install directory. You can use scp to transfer it to the board.
Configure NPU Driver
You can skip this step if you have already configured NPU driver in other examples.
Transfer the driver library to the board's lib directory via scp.
- A733 corresponds to the common/lib_linux_aarch64/A733 directory
- T527 corresponds to the common/lib_linux_aarch64/T527 directory
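Then run the following command to add the library directory to LD_LIBRARY_PATH in ~/.bashrc. The change takes effect in new shells, or in the current shell after running source ~/.bashrc.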
echo 'export LD_LIBRARY_PATH=$HOME/lib:$LD_LIBRARY_PATH' >> ~/.bashrc
Run Example
After configuring the driver, you can run the example.
On the T527 platform, first enable the NPU by following the A5E "Enable NPU on Board" documentation, then run the following command so that the current user can access /dev/vipcore.
sudo chmod 777 /dev/vipcore
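Before launching the demo, it can help to confirm that the device node exists and is readable and writable by the current user. This is a minimal sketch using only the Python standard library. The function name `can_use_device` is illustrative, and the check says nothing about whether the NPU driver itself is loaded correctly.

```python
"""Sketch: check that the current user can open the NPU device node."""
import os


def can_use_device(path: str = "/dev/vipcore") -> bool:
    """Return True if path exists and is readable and writable by this user."""
    return os.path.exists(path) and os.access(path, os.R_OK | os.W_OK)


if __name__ == "__main__":
    state = "accessible" if can_use_device() else "missing or not accessible"
    print(f"/dev/vipcore is {state}")
```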
- A733
cd yolo11_seg_demo_linux_a733/
chmod +x ./yolo11_seg_demo_a733
./yolo11_seg_demo_a733 -nb model/yolo11s-seg_10_uint8_a733.nb -i model/dog.jpg
The running result is as follows:
$ ./yolo11_seg_demo_a733 -nb model/yolo11s-seg_10_uint8_a733.nb -i model/dog.jpg
model_file=model/yolo11s-seg_10_uint8_a733.nb, input=model/dog.jpg, loop_count=1, malloc_mbyte=10
VIPLite driver software version 2.0.3.2-AW-2024-08-30
input 0 dim 3 640 640 1, data_format=2, quant_format=0, name=input/output[0], none-quant
output 0 dim 80 80 64 1, data_format=0, name=uid_19_out_0b_uid_1_out_0, none-quant
output 1 dim 80 80 80 1, data_format=0, name=uid_18_out_0b_uid_1_out_0, none-quant
output 2 dim 80 80 32 1, data_format=0, name=uid_17_out_0b_uid_1_out_0, none-quant
output 3 dim 40 40 64 1, data_format=0, name=uid_16_out_0b_uid_1_out_0, none-quant
output 4 dim 40 40 80 1, data_format=0, name=uid_15_out_0b_uid_1_out_0, none-quant
output 5 dim 40 40 32 1, data_format=0, name=uid_14_out_0b_uid_1_out_0, none-quant
output 6 dim 20 20 64 1, data_format=0, name=uid_13_out_0b_uid_1_out_0, none-quant
output 7 dim 20 20 80 1, data_format=0, name=uid_12_out_0b_uid_1_out_0, none-quant
output 8 dim 20 20 32 1, data_format=0, name=uid_11_out_0b_uid_1_out_0, none-quant
output 9 dim 160 160 32 1, data_format=0, name=uid_20009_sub_uid_1_out_0, none-quant
nbg name=model/yolo11s-seg_10_uint8_a733.nb, size: 7326672.
create network 0: 24693 us.
prepare network: 2986 us.
buffer ptr: 0x226f1600, buffer size: 1228800
network: 0, loop count: 1
run time for this network 0: 37744 us.
output 0, ptr 0x2281d780, size 409600.
output 1, ptr 0x229ad800, size 512000.
output 2, ptr 0x22ba1880, size 204800.
output 3, ptr 0x22c69900, size 102400.
output 4, ptr 0x22ccd9c0, size 128000.
output 5, ptr 0x22d4aa40, size 51200.
output 6, ptr 0x22d7cac0, size 25600.
output 7, ptr 0x22d95b40, size 32000.
output 8, ptr 0x22db5000, size 12800.
output 9, ptr 0x22dc1880, size 819200.
post process time : 11 ms
detection num: 3
1: 95%, [ 126, 126, 568, 420], bicycle
16: 95%, [ 131, 221, 311, 541], dog
2: 86%, [ 467, 75, 691, 172], car
destroy npu finished.
~NpuUint.
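The tensor lines in the log can be cross-checked against the reported buffer sizes. The sketch below copies the dimensions and sizes from the A733 log above and verifies that each reported size equals the product of the dimensions. The log does not state whether "size" counts bytes or elements, so the check is about consistency only. Reading the first nine outputs as three groups at 80×80, 40×40, and 20×20 grids is an interpretation of the log, not something the source states.

```python
"""Sketch: cross-check tensor dimensions and sizes from the A733 run log."""
from math import prod

# (dims, reported size) copied from the log above.
INPUT = ((3, 640, 640, 1), 1228800)
OUTPUTS = [
    ((80, 80, 64, 1), 409600),
    ((80, 80, 80, 1), 512000),
    ((80, 80, 32, 1), 204800),
    ((40, 40, 64, 1), 102400),
    ((40, 40, 80, 1), 128000),
    ((40, 40, 32, 1), 51200),
    ((20, 20, 64, 1), 25600),
    ((20, 20, 80, 1), 32000),
    ((20, 20, 32, 1), 12800),
    ((160, 160, 32, 1), 819200),
]


def mismatches(tensors):
    """Return the tensors whose reported size differs from prod(dims)."""
    return [(dims, size) for dims, size in tensors if prod(dims) != size]


def grid_cells(tensors):
    """Sum width * height over the distinct grids found in the given tensors."""
    grids = {dims[:2] for dims, _ in tensors}
    return sum(width * height for width, height in grids)


if __name__ == "__main__":
    print("mismatches:", mismatches([INPUT] + OUTPUTS))
    print("grid cells over outputs 0-8:", grid_cells(OUTPUTS[:9]))
```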
The frame rate below is derived from the single-frame inference time only. Unless otherwise specified, it does not include the time spent on pre-processing and post-processing.
| SoC | NPU | Model | Input Resolution | Network Creation Time | Network Preparation Time | Single Frame Inference Time | Post-processing Time | Total Time | Frame Rate |
|---|---|---|---|---|---|---|---|---|---|
| Allwinner A733 | Vivante VIP9000 | yolo11s-seg | 640×640 | 24.7 ms | 3.0 ms | 37.7 ms | 11.0 ms | 76.4 ms | 26.5 FPS |
- T527
cd yolo11_seg_demo_linux_t527/
chmod +x ./yolo11_seg_demo_t527
./yolo11_seg_demo_t527 -nb model/yolo11s-seg_10_uint8_t527.nb -i model/dog.jpg
The running result is as follows:
$ ./yolo11_seg_demo_t527 -nb model/yolo11s-seg_10_uint8_t527.nb -i model/dog.jpg
model_file=model/yolo11s-seg_10_uint8_t527.nb, input=model/dog.jpg, loop_count=1, malloc_mbyte=10
VIPLite driver software version 1.13.0.0-AW-2023-10-19
input 0 dim 3 640 640 1, data_format=2, quant_format=0, name=input[0], none-quant
output 0 dim 80 80 64 1, data_format=0, name=uid_20000_sub_uid_1_out_0, none-quant
output 1 dim 80 80 80 1, data_format=0, name=uid_20001_sub_uid_1_out_0, none-quant
output 2 dim 80 80 32 1, data_format=0, name=uid_20002_sub_uid_1_out_0, none-quant
output 3 dim 40 40 64 1, data_format=0, name=uid_20003_sub_uid_1_out_0, none-quant
output 4 dim 40 40 80 1, data_format=0, name=uid_20004_sub_uid_1_out_0, none-quant
output 5 dim 40 40 32 1, data_format=0, name=uid_20005_sub_uid_1_out_0, none-quant
output 6 dim 20 20 64 1, data_format=0, name=uid_20006_sub_uid_1_out_0, none-quant
output 7 dim 20 20 80 1, data_format=0, name=uid_20007_sub_uid_1_out_0, none-quant
output 8 dim 20 20 32 1, data_format=0, name=uid_20008_sub_uid_1_out_0, none-quant
output 9 dim 160 160 32 1, data_format=0, name=uid_20009_sub_uid_1_out_0, none-quant
nbg name=model/yolo11s-seg_10_uint8_t527.nb, size: 8522240.
create network 0: 26153 us.
prepare network: 11813 us.
buffer ptr: 0x38e48600, buffer size: 1228800
network: 0, loop count: 1
run time for this network 0: 94147 us.
output 0, ptr 0x38f74740, size 409600.
output 1, ptr 0x391047c0, size 512000.
output 2, ptr 0x392f8880, size 204800.
output 3, ptr 0x393c0900, size 102400.
output 4, ptr 0x39424980, size 128000.
output 5, ptr 0x394a1a00, size 51200.
output 6, ptr 0x394d3ac0, size 25600.
output 7, ptr 0x394ecb40, size 32000.
output 8, ptr 0x3950bfc0, size 12800.
output 9, ptr 0x39518840, size 819200.
post process time : 51 ms
detection num: 3
1: 94%, [ 126, 124, 568, 420], bicycle
16: 95%, [ 132, 222, 311, 541], dog
2: 82%, [ 467, 76, 692, 172], car
destroy npu finished.
~NpuUint.
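One way to check that the two platforms agree is to compare the detected boxes from both logs by intersection over union (IoU). The sketch below copies the boxes from the A733 and T527 logs and assumes they are in `[x1, y1, x2, y2]` pixel format, which the log does not state explicitly. The names `A733`, `T527`, and `iou` are illustrative.

```python
"""Sketch: compare A733 and T527 detections by intersection over union."""

# Boxes copied from the two run logs, assumed to be (x1, y1, x2, y2).
A733 = {"bicycle": (126, 126, 568, 420),
        "dog": (131, 221, 311, 541),
        "car": (467, 75, 691, 172)}
T527 = {"bicycle": (126, 124, 568, 420),
        "dog": (132, 222, 311, 541),
        "car": (467, 76, 692, 172)}


def area(box):
    """Area of an (x1, y1, x2, y2) box; zero if the box is degenerate."""
    return max(0, box[2] - box[0]) * max(0, box[3] - box[1])


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    inter_w = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0


if __name__ == "__main__":
    for label in A733:
        print(f"{label}: IoU = {iou(A733[label], T527[label]):.3f}")
```

Small differences between the platforms are expected here, since each board runs its own quantized model file and driver version.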
The frame rate below is derived from the single-frame inference time only. Unless otherwise specified, it does not include the time spent on pre-processing and post-processing.
| SoC | NPU | Model | Input Resolution | Network Creation Time | Network Preparation Time | Single Frame Inference Time | Post-processing Time | Total Time | Frame Rate |
|---|---|---|---|---|---|---|---|---|---|
| Allwinner T527 | Vivante VIP9000 | yolo11s-seg | 640×640 | 26.2 ms | 11.8 ms | 94.1 ms | 51.0 ms | 183.1 ms | 10.6 FPS |
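The table values can be reproduced from the raw log figures. The sketch below converts the microsecond timings to milliseconds, sums them with the post-processing time to get the total, and derives the frame rate from the inference time alone, which is how the numbers in both tables appear to have been obtained. The function name `summarize` is illustrative.

```python
"""Sketch: reproduce the Total Time and Frame Rate columns from the run logs."""


def summarize(create_us: int, prepare_us: int, run_us: int, post_ms: float):
    """Return (total time in ms, frames per second), each rounded to 0.1."""
    create_ms = create_us / 1000
    prepare_ms = prepare_us / 1000
    run_ms = run_us / 1000
    total_ms = create_ms + prepare_ms + run_ms + post_ms
    fps = 1000 / run_ms  # based on single-frame inference time only
    return round(total_ms, 1), round(fps, 1)


# Figures copied from the A733 and T527 logs.
A733_LOG = dict(create_us=24693, prepare_us=2986, run_us=37744, post_ms=11)
T527_LOG = dict(create_us=26153, prepare_us=11813, run_us=94147, post_ms=51)

if __name__ == "__main__":
    for name, log in (("A733", A733_LOG), ("T527", T527_LOG)):
        total_ms, fps = summarize(**log)
        print(f"{name}: total {total_ms} ms, {fps} FPS")
```

Network creation and preparation happen once per model load, so in a continuous run the per-frame cost is closer to inference plus pre-processing and post-processing than to the total shown in the tables.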