Fast-SCNN

Fast-SCNN is a lightweight convolutional neural network designed for real-time semantic segmentation of high-resolution images. It uses a multi-branch architecture in which the branches share the early feature-extraction layers. This sharing, together with lightweight layers, reduces the compute cost that conventional segmentation models incur on large images.

  • Key features: Performs pixel-level semantic segmentation in real time, assigning a class label to every pixel of a complex scene with low latency. It is widely used where responsiveness is critical, such as autonomous driving, mobile AR, and robot obstacle avoidance.
  • Version notes: This example uses Fast-SCNN. Its “learning to downsample” module, combined with global feature extraction, improves inference efficiency while preserving key spatial detail. This reduces reliance on high-end GPUs and makes it a common lightweight choice for real-time, high-resolution image understanding on embedded devices.
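
To make the "learning to downsample" plus global feature extraction idea more concrete, the sketch below traces how the feature-map resolution shrinks stage by stage. The stage names and strides are assumptions taken from the layout described in the Fast-SCNN paper (three stride-2 layers, then a further 4x reduction); they are not read from the model shipped in this repository, which may differ.

```python
# Illustrative stage list: (name, stride). These strides are an assumption
# based on the Fast-SCNN paper, not values read from fast_scnn.cix.
STAGES = [
    ("learning_to_downsample/conv", 2),
    ("learning_to_downsample/dsconv1", 2),
    ("learning_to_downsample/dsconv2", 2),
    ("global_feature_extractor/bottleneck1", 2),
    ("global_feature_extractor/bottleneck2", 2),
    ("global_feature_extractor/bottleneck3", 1),
]


def trace_resolution(height, width, stages=STAGES):
    """Return [(stage_name, (h, w)), ...] after applying each stage's stride."""
    trace = []
    for name, stride in stages:
        # Ceiling division mirrors a strided convolution with "same" padding.
        height = -(-height // stride)
        width = -(-width // stride)
        trace.append((name, (height, width)))
    return trace


if __name__ == "__main__":
    for name, shape in trace_resolution(1024, 2048):
        print(f"{name:40s} {shape}")
```

Under these assumptions, a 1024x2048 input is reduced to 128x256 by the shared downsampling layers and to 32x64 inside the global feature extractor, which is why most of the computation runs on small feature maps.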

Environment setup

You need to set up the environment in advance.

Quick start

Download model files

O6 / O6N
cd ai_model_hub_25_Q3/models/ComputeVision/Semantic_Segmentation/torch_fast_scnn
wget https://www.modelscope.cn/models/cix/ai_model_hub_25_Q3/resolve/master/models/ComputeVision/Semantic_Segmentation/torch_fast_scnn/fast_scnn.cix
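
An interrupted download can leave a missing or truncated model file, so a quick check before running inference may save debugging time. The helper below is a minimal sketch; the function name `check_model_file` and the `min_bytes` threshold are hypothetical and not part of the repository.

```python
from pathlib import Path


def check_model_file(path="fast_scnn.cix", min_bytes=1):
    """Return the file size in bytes, or raise if the file is missing or too small."""
    model = Path(path)
    if not model.is_file():
        raise FileNotFoundError(f"{model} not found; re-run the wget command")
    size = model.stat().st_size
    if size < min_bytes:
        # A very small file usually means the download was cut short.
        raise ValueError(f"{model} is only {size} bytes; the download may be incomplete")
    return size


if __name__ == "__main__":
    print(f"fast_scnn.cix: {check_model_file()} bytes")
```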

Test the model

Info: Activate the virtual environment before running.

O6 / O6N
python3 inference_npu.py

Full conversion workflow

Project structure

├── cfg
├── datasets
├── fast_scnn.cix
├── inference_npu.py
├── inference_pt.py
├── model
├── ReadMe.md
└── test_data

Quantize and convert the model

Linux PC
cd ..
cixbuild cfg/fast_scnnbuild.cfg

Copy to device

After conversion, copy the .cix model files to the device.

Test inference on the host

Run the inference script

Linux PC
python3 inference_pt.py
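
The internals of inference_pt.py are not shown on this page. As a rough illustration of what semantic segmentation post-processing usually involves, the sketch below turns per-class scores into a label map by taking the highest-scoring class at each pixel. The function name `argmax_labels` and the `[num_classes][height][width]` layout are assumptions, and a real script would normally use NumPy or PyTorch rather than nested lists.

```python
def argmax_labels(scores):
    """Convert scores shaped [num_classes][height][width] to labels [height][width]."""
    num_classes = len(scores)
    height = len(scores[0])
    width = len(scores[0][0])
    return [
        # For each pixel, pick the class index with the highest score.
        [max(range(num_classes), key=lambda c: scores[c][y][x]) for x in range(width)]
        for y in range(height)
    ]


if __name__ == "__main__":
    # Two classes, one row, two pixels.
    demo = [
        [[0.1, 0.9]],  # class 0 scores
        [[0.8, 0.2]],  # class 1 scores
    ]
    print(argmax_labels(demo))
```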

Inference output

Deploy on NPU

Run the inference script

O6 / O6N
python3 inference_npu.py

Inference output

O6 / O6N
$ python inference_npu.py
npu: noe_init_context success
npu: noe_load_graph success
Input tensor count is 1.
Output tensor count is 1.
npu: noe_create_job success
npu: noe_clean_job success
npu: noe_unload_graph success
npu: noe_deinit_context success
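
When the script runs unattended, it can be handy to confirm that every NPU stage in the log above reported success. The following is a minimal sketch that scans captured output for the `npu: ... success` lines shown above; the helper name `missing_npu_steps` is hypothetical, and the expected step list is copied from this sample log rather than from any official specification.

```python
# Step names copied from the sample log above, in the order they appear.
EXPECTED_STEPS = [
    "noe_init_context",
    "noe_load_graph",
    "noe_create_job",
    "noe_clean_job",
    "noe_unload_graph",
    "noe_deinit_context",
]


def missing_npu_steps(log_text, expected=EXPECTED_STEPS):
    """Return the expected steps that have no 'npu: <step> success' line."""
    succeeded = set()
    for line in log_text.splitlines():
        line = line.strip()
        if line.startswith("npu:") and line.endswith("success"):
            # "npu: noe_load_graph success" -> "noe_load_graph"
            succeeded.add(line[len("npu:"):].rsplit("success", 1)[0].strip())
    return [step for step in expected if step not in succeeded]


if __name__ == "__main__":
    import sys

    missing = missing_npu_steps(sys.stdin.read())
    print("all NPU steps succeeded" if not missing else f"missing: {missing}")
```

One way to use it is to pipe the script output into this helper, for example by saving the helper as a file of your choice and running `python3 inference_npu.py | python3 <helper file>`.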

    Radxa-docs © 2026 by Radxa Computer (Shenzhen) Co.,Ltd. is licensed under CC BY 4.0