Parallel AI Fusion

gst-ai-parallel-inference runs multiple AI models simultaneously on a single video stream. All models run inference in parallel on the NPU, and their results are fused into a single display output.

The default configuration runs four models concurrently: object detection (YOLOX), classification (InceptionV3), pose detection (HRNet), and segmentation (DeepLabV3+). All four execute on the DSP.

Prerequisites

Steps

1. Verify Models

ls -l /etc/models/{yolox_quantized,inception_v3_quantized,hrnet_pose_quantized,deeplabv3_plus_mobilenet_quantized}.tflite
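If `ls` reports an error, it can be hard to see at a glance which file is absent. The sketch below checks each of the four model files named above and prints one status line per file. The helper name `check_models` is hypothetical and not part of the shipped tooling; it only assumes the `/etc/models` path used in the command above.

```shell
# Hypothetical helper (not shipped with the image): report which of the
# four expected model files are present in a directory.
# The directory defaults to /etc/models, as used in the ls command above.
check_models() {
    model_dir="${1:-/etc/models}"
    missing=0
    for name in yolox_quantized inception_v3_quantized \
                hrnet_pose_quantized deeplabv3_plus_mobilenet_quantized; do
        if [ -f "$model_dir/$name.tflite" ]; then
            echo "OK      $name.tflite"
        else
            echo "MISSING $name.tflite"
            missing=$((missing + 1))
        fi
    done
    # The exit status is the number of missing files (0 means all present).
    return "$missing"
}

check_models /etc/models || echo "one or more models are missing"
```

A zero exit status means all four files were found, so the function can also gate the Run step in a script.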

2. View Configuration

cat /etc/configs/config-parallel-inference.json
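Before running, it may be worth confirming that the configuration file still parses after any manual edits. The following is a minimal sketch, assuming the file is strict JSON and that `python3` is installed on the device; `check_config` is a hypothetical helper name, and `json.tool` is the JSON validator from the Python standard library.

```shell
# Hypothetical helper: succeed only if the file is readable and parses as JSON.
# Assumes python3 is available; json.tool ships with the Python standard library.
check_config() {
    config_file="$1"
    if [ ! -r "$config_file" ]; then
        echo "cannot read $config_file" >&2
        return 2
    fi
    if ! python3 -m json.tool "$config_file" >/dev/null 2>&1; then
        echo "invalid JSON: $config_file" >&2
        return 1
    fi
    echo "valid JSON: $config_file"
}

check_config /etc/configs/config-parallel-inference.json \
    || echo "fix the configuration before running"
```

This checks syntax only. It does not verify that the keys or model paths inside the file are ones the application accepts.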

3. Run

radxa@airbox$ gst-ai-parallel-inference --config-file=/etc/configs/config-parallel-inference.json

Press Ctrl + C to stop.

Expected Output

VERBOSE: Replacing 329 out of 329 node(s) with delegate (TfLiteQnnDelegate) node
VERBOSE: Replacing 142 out of 142 node(s) with delegate (TfLiteQnnDelegate) node
VERBOSE: Replacing 518 out of 518 node(s) with delegate (TfLiteQnnDelegate) node
VERBOSE: Replacing 136 out of 136 node(s) with delegate (TfLiteQnnDelegate) node
Pipeline state changed from PAUSED to PLAYING

The display shows bounding boxes, classification labels, human skeletons, and segmentation masks at the same time, with all four AI results overlaid on the same frame.

Validation

  • All four models are fully delegated (329 + 142 + 518 + 136 = 1,125 nodes) and run in parallel on the DSP
  • Pipeline reaches PLAYING state
  • Display shows all four inference results simultaneously
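The first two checks can be read off a saved log instead of by eye. The sketch below assumes the log lines have the same wording as in Expected Output; `summarize_delegation` is a hypothetical helper name. It counts the delegate lines, sums the delegated and total node counts, and reports whether the PLAYING transition was logged.

```shell
# Hypothetical helper: read a saved log on stdin and summarize it.
# Assumes the line wording shown under Expected Output.
summarize_delegation() {
    awk '
        /Replacing [0-9]+ out of [0-9]+ node/ {
            # Locate the word "Replacing" so that any log prefix is tolerated.
            for (i = 1; i <= NF; i++) {
                if ($i == "Replacing") {
                    models++
                    delegated += $(i + 1)   # nodes handed to the delegate
                    total += $(i + 4)       # nodes in the model graph
                }
            }
        }
        /Pipeline state changed from PAUSED to PLAYING/ { playing = 1 }
        END {
            printf "models=%d delegated=%d total=%d playing=%s\n",
                   models, delegated, total, (playing ? "yes" : "no")
        }
    '
}

# Demo using the lines from Expected Output.
summarize_delegation <<'EOF'
VERBOSE: Replacing 329 out of 329 node(s) with delegate (TfLiteQnnDelegate) node
VERBOSE: Replacing 142 out of 142 node(s) with delegate (TfLiteQnnDelegate) node
VERBOSE: Replacing 518 out of 518 node(s) with delegate (TfLiteQnnDelegate) node
VERBOSE: Replacing 136 out of 136 node(s) with delegate (TfLiteQnnDelegate) node
Pipeline state changed from PAUSED to PLAYING
EOF
# prints: models=4 delegated=1125 total=1125 playing=yes
```

To use it on a real run, capture the application output to a file first, for example by appending `2>&1 | tee run.log` to the run command, then call `summarize_delegation < run.log`. If `delegated` is lower than `total`, some nodes were not handed to the delegate.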

This demonstrates that the Q900 HTP V73 NPU supports concurrent multi-model processing.

    Radxa-docs © 2026 by Radxa Computer (Shenzhen) Co.,Ltd. is licensed under CC BY 4.0