
Performance Benchmark

Benchmarking is the best way to understand how fast neural network models run on the hardware platform.

Note: Benchmark results may vary depending on the specific application scenario and the degree of model optimization. The data below is for reference only.

Test notes

  • Test tool: axcl_run_model
  • Batch size: 1 or 8
  • Unit: FPS (frames per second)

Note: Because memory-copy and PCIe performance differ across hosts, axcl_run_model measures only the inference time on the device.
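Because the tables report throughput in FPS, it can help to convert a figure into per-frame latency. The following is a minimal sketch, assuming the batch=8 figures count individual frames per second rather than batches per second. The function names are hypothetical and are not part of axcl_run_model.

```python
def fps_to_latency_ms(fps: float) -> float:
    """Average on-device time per frame, in milliseconds."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


def batch_latency_ms(fps: float, batch_size: int) -> float:
    """Time for one batched inference call, assuming fps counts frames."""
    return batch_size * fps_to_latency_ms(fps)


# ResNet50 figures from the vision table: 576 FPS (single), 1045 FPS (batch=8)
single = fps_to_latency_ms(576)
batched = batch_latency_ms(1045, 8)
print(f"single: {single:.2f} ms, batch of 8: {batched:.2f} ms")
```

These values cover on-device inference only, so end-to-end latency on a given host will be higher once data transfer is included.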

Vision models

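| Vision model | Input size | FPS (single) | FPS (batch=8) |
|--------------|------------|--------------|---------------|
| Inceptionv1  | 224        | 1073         | 2494          |
| Inceptionv3  | 224        | 478          | 702           |
| MobileNetv1  | 224        | 1508         | 4854          |
| MobileNetv2  | 224        | 1366         | 5073          |
| ResNet18     | 224        | 1066         | 2254          |
| ResNet50     | 224        | 576          | 1045          |
| SqueezeNet11 | 224        | 1560         | 5961          |
| Swin-T       | 224        | 342          | 507           |
| ViT-B/16     | 224        | 162          | 207           |
| YOLOv5s      | 640        | 326          | 394           |
| YOLOv6s      | 640        | 282          | 322           |
| YOLOv8s      | 640        | 248          | 279           |
| YOLOv9s      | 640        | 237          | -             |
| YOLOv10s     | 640        | 298          | -             |
| YOLOv11n     | 640        | 860          | -             |
| YOLOv11s     | 640        | 305          | -             |
| YOLOv11m     | 640        | 114          | -             |
| YOLOv11l     | 640        | 87           | -             |
| YOLOv11x     | 640        | 41           | -             |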
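To see how much batching helps a given model, the two FPS columns can be compared directly. The sketch below uses a few rows copied from the table above; the helper name `batch_speedup` is hypothetical.

```python
# (single FPS, batch=8 FPS) copied from the vision model table
results = {
    "MobileNetv2": (1366, 5073),
    "ResNet50": (576, 1045),
    "YOLOv8s": (248, 279),
}


def batch_speedup(single_fps: float, batch_fps: float) -> float:
    """Throughput gain of batch=8 over single-frame inference."""
    return batch_fps / single_fps


for name, (single, batched) in results.items():
    print(f"{name}: {batch_speedup(single, batched):.2f}x")
```

For these rows the gain is roughly 3.7x for MobileNetv2, 1.8x for ResNet50, and 1.1x for YOLOv8s, so in this data the small 224-input classifiers benefit far more from batching than the 640-input detector.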

Audio models

| Audio model   | Real-time factor |
|---------------|------------------|
| Whisper-Tiny  | 0.03             |
| Whisper-Small | 0.18             |
| MeloTTS       | 0.04             |
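The page does not define real-time factor. The sketch below assumes the conventional definition, processing time divided by audio duration, so a value below 1 means faster than real time. The function name is hypothetical.

```python
def processing_time_s(audio_seconds: float, rtf: float) -> float:
    """Estimated processing time, assuming RTF = processing time / audio duration."""
    if audio_seconds < 0 or rtf < 0:
        raise ValueError("audio_seconds and rtf must be non-negative")
    return audio_seconds * rtf


# One minute of audio, using the RTF values from the audio model table
for model, rtf in [("Whisper-Tiny", 0.03), ("Whisper-Small", 0.18), ("MeloTTS", 0.04)]:
    print(f"{model}: {processing_time_s(60, rtf):.1f} s per 60 s of audio")
```

Under that assumption, Whisper-Tiny would need about 1.8 s to transcribe one minute of audio, and Whisper-Small about 10.8 s.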

Large language models

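| Model        | Prompt length (tokens) | First token latency (ms) | Generation speed (tokens/s) |
|--------------|------------------------|--------------------------|-----------------------------|
| Qwen2.5-0.5B | 128                    | 188                      | 28                          |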
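First token latency and generation speed can be combined into a rough estimate of total response time. This is a minimal sketch, assuming the generation speed stays constant for the whole reply and that the prompt length matches the benchmarked 128 tokens. The function name is hypothetical.

```python
def estimated_response_time_s(ttft_ms: float, tokens_per_s: float, new_tokens: int) -> float:
    """First-token latency plus decode time for the remaining tokens."""
    if new_tokens < 1:
        raise ValueError("new_tokens must be at least 1")
    if tokens_per_s <= 0:
        raise ValueError("tokens_per_s must be positive")
    return ttft_ms / 1000.0 + (new_tokens - 1) / tokens_per_s


# Qwen2.5-0.5B figures from the table: 188 ms first token, 28 tokens/s
total = estimated_response_time_s(188, 28, new_tokens=100)
print(f"about {total:.1f} s for a 100-token reply")
```

With the table figures, a 100-token reply comes to roughly 3.7 s, almost all of it decode time rather than first-token latency.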

Vision-language models

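| Model        | Input size | Image encoder time (ms) | Prompt length (tokens) | First token latency (ms) | Generation speed (tokens/s) |
|--------------|------------|-------------------------|------------------------|--------------------------|-----------------------------|
| InternVL2-1B | 448*448    | 4200                    | 320                    | 425                      | 29                          |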


    Radxa-docs © 2026 by Radxa Computer (Shenzhen) Co.,Ltd. is licensed under CC BY 4.0