Performance Benchmark

Performance benchmark is the best way to understand the performance of the hardware platform.

info

The benchmark test results may differ due to specific application scenarios and model optimization levels, and are for reference only.

Test Description

Test tool: axcl_run_model
Batch Size: 1 or 8
Unit: FPS (Frame/Second)

info

Note: Due to performance differences in memory copy (memcopy) and PCIe across different host platforms, axcl_run_model only measures the inference time of the neural network model on the Device side (excluding host-side operations).

Vision Model

Vision Model	Input Size	Batch 1(IPS)	Batch 8(IPS)
Inceptionv1	224	1073	2494
Inceptionv3	224	478	702
MobileNetv1	224	1508	4854
MobileNetv2	224	1366	5073
ResNet18	224	1066	2254
ResNet50	224	576	1045
SqueezeNet11	224	1560	5961
Swin-T	224	342	507
ViT-B/16	224	162	207
YOLOv5s	640	326	394
YOLOv6s	640	282	322
YOLOv8s	640	248	279
YOLOv9s	640	237
YOLOv10s	640	298
YOLOv11n	640	860
YOLOv11s	640	305
YOLOv11m	640	114
YOLOv11l	640	87
YOLOv11x	640	41

Audio Model

Audio Model	RTF
Whisper-Tiny	0.03
Whisper-Small	0.18
MeloTTS	0.04

LLM

LLM	Prompt length(tokens)	TTFT(ms)	Generate(tokens/s)
Qwen2.5-0.5B	128	188	28

VLM

VLM	Input Image	Image Encoder (ms)	Prompt length (tokens)	TTFT (ms)	Generate (tokens/s)
InternVL2-1B	448*448	4200	320	425	29

Test Description​

Vision Model​

Audio Model​

LLM​

VLM​

Test Description

Vision Model

Audio Model

LLM

VLM