QAI Hub Models

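Qualcomm® AI Hub Models (QAI-Hub-Models) builds on the cloud service provided by QAI-Hub. It lets you quantize, compile, run inference on, profile, and download the models in the model list from the command line, using cloud-hosted devices.

Usage

Install qai_hub_models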

Device
pip3 install qai_hub_models

Configure the API Token

Tip

First register an account on Qualcomm® AI Hub and log in to obtain your API Token.

Device
qai-hub configure --api_token API_TOKEN
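
To confirm that the token was stored, a small check can read the client configuration file. This is only a sketch under an assumption that this page does not state: that the CLI writes an INI file at ~/.qai_hub/client.ini with an [api] section containing api_token. The helper name has_api_token is hypothetical. Adjust the path or keys if your installation differs.

```python
import configparser
from pathlib import Path

# Assumed location of the file written by `qai-hub configure` (not stated on this page).
DEFAULT_CONFIG = Path.home() / ".qai_hub" / "client.ini"


def has_api_token(config_path=DEFAULT_CONFIG):
    """Return True if the INI file exists and holds a non-empty api_token under [api]."""
    # Interpolation is disabled so that a token containing '%' is read verbatim.
    parser = configparser.ConfigParser(interpolation=None)
    # ConfigParser.read() skips missing files and returns the list of files it parsed.
    if not parser.read(config_path):
        return False
    return bool(parser.get("api", "api_token", fallback="").strip())


if __name__ == "__main__":
    print("API token configured:", has_api_token())
```

The check only looks for a non-empty value. It does not contact Qualcomm® AI Hub, so it cannot tell whether the token is valid.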

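Examples

QAI Hub Models Model Compilation Example

This example compiles RealESRGAN_x4plus into a Context-Binary model that can run inference on the NPU, with a 128x128 input size and w8a8 quantization, to briefly show how to compile a model with Qualcomm® AI Hub Models.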

Device
export PRODUCT_CHIP=qualcomm-sc8280xp-proxy
Device
python3 -m qai_hub_models.models.real_esrgan_x4plus.export --chipset ${PRODUCT_CHIP} --target-runtime qnn_context_binary  --height 128 --width 128 --quantize w8a8 --num-calibration-samples 10

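--chipset: the chip the model will run on

--target-runtime: the target runtime

--height: the input height of the target model

--width: the input width of the target model

--quantize: the quantization scheme

--num-calibration-samples: the number of images in the quantization calibration set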
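
When the same export is repeated for several chips or input sizes, it can help to assemble the command from named parameters. The sketch below only builds the argument list from the flags described above and does not run anything. The function name build_export_command is hypothetical, and other models' export modules may not accept the same flags, so check each one with --help.

```python
import shlex


def build_export_command(model, chipset, target_runtime, height, width,
                         quantize, num_calibration_samples):
    """Assemble the argv list for `python3 -m qai_hub_models.models.<model>.export`."""
    return [
        "python3", "-m", f"qai_hub_models.models.{model}.export",
        "--chipset", chipset,
        "--target-runtime", target_runtime,
        "--height", str(height),
        "--width", str(width),
        "--quantize", quantize,
        "--num-calibration-samples", str(num_calibration_samples),
    ]


cmd = build_export_command(
    model="real_esrgan_x4plus",
    chipset="qualcomm-sc8280xp-proxy",
    target_runtime="qnn_context_binary",
    height=128,
    width=128,
    quantize="w8a8",
    num_calibration_samples=10,
)
# shlex.join quotes arguments where needed, giving a line that can be pasted into a shell.
print(shlex.join(cmd))
```

The printed line is equivalent to the command above with PRODUCT_CHIP expanded. Passing the list to subprocess.run(cmd, check=True) would start the export, which uploads the model and schedules cloud jobs.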

Tip

For detailed usage of qai_hub_models.models.real_esrgan_x4plus.export, run it with --help.

All information and logs for the model compilation can be viewed under Jobs on Qualcomm® AI Hub.
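
Each job that the export schedules is announced in the console with its type and ID, followed by its Jobs URL. As a rough sketch, the lines can be collected from a saved log so that every job page is easy to reopen later. It assumes the "Scheduled ... job (...) successfully" wording seen in the export output, which may change between releases. The function name list_jobs is hypothetical.

```python
import re

# Matches lines such as: "Scheduled compile job (jgon7zqkp) successfully."
JOB_LINE = re.compile(r"Scheduled (\w+) job \((\w+)\) successfully")


def list_jobs(log_text):
    """Return (job_type, job_id, url) tuples in the order the jobs were scheduled."""
    return [
        (kind, job_id, f"https://app.aihub.qualcomm.com/jobs/{job_id}/")
        for kind, job_id in JOB_LINE.findall(log_text)
    ]


# Two lines copied from an export log, used here as sample input.
sample_log = """\
Scheduled compile job (jgon7zqkp) successfully. To see the status and results:
https://app.aihub.qualcomm.com/jobs/jgon7zqkp/
Scheduled quantize job (jpew0o3vp) successfully. To see the status and results:
https://app.aihub.qualcomm.com/jobs/jpew0o3vp/
"""

for kind, job_id, url in list_jobs(sample_log):
    print(f"{kind:9s} {job_id}  {url}")
```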

(.venv) (base) radxa@vms-max:/mnt/sda1/qualcomm/qai-hub/ai_hub_model$ python3 -m qai_hub_models.models.real_esrgan_x4plus.export --chipset "qualcomm-qcs6490-proxy" --target-runtime qnn_context_binary  --height 128 --width 128 --quantize w8a8 --num-calibration-samples 10
Quantizing model real_esrgan_x4plus.
Uploading tmplaenxfnu.pt
100%|██████████████████████████████████████████████████████████████████████████████████████████████████| 64.8M/64.8M [00:06<00:00, 11.1MB/s]
Scheduled compile job (jgon7zqkp) successfully. To see the status and results:
https://app.aihub.qualcomm.com/jobs/jgon7zqkp/

Loading 10 calibration samples.
Waiting for compile job (jgon7zqkp) completion. Type Ctrl+C to stop waiting at any time.
✅ SUCCESS
Uploading dataset: 700kB [00:01, 522kB/s]
Scheduled quantize job (jpew0o3vp) successfully. To see the status and results:
https://app.aihub.qualcomm.com/jobs/jpew0o3vp/

Waiting for quantize job (jpew0o3vp) completion. Type Ctrl+C to stop waiting at any time.
✅ SUCCESS
Optimizing model real_esrgan_x4plus to run on-device
Scheduled compile job (jgdqynlz5) successfully. To see the status and results:
https://app.aihub.qualcomm.com/jobs/jgdqynlz5/

Profiling model real_esrgan_x4plus on a hosted device.
Waiting for compile job (jgdqynlz5) completion. Type Ctrl+C to stop waiting at any time.
✅ SUCCESS
Scheduled profile job (jp4d6n01p) successfully. To see the status and results:
https://app.aihub.qualcomm.com/jobs/jp4d6n01p/

Running inference for real_esrgan_x4plus on a hosted device with example inputs.
Downloading data at https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/super_resolution/v2/super_resolution_input.jpg to /home/zifeng/.qaihm/models/super_resolution/v2/super_resolution_input.jpg
100%|██████████████████████████████████████████████████████████████████████████████████████████████████| 16.5k/16.5k [00:00<00:00, 92.2MB/s]
Done
Uploading dataset: 104kB [00:01, 100kB/s]
Scheduled inference job (jpx6892lp) successfully. To see the status and results:
https://app.aihub.qualcomm.com/jobs/jpx6892lp/

real_esrgan_x4plus.bin: 100%|██████████████████████████████████████████████████████████████████████████| 21.5M/21.5M [00:02<00:00, 7.97MB/s]
Downloaded model to /mnt/sda1/qualcomm/qai-hub/ai_hub_model/build/real_esrgan_x4plus/real_esrgan_x4plus.bin
Waiting for profile job (jp4d6n01p) completion. Type Ctrl+C to stop waiting at any time.
✅ SUCCESS

------------------------------------------------------------
Performance results on-device for Real_Esrgan_X4Plus.
------------------------------------------------------------
Device : QCS6490 (Proxy) (ANDROID 12)
Runtime : QNN_CONTEXT_BINARY
Estimated inference time (ms) : 171.8
Estimated peak memory usage (MB): [0, 13]
Total # Ops : 1027
Compute Unit(s) : npu (1027 ops) gpu (0 ops) cpu (0 ops)
------------------------------------------------------------
More details: https://app.aihub.qualcomm.com/jobs/jp4d6n01p/

tmpz9r34tur.h5: 100%|███████████████████████████████████████████████████████████████████████████████████| 1.05M/1.05M [00:03<00:00, 357kB/s]

Comparing on-device vs. local-cpu inference for Real_Esrgan_X4Plus.
+----------------+------------------+--------+
| output_name | shape | psnr |
+================+==================+========+
| upscaled_image | (1, 512, 512, 3) | 24.49 |
+----------------+------------------+--------+

- psnr: Peak Signal-to-Noise Ratio (PSNR). >30 dB is typically considered good.

More details: https://app.aihub.qualcomm.com/jobs/jpx6892lp/

Run compiled model on a hosted device on sample data using:
python /mnt/sda1/qualcomm/qai-hub/.venv/lib/python3.10/site-packages/qai_hub_models/models/real_esrgan_x4plus/demo.py --eval-mode on-device --hub-model-id mm63gld2m --chipset qualcomm-qcs6490-proxy
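
The comparison table in the log above reports a PSNR of 24.49 dB between on-device and local CPU output, below the 30 dB rule of thumb that the tool prints. For reference, the sketch below computes PSNR from its standard definition, 10 * log10(peak^2 / MSE). How the export script computes its value internally (data range, layout, averaging) is not stated here, so this is only an illustration of the metric and not the tool's implementation.

```python
import math


def psnr(reference, test, peak=1.0):
    """Peak Signal-to-Noise Ratio in dB between two equal-length sequences of values.

    `peak` is the largest possible signal value (1.0 for normalized data, 255 for 8-bit).
    """
    if len(reference) != len(test) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and have the same length")
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(peak ** 2 / mse)


# Made-up pixel values, for illustration only.
cpu_output = [0.00, 0.50, 1.00, 0.25]
device_output = [0.00, 0.45, 0.95, 0.30]
value = psnr(cpu_output, device_output)
print(f"PSNR: {value:.2f} dB, above 30 dB: {value > 30}")
```

A lower PSNR after w8a8 quantization means the quantized output deviates more from the floating-point reference. Whether a given value is acceptable depends on the application.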

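QAI Hub Models Model Inference Demo

Following the hint printed at the end of the model compilation example, you can run the compiled model on a cloud-hosted device and view the inference results.

Tip

Change the hub-model-id argument to the hub-model-id printed at the end of your own model compilation output.

Where to find the hub-model-id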
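
If the export output was saved to a file, the ID can also be pulled out of it rather than copied by hand. This minimal sketch assumes that the final hint line contains "--hub-model-id <id>", as in the compilation log above. The function name extract_hub_model_id is hypothetical.

```python
import re


def extract_hub_model_id(log_text):
    """Return the last value that follows --hub-model-id in the log, or None if absent."""
    matches = re.findall(r"--hub-model-id\s+(\S+)", log_text)
    return matches[-1] if matches else None


# Shortened version of the hint line printed at the end of the export.
sample_line = (
    "python demo.py --eval-mode on-device "
    "--hub-model-id mm63gld2m --chipset qualcomm-qcs6490-proxy"
)
print(extract_hub_model_id(sample_line))  # mm63gld2m
```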

Device
python3 -m qai_hub_models.models.real_esrgan_x4plus.demo --eval-mode on-device --hub-model-id mm63gld2m --chipset ${PRODUCT_CHIP}
(.venv) (gen_py3.10) radxa@vms-max:~/Job/git_clone/ai-hub-models$ python3 -m qai_hub_models.models.real_esrgan_x4plus.demo --eval-mode on-device --hub-model-id mm63gld2m --chipset qualcomm-qcs6490-proxy
/mnt/sda1/git_clone/ai-hub-models/.venv/lib/python3.10/site-packages/qai_hub_models/utils/onnx_torch_wrapper.py:22: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81.
import pkg_resources
Model Loaded
Uploading dataset: 104kB [00:02, 52.5kB/s]
Scheduled inference job (jgj29j8e5) successfully. To see the status and results:
https://app.aihub.qualcomm.com/jobs/jgj29j8e5/

Waiting for inference job (jgj29j8e5) completion. Type Ctrl+C to stop waiting at any time.
✅ SUCCESS
tmpe9b4q_c2.h5: 100%|██████████████████████████████████████████████████████████████████████████████████| 1.03M/1.03M [00:00<00:00, 2.04MB/s]
Displaying original image
Displaying upscaled image

The result image is on the left and the input image is on the right.

Model List

Computer Vision

Model | README
Image Classification
Beit | qai_hub_models.models.beit
ConvNext-Base | qai_hub_models.models.convnext_base
ConvNext-Tiny | qai_hub_models.models.convnext_tiny
DLA-102-X | qai_hub_models.models.dla102x
DenseNet-121 | qai_hub_models.models.densenet121
EfficientFormer | qai_hub_models.models.efficientformer
EfficientNet-B0 | qai_hub_models.models.efficientnet_b0
EfficientNet-B4 | qai_hub_models.models.efficientnet_b4
EfficientNet-V2-s | qai_hub_models.models.efficientnet_v2_s
EfficientViT-b2-cls | qai_hub_models.models.efficientvit_b2_cls
EfficientViT-l2-cls | qai_hub_models.models.efficientvit_l2_cls
GoogLeNet | qai_hub_models.models.googlenet
Inception-v3 | qai_hub_models.models.inception_v3
LeViT | qai_hub_models.models.levit
MNASNet05 | qai_hub_models.models.mnasnet05
Mobile-VIT | qai_hub_models.models.mobile_vit
MobileNet-v2 | qai_hub_models.models.mobilenet_v2
MobileNet-v3-Large | qai_hub_models.models.mobilenet_v3_large
MobileNet-v3-Small | qai_hub_models.models.mobilenet_v3_small
NASNet | qai_hub_models.models.nasnet
RegNet | qai_hub_models.models.regnet
ResNeXt101 | qai_hub_models.models.resnext101
ResNeXt50 | qai_hub_models.models.resnext50
ResNet101 | qai_hub_models.models.resnet101
ResNet18 | qai_hub_models.models.resnet18
ResNet50 | qai_hub_models.models.resnet50
Sequencer2D | qai_hub_models.models.sequencer2d
Shufflenet-v2 | qai_hub_models.models.shufflenet_v2
SqueezeNet-1.1 | qai_hub_models.models.squeezenet1_1
Swin-Base | qai_hub_models.models.swin_base
Swin-Small | qai_hub_models.models.swin_small
Swin-Tiny | qai_hub_models.models.swin_tiny
VIT | qai_hub_models.models.vit
WideResNet50 | qai_hub_models.models.wideresnet50
Image Editing
AOT-GAN | qai_hub_models.models.aotgan
LaMa-Dilated | qai_hub_models.models.lama_dilated
Image Generation
Simple-Bev | qai_hub_models.models.simple_bev_cam
Super Resolution
ESRGAN | qai_hub_models.models.esrgan
QuickSRNetLarge | qai_hub_models.models.quicksrnetlarge
QuickSRNetMedium | qai_hub_models.models.quicksrnetmedium
QuickSRNetSmall | qai_hub_models.models.quicksrnetsmall
Real-ESRGAN-General-x4v3 | qai_hub_models.models.real_esrgan_general_x4v3
Real-ESRGAN-x4plus | qai_hub_models.models.real_esrgan_x4plus
SESR-M5 | qai_hub_models.models.sesr_m5
XLSR | qai_hub_models.models.xlsr
Semantic Segmentation
BGNet | qai_hub_models.models.bgnet
BiseNet | qai_hub_models.models.bisenet
DDRNet23-Slim | qai_hub_models.models.ddrnet23_slim
DeepLabV3-Plus-MobileNet | qai_hub_models.models.deeplabv3_plus_mobilenet
DeepLabV3-ResNet50 | qai_hub_models.models.deeplabv3_resnet50
DeepLabXception | qai_hub_models.models.deeplab_xception
EfficientViT-l2-seg | qai_hub_models.models.efficientvit_l2_seg
FCN-ResNet50 | qai_hub_models.models.fcn_resnet50
FFNet-122NS-LowRes | qai_hub_models.models.ffnet_122ns_lowres
FFNet-40S | qai_hub_models.models.ffnet_40s
FFNet-54S | qai_hub_models.models.ffnet_54s
FFNet-78S | qai_hub_models.models.ffnet_78s
FFNet-78S-LowRes | qai_hub_models.models.ffnet_78s_lowres
FastSam-S | qai_hub_models.models.fastsam_s
FastSam-X | qai_hub_models.models.fastsam_x
HRNet-W48-OCR | qai_hub_models.models.hrnet_w48_ocr
Mask2Former | qai_hub_models.models.mask2former
MediaPipe-Selfie-Segmentation | qai_hub_models.models.mediapipe_selfie
MobileSam | qai_hub_models.models.mobilesam
PidNet | qai_hub_models.models.pidnet
SINet | qai_hub_models.models.sinet
SalsaNext | qai_hub_models.models.salsanext
Segformer-Base | qai_hub_models.models.segformer_base
Segment-Anything-Model-2 | qai_hub_models.models.sam2
Unet-Segmentation | qai_hub_models.models.unet_segmentation
YOLOv11-Segmentation | qai_hub_models.models.yolov11_seg
YOLOv8-Segmentation | qai_hub_models.models.yolov8_seg
Video Classification
ResNet-2Plus1D | qai_hub_models.models.resnet_2plus1d
ResNet-3D | qai_hub_models.models.resnet_3d
ResNet-Mixed-Convolution | qai_hub_models.models.resnet_mixed
Video-MAE | qai_hub_models.models.video_mae
Video Generation
First-Order-Motion-Model | qai_hub_models.models.fomm
Object Detection
3D-Deep-BOX | qai_hub_models.models.deepbox
Conditional-DETR-ResNet50 | qai_hub_models.models.conditional_detr_resnet50
DETR-ResNet101 | qai_hub_models.models.detr_resnet101
DETR-ResNet101-DC5 | qai_hub_models.models.detr_resnet101_dc5
DETR-ResNet50 | qai_hub_models.models.detr_resnet50
DETR-ResNet50-DC5 | qai_hub_models.models.detr_resnet50_dc5
Facial-Attribute-Detection | qai_hub_models.models.face_attrib_net
Lightweight-Face-Detection | qai_hub_models.models.face_det_lite
MediaPipe-Face-Detection | qai_hub_models.models.mediapipe_face
MediaPipe-Hand-Detection | qai_hub_models.models.mediapipe_hand
PPE-Detection | qai_hub_models.models.gear_guard_net
Person-Foot-Detection | qai_hub_models.models.foot_track_net
RF-DETR | qai_hub_models.models.rf_detr
RTMDet | qai_hub_models.models.rtmdet
YOLOv10-Detection | qai_hub_models.models.yolov10_det
YOLOv11-Detection | qai_hub_models.models.yolov11_det
YOLOv8-Detection | qai_hub_models.models.yolov8_det
Yolo-X | qai_hub_models.models.yolox
Yolo-v3 | qai_hub_models.models.yolov3
Yolo-v5 | qai_hub_models.models.yolov5
Yolo-v6 | qai_hub_models.models.yolov6
Yolo-v7 | qai_hub_models.models.yolov7
Pose Estimation
Facial-Landmark-Detection | qai_hub_models.models.facemap_3dmm
HRNetPose | qai_hub_models.models.hrnet_pose
LiteHRNet | qai_hub_models.models.litehrnet
MediaPipe-Pose-Estimation | qai_hub_models.models.mediapipe_pose
Movenet | qai_hub_models.models.movenet
Posenet-Mobilenet | qai_hub_models.models.posenet_mobilenet
RTMPose-Body2d | qai_hub_models.models.rtmpose_body2d
Depth Estimation
Depth-Anything | qai_hub_models.models.depth_anything
Depth-Anything-V2 | qai_hub_models.models.depth_anything_v2
Midas-V2 | qai_hub_models.models.midas

Multimodal

Model | README
EasyOCR | qai_hub_models.models.easyocr
Nomic-Embed-Text | qai_hub_models.models.nomic_embed_text
OpenAI-Clip | qai_hub_models.models.openai_clip
TrOCR | qai_hub_models.models.trocr

Audio

Model | README
Speech Recognition
HuggingFace-WavLM-Base-Plus | qai_hub_models.models.huggingface_wavlm_base_plus
Whisper-Base | qai_hub_models.models.whisper_base
Whisper-Large-V3-Turbo | qai_hub_models.models.whisper_large_v3_turbo
Whisper-Small | qai_hub_models.models.whisper_small
Whisper-Tiny | qai_hub_models.models.whisper_tiny
Audio Classification
YamNet | qai_hub_models.models.yamnet

Generative AI

Model | README
Image Generation
ControlNet-Canny | qai_hub_models.models.controlnet_canny
Stable-Diffusion-v1.5 | qai_hub_models.models.stable_diffusion_v1_5
Stable-Diffusion-v2.1 | qai_hub_models.models.stable_diffusion_v2_1
Text Generation
ALLaM-7B | qai_hub_models.models.allam_7b
Baichuan2-7B | qai_hub_models.models.baichuan2_7b
Falcon3-7B-Instruct | qai_hub_models.models.falcon_v3_7b_instruct
IBM-Granite-v3.1-8B-Instruct | qai_hub_models.models.ibm_granite_v3_1_8b_instruct
IndusQ-1.1B | qai_hub_models.models.indus_1b
JAIS-6p7b-Chat | qai_hub_models.models.jais_6p7b_chat
Llama-SEA-LION-v3.5-8B-R | qai_hub_models.models.llama_v3_1_sea_lion_3_5_8b_r
Llama-v2-7B-Chat | qai_hub_models.models.llama_v2_7b_chat
Llama-v3-8B-Instruct | qai_hub_models.models.llama_v3_8b_instruct
Llama-v3.1-8B-Instruct | qai_hub_models.models.llama_v3_1_8b_instruct
Llama-v3.2-1B-Instruct | qai_hub_models.models.llama_v3_2_1b_instruct
Llama-v3.2-3B-Instruct | qai_hub_models.models.llama_v3_2_3b_instruct
Llama3-TAIDE-LX-8B-Chat-Alpha1 | qai_hub_models.models.llama_v3_taide_8b_chat
Ministral-3B | qai_hub_models.models.ministral_3b
Mistral-3B | qai_hub_models.models.mistral_3b
Mistral-7B-Instruct-v0.3 | qai_hub_models.models.mistral_7b_instruct_v0_3
PLaMo-1B | qai_hub_models.models.plamo_1b
Phi-3.5-Mini-Instruct | qai_hub_models.models.phi_3_5_mini_instruct
Qwen2-7B-Instruct | qai_hub_models.models.qwen2_7b_instruct
Qwen2.5-7B-Instruct | qai_hub_models.models.qwen2_5_7b_instruct


    Radxa-docs © 2026 by Radxa Computer (Shenzhen) Co.,Ltd. is licensed under CC BY 4.0