QAI Hub Models

Qualcomm® AI Hub Models (QAI-Hub-Models) builds on the cloud services provided by QAI-Hub. From the command line, it can quantize, compile, run inference on, analyze, and download the models in the model list, using cloud-hosted devices.

Usage Guide

Install qai_hub_models

Device
pip3 install qai_hub_models
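
To confirm that the installation succeeded before moving on, a short check along these lines may help. It is a minimal sketch: the helper name `is_installed` is hypothetical, and it assumes that the `qai_hub` client package is pulled in alongside `qai_hub_models`.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located on the Python path."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Assumption: installing qai_hub_models also installs the qai_hub client.
    for name in ("qai_hub", "qai_hub_models"):
        status = "found" if is_installed(name) else "missing"
        print(f"{name}: {status}")
```

The check only locates the packages; it does not import them, so it is quick even for large installs.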

Configure API Token

tip

First, register and log in to Qualcomm® AI Hub to obtain your API token, then substitute it for API_TOKEN in the command below.

Device
qai-hub configure --api_token API_TOKEN
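
After configuring the token, it can be useful to check that it was actually stored. The sketch below assumes that `qai-hub configure` writes its settings to `~/.qai_hub/client.ini` with an `[api]` section containing an `api_token` key; that location and layout are assumptions rather than something this guide states, and the helper name `has_api_token` is hypothetical.

```python
import configparser
from pathlib import Path

# Assumption: the client stores its settings here after `qai-hub configure`.
DEFAULT_CONFIG = Path.home() / ".qai_hub" / "client.ini"


def has_api_token(config_path: Path = DEFAULT_CONFIG) -> bool:
    """Return True if the config file holds a non-empty [api] api_token."""
    # Disable interpolation so a '%' inside a token is read literally.
    parser = configparser.ConfigParser(interpolation=None)
    # read() silently skips files that do not exist.
    parser.read(config_path)
    token = parser.get("api", "api_token", fallback="")
    return bool(token.strip())


if __name__ == "__main__":
    print("API token configured:", has_api_token())
```

If this prints `False`, rerun the configure command above with your token.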

Usage Examples

QAI Hub Models Compilation Example

This example compiles Real-ESRGAN-x4plus into the Context-Binary model format with a 128x128 input size and w8a8 quantization for NPU inference. It serves as a short introduction to model compilation with Qualcomm® AI Hub Models.

Device
export PRODUCT_CHIP=qualcomm-sc8280xp-proxy
Device
python3 -m qai_hub_models.models.real_esrgan_x4plus.export --chipset ${PRODUCT_CHIP} --target-runtime qnn_context_binary  --height 128 --width 128 --quantize w8a8 --num-calibration-samples 10

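--chipset: Specifies the target chipset (here taken from the PRODUCT_CHIP environment variable).

--target-runtime: Specifies the target runtime (qnn_context_binary in this example).

--height: Input height for the target model, in pixels.

--width: Input width for the target model, in pixels.

--quantize: Specifies the quantization method (w8a8 means 8-bit weights and 8-bit activations).

--num-calibration-samples: Specifies the number of calibration images used for quantization.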
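
When the same export has to be repeated for several chipsets or input sizes, the flags above can be assembled programmatically. The following is a minimal sketch that only builds the command line shown in this example; the function name `build_export_command` and its defaults are hypothetical, and it does not check that a given value is accepted by the exporter.

```python
import shlex


def build_export_command(
    model_id: str,
    chipset: str,
    target_runtime: str = "qnn_context_binary",
    height: int = 128,
    width: int = 128,
    quantize: str = "w8a8",
    num_calibration_samples: int = 10,
) -> list:
    """Return the export command as an argument list (safe for subprocess.run)."""
    return [
        "python3", "-m", f"qai_hub_models.models.{model_id}.export",
        "--chipset", chipset,
        "--target-runtime", target_runtime,
        "--height", str(height),
        "--width", str(width),
        "--quantize", quantize,
        "--num-calibration-samples", str(num_calibration_samples),
    ]


if __name__ == "__main__":
    cmd = build_export_command("real_esrgan_x4plus", "qualcomm-sc8280xp-proxy")
    # Print a copy-pasteable shell command; pass `cmd` to subprocess.run to execute it.
    print(shlex.join(cmd))
```

Keeping the command as a list rather than a single string avoids shell quoting problems if it is later passed to `subprocess.run`.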

tip

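For detailed usage of qai_hub_models.models.real_esrgan_x4plus.export, use the --help flag.

All model compilation information and logs can be viewed in the JOBS section of Qualcomm® AI Hub. Note that the sample output below was captured with --chipset "qualcomm-qcs6490-proxy" rather than the PRODUCT_CHIP value set above.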
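
Each job submitted by the export script is announced on a line of the form "Scheduled compile job (ID) successfully", followed by its URL. A small parser like the one below can collect those jobs from saved output so they are easy to look up in the JOBS section. It is a sketch based only on the message format visible in the sample output; the function name `find_jobs` is hypothetical and the wording of the messages may differ between versions.

```python
import re

# Matches lines such as: "Scheduled compile job (jgon7zqkp) successfully."
_JOB_LINE = re.compile(r"Scheduled (\w+) job \((\w+)\) successfully")


def find_jobs(export_output: str) -> list:
    """Return (job_type, job_id) pairs in the order they were scheduled."""
    return _JOB_LINE.findall(export_output)


def job_url(job_id: str) -> str:
    """Build the job page URL, following the pattern printed in the output."""
    return f"https://app.aihub.qualcomm.com/jobs/{job_id}/"


if __name__ == "__main__":
    sample = (
        "Scheduled compile job (jgon7zqkp) successfully. To see the status and results:\n"
        "Scheduled quantize job (jpew0o3vp) successfully. To see the status and results:\n"
    )
    for job_type, job_id in find_jobs(sample):
        print(f"{job_type:10s} {job_url(job_id)}")
```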

(.venv) (base) radxa@vms-max:/mnt/sda1/qualcomm/qai-hub/ai_hub_model$ python3 -m qai_hub_models.models.real_esrgan_x4plus.export --chipset "qualcomm-qcs6490-proxy" --target-runtime qnn_context_binary  --height 128 --width 128 --quantize w8a8 --num-calibration-samples 10
Quantizing model real_esrgan_x4plus.
Uploading tmplaenxfnu.pt
100%|██████████████████████████████████████████████████████████████████████████████████████████████████| 64.8M/64.8M [00:06<00:00, 11.1MB/s]
Scheduled compile job (jgon7zqkp) successfully. To see the status and results:
https://app.aihub.qualcomm.com/jobs/jgon7zqkp/

Loading 10 calibration samples.
Waiting for compile job (jgon7zqkp) completion. Type Ctrl+C to stop waiting at any time.
✅ SUCCESS
Uploading dataset: 700kB [00:01, 522kB/s]
Scheduled quantize job (jpew0o3vp) successfully. To see the status and results:
https://app.aihub.qualcomm.com/jobs/jpew0o3vp/

Waiting for quantize job (jpew0o3vp) completion. Type Ctrl+C to stop waiting at any time.
✅ SUCCESS
Optimizing model real_esrgan_x4plus to run on-device
Scheduled compile job (jgdqynlz5) successfully. To see the status and results:
https://app.aihub.qualcomm.com/jobs/jgdqynlz5/

Profiling model real_esrgan_x4plus on a hosted device.
Waiting for compile job (jgdqynlz5) completion. Type Ctrl+C to stop waiting at any time.
✅ SUCCESS
Scheduled profile job (jp4d6n01p) successfully. To see the status and results:
https://app.aihub.qualcomm.com/jobs/jp4d6n01p/

Running inference for real_esrgan_x4plus on a hosted device with example inputs.
Downloading data at https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/super_resolution/v2/super_resolution_input.jpg to /home/zifeng/.qaihm/models/super_resolution/v2/super_resolution_input.jpg
100%|██████████████████████████████████████████████████████████████████████████████████████████████████| 16.5k/16.5k [00:00<00:00, 92.2MB/s]
Done
Uploading dataset: 104kB [00:01, 100kB/s]
Scheduled inference job (jpx6892lp) successfully. To see the status and results:
https://app.aihub.qualcomm.com/jobs/jpx6892lp/

real_esrgan_x4plus.bin: 100%|██████████████████████████████████████████████████████████████████████████| 21.5M/21.5M [00:02<00:00, 7.97MB/s]
Downloaded model to /mnt/sda1/qualcomm/qai-hub/ai_hub_model/build/real_esrgan_x4plus/real_esrgan_x4plus.bin
Waiting for profile job (jp4d6n01p) completion. Type Ctrl+C to stop waiting at any time.
✅ SUCCESS

------------------------------------------------------------
Performance results on-device for Real_Esrgan_X4Plus.
------------------------------------------------------------
Device : QCS6490 (Proxy) (ANDROID 12)
Runtime : QNN_CONTEXT_BINARY
Estimated inference time (ms) : 171.8
Estimated peak memory usage (MB): [0, 13]
Total # Ops : 1027
Compute Unit(s) : npu (1027 ops) gpu (0 ops) cpu (0 ops)
------------------------------------------------------------
More details: https://app.aihub.qualcomm.com/jobs/jp4d6n01p/

tmpz9r34tur.h5: 100%|███████████████████████████████████████████████████████████████████████████████████| 1.05M/1.05M [00:03<00:00, 357kB/s]

Comparing on-device vs. local-cpu inference for Real_Esrgan_X4Plus.
+----------------+------------------+--------+
| output_name | shape | psnr |
+================+==================+========+
| upscaled_image | (1, 512, 512, 3) | 24.49 |
+----------------+------------------+--------+

- psnr: Peak Signal-to-Noise Ratio (PSNR). >30 dB is typically considered good.

More details: https://app.aihub.qualcomm.com/jobs/jpx6892lp/

Run compiled model on a hosted device on sample data using:
python /mnt/sda1/qualcomm/qai-hub/.venv/lib/python3.10/site-packages/qai_hub_models/models/real_esrgan_x4plus/demo.py --eval-mode on-device --hub-model-id mm63gld2m --chipset qualcomm-qcs6490-proxy
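
The comparison table in the output above reports PSNR between the on-device and local CPU results (24.49 dB in this run, below the >30 dB rule of thumb that the output itself prints). As a reminder of what the metric measures, here is a minimal sketch of the standard definition, PSNR = 10 * log10(MAX^2 / MSE). The data range and any normalization used by the tool are not stated in this guide, so `max_value` is an assumption, and the function `psnr` is illustrative rather than the tool's own implementation.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[float], test: Sequence[float], max_value: float = 1.0) -> float:
    """Peak Signal-to-Noise Ratio in dB between two equally sized value sequences."""
    if len(reference) != len(test) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all values.
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical outputs
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    # A uniform error of 0.1 on a [0, 1] scale gives an MSE of 0.01.
    print(round(psnr([0.0, 0.0, 0.0, 0.0], [0.1, 0.1, 0.1, 0.1]), 2))
```

A higher value means the quantized on-device output is closer to the floating-point reference.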

QAI Hub Models Inference Demo

Following the command printed at the end of the model compilation example, you can run inference with the compiled model on a cloud-hosted device and view the results.

tip

Set the --hub-model-id parameter to the value printed at the end of your own model compilation output.

Location of the hub-model-id parameter: it appears in the final command printed by the compilation output (mm63gld2m in the run above).
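
If the compilation output was saved to a file, the model ID can be pulled out automatically instead of being copied by hand. This is a minimal sketch that looks for the `--hub-model-id` argument in the suggested demo command; the function name `last_hub_model_id` is hypothetical.

```python
import re
from typing import Optional

# Accepts both "--hub-model-id VALUE" and "--hub-model-id=VALUE".
_HUB_MODEL_ID = re.compile(r"--hub-model-id[ =](\S+)")


def last_hub_model_id(export_output: str) -> Optional[str]:
    """Return the last --hub-model-id value found in the text, or None."""
    matches = _HUB_MODEL_ID.findall(export_output)
    return matches[-1] if matches else None


if __name__ == "__main__":
    line = (
        "python demo.py --eval-mode on-device "
        "--hub-model-id mm63gld2m --chipset qualcomm-qcs6490-proxy"
    )
    print(last_hub_model_id(line))
```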

Device
python3 -m qai_hub_models.models.real_esrgan_x4plus.demo --eval-mode on-device --hub-model-id mm63gld2m --chipset ${PRODUCT_CHIP}
(.venv) (gen_py3.10) radxa@vms-max:~/Job/git_clone/ai-hub-models$ python3 -m qai_hub_models.models.real_esrgan_x4plus.demo --eval-mode on-device --hub-model-id mm63gld2m --chipset qualcomm-qcs6490-proxy
/mnt/sda1/git_clone/ai-hub-models/.venv/lib/python3.10/site-packages/qai_hub_models/utils/onnx_torch_wrapper.py:22: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81.
import pkg_resources
Model Loaded
Uploading dataset: 104kB [00:02, 52.5kB/s]
Scheduled inference job (jgj29j8e5) successfully. To see the status and results:
https://app.aihub.qualcomm.com/jobs/jgj29j8e5/

Waiting for inference job (jgj29j8e5) completion. Type Ctrl+C to stop waiting at any time.
✅ SUCCESS
tmpe9b4q_c2.h5: 100%|██████████████████████████████████████████████████████████████████████████████████| 1.03M/1.03M [00:00<00:00, 2.04MB/s]
Displaying original image
Displaying upscaled image

In the displayed images, the left side shows the upscaled result and the right side shows the input image.

Model List

Computer Vision

Model | README
Image Classification
Beit | qai_hub_models.models.beit
ConvNext-Base | qai_hub_models.models.convnext_base
ConvNext-Tiny | qai_hub_models.models.convnext_tiny
DLA-102-X | qai_hub_models.models.dla102x
DenseNet-121 | qai_hub_models.models.densenet121
EfficientFormer | qai_hub_models.models.efficientformer
EfficientNet-B0 | qai_hub_models.models.efficientnet_b0
EfficientNet-B4 | qai_hub_models.models.efficientnet_b4
EfficientNet-V2-s | qai_hub_models.models.efficientnet_v2_s
EfficientViT-b2-cls | qai_hub_models.models.efficientvit_b2_cls
EfficientViT-l2-cls | qai_hub_models.models.efficientvit_l2_cls
GoogLeNet | qai_hub_models.models.googlenet
Inception-v3 | qai_hub_models.models.inception_v3
LeViT | qai_hub_models.models.levit
MNASNet05 | qai_hub_models.models.mnasnet05
Mobile-VIT | qai_hub_models.models.mobile_vit
MobileNet-v2 | qai_hub_models.models.mobilenet_v2
MobileNet-v3-Large | qai_hub_models.models.mobilenet_v3_large
MobileNet-v3-Small | qai_hub_models.models.mobilenet_v3_small
NASNet | qai_hub_models.models.nasnet
RegNet | qai_hub_models.models.regnet
ResNeXt101 | qai_hub_models.models.resnext101
ResNeXt50 | qai_hub_models.models.resnext50
ResNet101 | qai_hub_models.models.resnet101
ResNet18 | qai_hub_models.models.resnet18
ResNet50 | qai_hub_models.models.resnet50
Sequencer2D | qai_hub_models.models.sequencer2d
Shufflenet-v2 | qai_hub_models.models.shufflenet_v2
SqueezeNet-1.1 | qai_hub_models.models.squeezenet1_1
Swin-Base | qai_hub_models.models.swin_base
Swin-Small | qai_hub_models.models.swin_small
Swin-Tiny | qai_hub_models.models.swin_tiny
VIT | qai_hub_models.models.vit
WideResNet50 | qai_hub_models.models.wideresnet50
Image Editing
AOT-GAN | qai_hub_models.models.aotgan
LaMa-Dilated | qai_hub_models.models.lama_dilated
Image Generation
Simple-Bev | qai_hub_models.models.simple_bev_cam
Super Resolution
ESRGAN | qai_hub_models.models.esrgan
QuickSRNetLarge | qai_hub_models.models.quicksrnetlarge
QuickSRNetMedium | qai_hub_models.models.quicksrnetmedium
QuickSRNetSmall | qai_hub_models.models.quicksrnetsmall
Real-ESRGAN-General-x4v3 | qai_hub_models.models.real_esrgan_general_x4v3
Real-ESRGAN-x4plus | qai_hub_models.models.real_esrgan_x4plus
SESR-M5 | qai_hub_models.models.sesr_m5
XLSR | qai_hub_models.models.xlsr
Semantic Segmentation
BGNet | qai_hub_models.models.bgnet
BiseNet | qai_hub_models.models.bisenet
DDRNet23-Slim | qai_hub_models.models.ddrnet23_slim
DeepLabV3-Plus-MobileNet | qai_hub_models.models.deeplabv3_plus_mobilenet
DeepLabV3-ResNet50 | qai_hub_models.models.deeplabv3_resnet50
DeepLabXception | qai_hub_models.models.deeplab_xception
EfficientViT-l2-seg | qai_hub_models.models.efficientvit_l2_seg
FCN-ResNet50 | qai_hub_models.models.fcn_resnet50
FFNet-122NS-LowRes | qai_hub_models.models.ffnet_122ns_lowres
FFNet-40S | qai_hub_models.models.ffnet_40s
FFNet-54S | qai_hub_models.models.ffnet_54s
FFNet-78S | qai_hub_models.models.ffnet_78s
FFNet-78S-LowRes | qai_hub_models.models.ffnet_78s_lowres
FastSam-S | qai_hub_models.models.fastsam_s
FastSam-X | qai_hub_models.models.fastsam_x
HRNet-W48-OCR | qai_hub_models.models.hrnet_w48_ocr
Mask2Former | qai_hub_models.models.mask2former
MediaPipe-Selfie-Segmentation | qai_hub_models.models.mediapipe_selfie
MobileSam | qai_hub_models.models.mobilesam
PidNet | qai_hub_models.models.pidnet
SINet | qai_hub_models.models.sinet
SalsaNext | qai_hub_models.models.salsanext
Segformer-Base | qai_hub_models.models.segformer_base
Segment-Anything-Model-2 | qai_hub_models.models.sam2
Unet-Segmentation | qai_hub_models.models.unet_segmentation
YOLOv11-Segmentation | qai_hub_models.models.yolov11_seg
YOLOv8-Segmentation | qai_hub_models.models.yolov8_seg
Video Classification
ResNet-2Plus1D | qai_hub_models.models.resnet_2plus1d
ResNet-3D | qai_hub_models.models.resnet_3d
ResNet-Mixed-Convolution | qai_hub_models.models.resnet_mixed
Video-MAE | qai_hub_models.models.video_mae
Video Generation
First-Order-Motion-Model | qai_hub_models.models.fomm
Object Detection
3D-Deep-BOX | qai_hub_models.models.deepbox
Conditional-DETR-ResNet50 | qai_hub_models.models.conditional_detr_resnet50
DETR-ResNet101 | qai_hub_models.models.detr_resnet101
DETR-ResNet101-DC5 | qai_hub_models.models.detr_resnet101_dc5
DETR-ResNet50 | qai_hub_models.models.detr_resnet50
DETR-ResNet50-DC5 | qai_hub_models.models.detr_resnet50_dc5
Facial-Attribute-Detection | qai_hub_models.models.face_attrib_net
Lightweight-Face-Detection | qai_hub_models.models.face_det_lite
MediaPipe-Face-Detection | qai_hub_models.models.mediapipe_face
MediaPipe-Hand-Detection | qai_hub_models.models.mediapipe_hand
PPE-Detection | qai_hub_models.models.gear_guard_net
Person-Foot-Detection | qai_hub_models.models.foot_track_net
RF-DETR | qai_hub_models.models.rf_detr
RTMDet | qai_hub_models.models.rtmdet
YOLOv10-Detection | qai_hub_models.models.yolov10_det
YOLOv11-Detection | qai_hub_models.models.yolov11_det
YOLOv8-Detection | qai_hub_models.models.yolov8_det
Yolo-X | qai_hub_models.models.yolox
Yolo-v3 | qai_hub_models.models.yolov3
Yolo-v5 | qai_hub_models.models.yolov5
Yolo-v6 | qai_hub_models.models.yolov6
Yolo-v7 | qai_hub_models.models.yolov7
Pose Estimation
Facial-Landmark-Detection | qai_hub_models.models.facemap_3dmm
HRNetPose | qai_hub_models.models.hrnet_pose
LiteHRNet | qai_hub_models.models.litehrnet
MediaPipe-Pose-Estimation | qai_hub_models.models.mediapipe_pose
Movenet | qai_hub_models.models.movenet
Posenet-Mobilenet | qai_hub_models.models.posenet_mobilenet
RTMPose-Body2d | qai_hub_models.models.rtmpose_body2d
Depth Estimation
Depth-Anything | qai_hub_models.models.depth_anything
Depth-Anything-V2 | qai_hub_models.models.depth_anything_v2
Midas-V2 | qai_hub_models.models.midas

Multimodal

Model | README
EasyOCR | qai_hub_models.models.easyocr
Nomic-Embed-Text | qai_hub_models.models.nomic_embed_text
OpenAI-Clip | qai_hub_models.models.openai_clip
TrOCR | qai_hub_models.models.trocr

Audio

Model | README
Speech Recognition
HuggingFace-WavLM-Base-Plus | qai_hub_models.models.huggingface_wavlm_base_plus
Whisper-Base | qai_hub_models.models.whisper_base
Whisper-Large-V3-Turbo | qai_hub_models.models.whisper_large_v3_turbo
Whisper-Small | qai_hub_models.models.whisper_small
Whisper-Tiny | qai_hub_models.models.whisper_tiny
Audio Classification
YamNet | qai_hub_models.models.yamnet

Generative AI

Model | README
Image Generation
ControlNet-Canny | qai_hub_models.models.controlnet_canny
Stable-Diffusion-v1.5 | qai_hub_models.models.stable_diffusion_v1_5
Stable-Diffusion-v2.1 | qai_hub_models.models.stable_diffusion_v2_1
Text Generation
ALLaM-7B | qai_hub_models.models.allam_7b
Baichuan2-7B | qai_hub_models.models.baichuan2_7b
Falcon3-7B-Instruct | qai_hub_models.models.falcon_v3_7b_instruct
IBM-Granite-v3.1-8B-Instruct | qai_hub_models.models.ibm_granite_v3_1_8b_instruct
IndusQ-1.1B | qai_hub_models.models.indus_1b
JAIS-6p7b-Chat | qai_hub_models.models.jais_6p7b_chat
Llama-SEA-LION-v3.5-8B-R | qai_hub_models.models.llama_v3_1_sea_lion_3_5_8b_r
Llama-v2-7B-Chat | qai_hub_models.models.llama_v2_7b_chat
Llama-v3-8B-Instruct | qai_hub_models.models.llama_v3_8b_instruct
Llama-v3.1-8B-Instruct | qai_hub_models.models.llama_v3_1_8b_instruct
Llama-v3.2-1B-Instruct | qai_hub_models.models.llama_v3_2_1b_instruct
Llama-v3.2-3B-Instruct | qai_hub_models.models.llama_v3_2_3b_instruct
Llama3-TAIDE-LX-8B-Chat-Alpha1 | qai_hub_models.models.llama_v3_taide_8b_chat
Ministral-3B | qai_hub_models.models.ministral_3b
Mistral-3B | qai_hub_models.models.mistral_3b
Mistral-7B-Instruct-v0.3 | qai_hub_models.models.mistral_7b_instruct_v0_3
PLaMo-1B | qai_hub_models.models.plamo_1b
Phi-3.5-Mini-Instruct | qai_hub_models.models.phi_3_5_mini_instruct
Qwen2-7B-Instruct | qai_hub_models.models.qwen2_7b_instruct
Qwen2.5-7B-Instruct | qai_hub_models.models.qwen2_5_7b_instruct
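
The module paths in the model list follow the same pattern as the Real-ESRGAN-x4plus example, where `.export` and `.demo` are appended to the model's module. The sketch below maps a few model names copied from the list to the module that would be passed to `python3 -m`. It assumes that other models expose `export` and `demo` entry points in the same way, which this guide only shows for Real-ESRGAN-x4plus; the names `MODEL_MODULES` and `entry_point` are hypothetical.

```python
# A few (model name -> module) rows copied from the model list above.
MODEL_MODULES = {
    "Real-ESRGAN-x4plus": "qai_hub_models.models.real_esrgan_x4plus",
    "MobileNet-v2": "qai_hub_models.models.mobilenet_v2",
    "YOLOv8-Detection": "qai_hub_models.models.yolov8_det",
    "Whisper-Tiny": "qai_hub_models.models.whisper_tiny",
}


def entry_point(model_name: str, script: str = "export") -> str:
    """Return the module to run with `python3 -m` for the given model and script."""
    if script not in ("export", "demo"):
        raise ValueError(f"unsupported script: {script!r}")
    if model_name not in MODEL_MODULES:
        raise KeyError(f"unknown model: {model_name!r}")
    return f"{MODEL_MODULES[model_name]}.{script}"


if __name__ == "__main__":
    # Assumption: each model supports --help in the same way as the example above.
    print("python3 -m", entry_point("MobileNet-v2"), "--help")
```

Running the printed command with `--help` is a low-risk way to see which flags a particular model actually accepts before submitting any cloud jobs.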

    Radxa-docs © 2026 by Radxa Computer (Shenzhen) Co.,Ltd. is licensed under CC BY 4.0