QAI Hub Models
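Qualcomm® AI Hub Models (QAI-Hub-Models) builds on the cloud service provided by QAI-Hub. From the command line, it can quantize, compile, run inference on, profile, and download any model in the model list using cloud-hosted devices.
Usage
Install qai_hub_models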
pip3 install qai_hub_models
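Configure the API Token
First register an account on Qualcomm® AI Hub and log in to obtain your API Token, then substitute it for API_TOKEN in the command below.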
qai-hub configure --api_token API_TOKEN
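Before submitting jobs, it can help to confirm that the token was actually saved. The sketch below is a minimal check, not part of QAI Hub itself. It assumes that `qai-hub configure` writes the token to `~/.qai_hub/client.ini` under an `[api]` section with an `api_token` key; both the path and the layout are assumptions, so adjust them if your installation differs. The helper name `has_api_token` is hypothetical.

```python
import configparser
from pathlib import Path

# Assumed default location written by `qai-hub configure`; adjust if yours differs.
DEFAULT_CONFIG = Path.home() / ".qai_hub" / "client.ini"


def has_api_token(config_path: Path = DEFAULT_CONFIG) -> bool:
    """Return True if config_path holds a non-empty api_token in its [api] section."""
    # Disable interpolation so a token containing '%' is read literally.
    parser = configparser.ConfigParser(interpolation=None)
    # ConfigParser.read skips missing files and returns the list of files it parsed.
    if not parser.read(config_path):
        return False
    return bool(parser.get("api", "api_token", fallback="").strip())


if __name__ == "__main__":
    print("API token configured:", has_api_token())
```

If the check prints False, re-run the configure command above with a valid token.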
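Examples
QAI Hub Models model compilation example
This example compiles RealESRGAN_x4plus into a Context-Binary model that can run inference on the NPU, with a 128x128 input size and w8a8 quantization, as a brief introduction to compiling models with Qualcomm® AI Hub Models.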
Set PRODUCT_CHIP for your target chip (run only the line that matches):
- QCS6490: export PRODUCT_CHIP=qualcomm-qcs6490-proxy
- SC8280XP: export PRODUCT_CHIP=qualcomm-sc8280xp-proxy
- QCS9075: export PRODUCT_CHIP=qualcomm-qcs9075-proxy
python3 -m qai_hub_models.models.real_esrgan_x4plus.export --chipset ${PRODUCT_CHIP} --target-runtime qnn_context_binary --height 128 --width 128 --quantize w8a8 --num-calibration-samples 10
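--chipset: the target chip the model will run on
--target-runtime: the target runtime
--height: input height of the target model
--width: input width of the target model
--quantize: the quantization scheme
--num-calibration-samples: the number of images in the quantization calibration set
For the full usage of qai_hub_models.models.real_esrgan_x4plus.export, run it with --help.
All information and logs for the compilation can be viewed under JOBS in Qualcomm® AI Hub.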
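When the same export has to be repeated for several chips, assembling the command in a script avoids retyping the flags. The following is a minimal sketch that only builds and prints the command shown above. The chipset strings and flags are taken from this guide; the helper name `build_export_command` and its parameters are hypothetical.

```python
import shlex

# Chipset names taken from the PRODUCT_CHIP values in this guide.
CHIPSETS = {
    "QCS6490": "qualcomm-qcs6490-proxy",
    "SC8280XP": "qualcomm-sc8280xp-proxy",
    "QCS9075": "qualcomm-qcs9075-proxy",
}


def build_export_command(chip, model="real_esrgan_x4plus", size=128,
                         quantize="w8a8", calibration_samples=10):
    """Return the export command from this guide as an argument list."""
    return [
        "python3", "-m", f"qai_hub_models.models.{model}.export",
        "--chipset", CHIPSETS[chip],
        "--target-runtime", "qnn_context_binary",
        "--height", str(size),
        "--width", str(size),
        "--quantize", quantize,
        "--num-calibration-samples", str(calibration_samples),
    ]


if __name__ == "__main__":
    # Print only; pass the list to subprocess.run(...) to actually submit the jobs.
    print(shlex.join(build_export_command("QCS6490")))
```

Returning a list rather than a single string keeps the arguments safe to hand to `subprocess.run` without shell quoting issues.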
(.venv) (base) radxa@vms-max:/mnt/sda1/qualcomm/qai-hub/ai_hub_model$ python3 -m qai_hub_models.models.real_esrgan_x4plus.export --chipset "qualcomm-qcs6490-proxy" --target-runtime qnn_context_binary --height 128 --width 128 --quantize w8a8 --num-calibration-samples 10
Quantizing model real_esrgan_x4plus.
Uploading tmplaenxfnu.pt
100%|██████████████████████████████████████████████████████████████████████████████████████████████████| 64.8M/64.8M [00:06<00:00, 11.1MB/s]
Scheduled compile job (jgon7zqkp) successfully. To see the status and results:
https://app.aihub.qualcomm.com/jobs/jgon7zqkp/
Loading 10 calibration samples.
Waiting for compile job (jgon7zqkp) completion. Type Ctrl+C to stop waiting at any time.
✅ SUCCESS
Uploading dataset: 700kB [00:01, 522kB/s]
Scheduled quantize job (jpew0o3vp) successfully. To see the status and results:
https://app.aihub.qualcomm.com/jobs/jpew0o3vp/
Waiting for quantize job (jpew0o3vp) completion. Type Ctrl+C to stop waiting at any time.
✅ SUCCESS
Optimizing model real_esrgan_x4plus to run on-device
Scheduled compile job (jgdqynlz5) successfully. To see the status and results:
https://app.aihub.qualcomm.com/jobs/jgdqynlz5/
Profiling model real_esrgan_x4plus on a hosted device.
Waiting for compile job (jgdqynlz5) completion. Type Ctrl+C to stop waiting at any time.
✅ SUCCESS
Scheduled profile job (jp4d6n01p) successfully. To see the status and results:
https://app.aihub.qualcomm.com/jobs/jp4d6n01p/
Running inference for real_esrgan_x4plus on a hosted device with example inputs.
Downloading data at https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/super_resolution/v2/super_resolution_input.jpg to /home/zifeng/.qaihm/models/super_resolution/v2/super_resolution_input.jpg
100%|██████████████████████████████████████████████████████████████████████████████████████████████████| 16.5k/16.5k [00:00<00:00, 92.2MB/s]
Done
Uploading dataset: 104kB [00:01, 100kB/s]
Scheduled inference job (jpx6892lp) successfully. To see the status and results:
https://app.aihub.qualcomm.com/jobs/jpx6892lp/
real_esrgan_x4plus.bin: 100%|██████████████████████████████████████████████████████████████████████████| 21.5M/21.5M [00:02<00:00, 7.97MB/s]
Downloaded model to /mnt/sda1/qualcomm/qai-hub/ai_hub_model/build/real_esrgan_x4plus/real_esrgan_x4plus.bin
Waiting for profile job (jp4d6n01p) completion. Type Ctrl+C to stop waiting at any time.
✅ SUCCESS
------------------------------------------------------------
Performance results on-device for Real_Esrgan_X4Plus.
------------------------------------------------------------
Device : QCS6490 (Proxy) (ANDROID 12)
Runtime : QNN_CONTEXT_BINARY
Estimated inference time (ms) : 171.8
Estimated peak memory usage (MB): [0, 13]
Total # Ops : 1027
Compute Unit(s) : npu (1027 ops) gpu (0 ops) cpu (0 ops)
------------------------------------------------------------
More details: https://app.aihub.qualcomm.com/jobs/jp4d6n01p/
tmpz9r34tur.h5: 100%|███████████████████████████████████████████████████████████████████████████████████| 1.05M/1.05M [00:03<00:00, 357kB/s]
Comparing on-device vs. local-cpu inference for Real_Esrgan_X4Plus.
+----------------+------------------+--------+
| output_name | shape | psnr |
+================+==================+========+
| upscaled_image | (1, 512, 512, 3) | 24.49 |
+----------------+------------------+--------+
- psnr: Peak Signal-to-Noise Ratio (PSNR). >30 dB is typically considered good.
More details: https://app.aihub.qualcomm.com/jobs/jpx6892lp/
Run compiled model on a hosted device on sample data using:
python /mnt/sda1/qualcomm/qai-hub/.venv/lib/python3.10/site-packages/qai_hub_models/models/real_esrgan_x4plus/demo.py --eval-mode on-device --hub-model-id mm63gld2m --chipset qualcomm-qcs6490-proxy
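The psnr column in the comparison table of the log above reports the Peak Signal-to-Noise Ratio between the on-device output and the local CPU output. As a reference for how such a figure is obtained, here is a sketch of the standard definition, PSNR = 10 * log10(peak^2 / MSE). The log does not state the peak value or preprocessing the tool uses, so the default `peak=1.0` (values normalized to [0, 1]) is an assumption and this is not necessarily the tool's exact computation.

```python
import math


def psnr(reference, test, peak=1.0):
    """PSNR in dB between two equal-length sequences of sample values."""
    if len(reference) != len(test) or not reference:
        raise ValueError("inputs must be non-empty and the same length")
    # Mean squared error over all samples.
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals have unbounded PSNR
    return 10 * math.log10(peak ** 2 / mse)


if __name__ == "__main__":
    # Every sample is off by 0.1, so the MSE is about 0.01.
    print(f"{psnr([0.0, 0.0], [0.1, 0.1]):.2f} dB")
```

A higher value means the two outputs are closer; by the rule of thumb printed in the log, the 24.49 dB reported for this run falls below the 30 dB mark.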
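QAI Hub Models model inference demo
Following the hint printed at the end of the model compilation example, you can run the compiled model on a cloud-hosted device and view the inference results.
Set the hub-model-id parameter to the hub-model-id printed at the end of your own compilation output.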

The hub-model-id value appears in the final command of the export output (mm63gld2m in the log above).
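If the export output has been saved to a file or captured in a script, the hub-model-id can be pulled out with a regular expression instead of being copied by hand. This is a minimal sketch; it assumes the output ends with a suggested demo command containing `--hub-model-id <value>`, as in the log above. The function name `extract_hub_model_id` is hypothetical.

```python
import re

# Matches the value that follows --hub-model-id in the suggested demo command.
HUB_MODEL_ID_RE = re.compile(r"--hub-model-id\s+(\S+)")


def extract_hub_model_id(export_output: str):
    """Return the last --hub-model-id value in the export output, or None."""
    matches = HUB_MODEL_ID_RE.findall(export_output)
    return matches[-1] if matches else None


if __name__ == "__main__":
    sample = (
        "Run compiled model on a hosted device on sample data using:\n"
        "python demo.py --eval-mode on-device --hub-model-id mm63gld2m "
        "--chipset qualcomm-qcs6490-proxy\n"
    )
    print(extract_hub_model_id(sample))
```

Taking the last match means the most recent suggestion wins if the output of several runs is concatenated.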
python3 -m qai_hub_models.models.real_esrgan_x4plus.demo --eval-mode on-device --hub-model-id mm63gld2m --chipset ${PRODUCT_CHIP}
(.venv) (gen_py3.10) radxa@vms-max:~/Job/git_clone/ai-hub-models$ python3 -m qai_hub_models.models.real_esrgan_x4plus.demo --eval-mode on-device --hub-model-id mm63gld2m --chipset qualcomm-qcs6490-proxy
/mnt/sda1/git_clone/ai-hub-models/.venv/lib/python3.10/site-packages/qai_hub_models/utils/onnx_torch_wrapper.py:22: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81.
import pkg_resources
Model Loaded
Uploading dataset: 104kB [00:02, 52.5kB/s]
Scheduled inference job (jgj29j8e5) successfully. To see the status and results:
https://app.aihub.qualcomm.com/jobs/jgj29j8e5/
Waiting for inference job (jgj29j8e5) completion. Type Ctrl+C to stop waiting at any time.
✅ SUCCESS
tmpe9b4q_c2.h5: 100%|██████████████████████████████████████████████████████████████████████████████████| 1.03M/1.03M [00:00<00:00, 2.04MB/s]
Displaying original image
Displaying upscaled image

The left image is the result; the right image is the input.
Model list
Computer Vision
Multimodal
| Model | README |
|---|---|
| EasyOCR | qai_hub_models.models.easyocr |
| Nomic-Embed-Text | qai_hub_models.models.nomic_embed_text |
| OpenAI-Clip | qai_hub_models.models.openai_clip |
| TrOCR | qai_hub_models.models.trocr |
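The module paths in the table can be turned into commands by following the pattern used for real_esrgan_x4plus earlier in this guide. The sketch below prints a `--help` command for each listed model. It assumes every listed module provides an `export` submodule like real_esrgan_x4plus does, which this guide does not confirm for these models; the helper name `export_help_command` is hypothetical.

```python
# Module paths copied from the table above.
MODULES = [
    "qai_hub_models.models.easyocr",
    "qai_hub_models.models.nomic_embed_text",
    "qai_hub_models.models.openai_clip",
    "qai_hub_models.models.trocr",
]


def export_help_command(module: str) -> str:
    """Build the command that prints the export options for one model module."""
    # Assumption: the module exposes an `export` submodule, as real_esrgan_x4plus does.
    return f"python3 -m {module}.export --help"


if __name__ == "__main__":
    for module in MODULES:
        print(export_help_command(module))
```

Checking `--help` first is useful because the available flags (for example input size or quantization options) may differ from model to model.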