CLIP

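Environment Setup

Info

Set up the required environment by following the RKNN installation guide.

Download the example files by following the RKNN Model Zoo guide.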

Model Download

Download the ONNX model files.

X64 Linux PC

cd rknn_model_zoo/examples/clip/model/
bash download_model.sh

Model Conversion

Select the target platform.

rk3588

X64 Linux PC

export TARGET_PLATFORM=rk3588

rk356x

X64 Linux PC

export TARGET_PLATFORM=rk356x

rk3576

X64 Linux PC

export TARGET_PLATFORM=rk3576
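
Every later command reads ${TARGET_PLATFORM}, so a typo here tends to surface only later as a conversion or build error. The following is a minimal sketch of a guard, assuming the three platform names listed in this guide are the only valid values. check_platform is a hypothetical helper written for illustration; it is not part of RKNN Model Zoo.

```shell
# Hypothetical helper: accept only the platform names listed in this guide.
check_platform() {
    case "$1" in
        rk3588|rk356x|rk3576) return 0 ;;
        *) echo "unsupported TARGET_PLATFORM: '$1'" >&2; return 1 ;;
    esac
}

# Example: set the platform, then confirm it before continuing.
export TARGET_PLATFORM=rk3588
check_platform "${TARGET_PLATFORM}" && echo "TARGET_PLATFORM=${TARGET_PLATFORM}"
```

If the check fails, export the variable again with one of the listed names before running the conversion step.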

Convert the ONNX models to RKNN models.

X64 Linux PC

cd ../python/images/
python convert.py ../../model/clip_images.onnx ${TARGET_PLATFORM}
cd ../text/
python convert.py ../../model/clip_text.onnx ${TARGET_PLATFORM}
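
Before building, it can help to confirm that both conversions produced a model file. The sketch below assumes the converted models are written to the example's model/ directory as clip_images.rknn and clip_text.rknn. That location is inferred from the paths used when running the demo later in this guide, not confirmed by convert.py itself. check_models is a hypothetical helper.

```shell
# Hypothetical helper: report whether both converted models exist
# (and are non-empty) in the given directory.
check_models() {
    model_dir="$1"
    missing=0
    for name in clip_images.rknn clip_text.rknn; do
        if [ -s "${model_dir}/${name}" ]; then
            echo "found:   ${model_dir}/${name}"
        else
            echo "missing: ${model_dir}/${name}" >&2
            missing=1
        fi
    done
    return "${missing}"
}

# Assumed output location, relative to python/text/; adjust if
# convert.py wrote the files elsewhere.
check_models ../../model || echo "re-run convert.py for the missing model(s)"
```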

C API

Build the Example

Change to the rknn_model_zoo directory and run the build-linux.sh build script.

X64 Linux PC

cd ../../../..
bash build-linux.sh -t ${TARGET_PLATFORM} -a aarch64 -d clip

File Sync

Then push the rknn_clip_demo directory, located under the install directory generated by the build, to the board.

X64 Linux PC

cd install/${TARGET_PLATFORM}_linux_aarch64/
scp -r rknn_clip_demo/ user@your_device_ip:target_directory

Run the Example

Add the runtime libraries to the LD_LIBRARY_PATH environment variable.

Device

cd rknn_clip_demo/
export LD_LIBRARY_PATH=./lib
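
The command above sets a relative path and replaces any existing LD_LIBRARY_PATH value, so it only works while the shell stays in rknn_clip_demo/. The following sketch is one alternative, assuming it is run from inside rknn_clip_demo/. It uses an absolute path and keeps any entries already present. DEMO_LIB_DIR is a hypothetical variable name chosen for illustration.

```shell
# Run from inside rknn_clip_demo/.
# Resolve lib/ to an absolute path so the setting survives a later cd.
DEMO_LIB_DIR="$(pwd)/lib"

# Prepend the demo libraries; append the previous value only if it was set.
export LD_LIBRARY_PATH="${DEMO_LIB_DIR}${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"
echo "${LD_LIBRARY_PATH}"
```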

Run the example.

Device

./rknn_clip_demo ./model/clip_images.rknn ./model/dog_224x224.jpg ./model/clip_text.rknn ./model/text.txt

$ ./rknn_clip_demo ./model/clip_images.rknn ./model/dog_224x224.jpg ./model/clip_text.rknn ./model/text.txt
--> init clip image model
model input num: 1, output num: 1
input tensors:
  index=0, name=pixel_values, n_dims=4, dims=[1, 224, 224, 3], n_elems=150528, size=301056, fmt=NHWC, type=FP16, qnt_type=AFFINE, zp=0, scale=1.000000
output tensors:
  index=0, name=image_embeds, n_dims=2, dims=[1, 512], n_elems=512, size=1024, fmt=UNDEFINED, type=FP16, qnt_type=AFFINE, zp=0, scale=1.000000
model is NHWC input fmt
input image height=224, input image width=224, input image channel=3
--> init clip text model
model input num: 1, output num: 1
input tensors:
  index=0, name=input_ids, n_dims=2, dims=[1, 20], n_elems=20, size=160, fmt=UNDEFINED, type=INT64, qnt_type=AFFINE, zp=0, scale=1.000000
output tensors:
  index=0, name=text_embeds, n_dims=2, dims=[1, 512], n_elems=512, size=1024, fmt=UNDEFINED, type=FP16, qnt_type=AFFINE, zp=0, scale=1.000000
model is UNDEFINED input fmt
input text batch size=1, input sequence length=20
origin size=224x224 crop size=224x224
input image: 224 x 224, subsampling: 4:2:0, colorspace: YCbCr, orientation: 1
num_lines=2
--> inference clip image model
rga_api version 1.10.1_[0]
rknn_run
--> inference clip text model
rknn_run
rknn_run
--> rknn clip demo result
images: ./model/dog_224x224.jpg
text  : a photo of a dog
score : 0.989

Test image

Python API

Activate the Virtual Environment

Device

conda activate rknn

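Run the Example

Push the relevant files to the board, then run the command below. TARGET_PLATFORM was exported on the PC, so set it on the board as well (for example, rk3588) or pass the platform name directly.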

Device

python clip.py --img_model ../model/clip_images.rknn --text_model ../model/clip_text.rknn --target ${TARGET_PLATFORM}

$ python clip.py --img_model ../model/clip_images.rknn --text_model ../model/clip_text.rknn --target rk3588
/home/radxa/miniforge3/envs/rknn/lib/python3.12/site-packages/rknn/api/rknn.py:51: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81.
  self.rknn_base = RKNNBase(cur_path, verbose)
I rknn-toolkit2 version: 2.3.2
I target set by user is: rk3588
W inference: The 'data_format' is not set, and its default value is 'nhwc'!
W inference: The 'data_format' is not set, and its default value is 'nhwc'!
I rknn-toolkit2 version: 2.3.2
I target set by user is: rk3588
W inference: The 'data_format' is not set, and its default value is 'nhwc'!
--> rknn clip demo result:
images: ../model/dog_224x224.jpg
text  : a photo of dog
score : 0.990
