CLIP
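Environment Setup
Info
Follow the RKNN Installation guide to set up the required environment.
Follow the RKNN Model Zoo guide to download the example files.
Model Download
Download the ONNX model files.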
X64 Linux PC
cd rknn_model_zoo/examples/clip/model/
bash download_model.sh
Model Conversion
Select the target platform and export only the matching TARGET_PLATFORM value.
- rk3588
- rk356x
- rk3576
X64 Linux PC
export TARGET_PLATFORM=rk3588
X64 Linux PC
export TARGET_PLATFORM=rk356x
X64 Linux PC
export TARGET_PLATFORM=rk3576
Convert the ONNX models to RKNN models.
X64 Linux PC
cd ../python/images/
python convert.py ../../model/clip_images.onnx ${TARGET_PLATFORM}
cd ../text/
python convert.py ../../model/clip_text.onnx ${TARGET_PLATFORM}
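For orientation, the conversion step can be sketched with the RKNN-Toolkit2 Python API. This is a minimal sketch, not the contents of convert.py: it assumes rknn-toolkit2 is installed on the PC, that the models are built without quantization (the FP16 tensor types in the demo log are consistent with this), and that the output is written next to the ONNX file. The helper `default_output_path` and the function `convert` are hypothetical names, and the real script may pass additional options.

```python
from pathlib import Path


def default_output_path(onnx_path: str) -> str:
    """Hypothetical helper: place the .rknn file next to the .onnx file."""
    return str(Path(onnx_path).with_suffix(".rknn"))


def convert(onnx_path: str, target_platform: str) -> str:
    """Convert one ONNX model to RKNN without quantization (assumed setting)."""
    # Imported lazily so the helper above works without rknn-toolkit2 installed.
    from rknn.api import RKNN

    rknn = RKNN(verbose=False)
    try:
        # The target platform must be configured before the model is loaded.
        rknn.config(target_platform=target_platform)
        if rknn.load_onnx(model=onnx_path) != 0:
            raise RuntimeError(f"failed to load {onnx_path}")
        # do_quantization=False keeps floating-point weights (assumption).
        if rknn.build(do_quantization=False) != 0:
            raise RuntimeError("failed to build the RKNN model")
        out_path = default_output_path(onnx_path)
        if rknn.export_rknn(out_path) != 0:
            raise RuntimeError(f"failed to export {out_path}")
        return out_path
    finally:
        rknn.release()
```

Calling `convert("../../model/clip_images.onnx", "rk3588")` would, under these assumptions, write `../../model/clip_images.rknn`.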
C API
Build the Example
Change to the rknn_model_zoo root directory and run the build-linux.sh build script.
X64 Linux PC
cd ../../../..
bash build-linux.sh -t ${TARGET_PLATFORM} -a aarch64 -d clip
File Sync
Then copy the demo directory from the generated install directory to the board.
X64 Linux PC
cd install/${TARGET_PLATFORM}_linux_aarch64/
scp -r rknn_clip_demo/ user@your_device_ip:target_directory
Run the Example
Add the runtime libraries to the LD_LIBRARY_PATH environment variable.
Device
cd rknn_clip_demo/
export LD_LIBRARY_PATH=./lib
Run the example.
Device
./rknn_clip_demo ./model/clip_images.rknn ./model/dog_224x224.jpg ./model/clip_text.rknn ./model/text.txt
$ ./rknn_clip_demo ./model/clip_images.rknn ./model/dog_224x224.jpg ./model/clip_text.rknn ./model/text.txt
--> init clip image model
model input num: 1, output num: 1
input tensors:
index=0, name=pixel_values, n_dims=4, dims=[1, 224, 224, 3], n_elems=150528, size=301056, fmt=NHWC, type=FP16, qnt_type=AFFINE, zp=0, scale=1.000000
output tensors:
index=0, name=image_embeds, n_dims=2, dims=[1, 512], n_elems=512, size=1024, fmt=UNDEFINED, type=FP16, qnt_type=AFFINE, zp=0, scale=1.000000
model is NHWC input fmt
input image height=224, input image width=224, input image channel=3
--> init clip text model
model input num: 1, output num: 1
input tensors:
index=0, name=input_ids, n_dims=2, dims=[1, 20], n_elems=20, size=160, fmt=UNDEFINED, type=INT64, qnt_type=AFFINE, zp=0, scale=1.000000
output tensors:
index=0, name=text_embeds, n_dims=2, dims=[1, 512], n_elems=512, size=1024, fmt=UNDEFINED, type=FP16, qnt_type=AFFINE, zp=0, scale=1.000000
model is UNDEFINED input fmt
input text batch size=1, input sequence length=20
origin size=224x224 crop size=224x224
input image: 224 x 224, subsampling: 4:2:0, colorspace: YCbCr, orientation: 1
num_lines=2
--> inference clip image model
rga_api version 1.10.1_[0]
rknn_run
--> inference clip text model
rknn_run
rknn_run
--> rknn clip demo result
images: ./model/dog_224x224.jpg
text : a photo of a dog
score : 0.989
Test image
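The `n_elems` and `size` fields in the tensor lines of the log above can be cross-checked by hand: `n_elems` is the product of `dims`, and `size` is `n_elems` times the element width in bytes. The short sketch below does that arithmetic; the function name `tensor_size` is illustrative only, and the byte widths are the standard ones for FP16 and INT64.

```python
from math import prod

# Bytes per element for the tensor types that appear in the log.
BYTES_PER_ELEMENT = {"FP16": 2, "INT64": 8}


def tensor_size(dims, dtype):
    """Return (n_elems, size_in_bytes) for a tensor of the given shape and type."""
    n_elems = prod(dims)
    return n_elems, n_elems * BYTES_PER_ELEMENT[dtype]


print(tensor_size([1, 224, 224, 3], "FP16"))  # pixel_values: (150528, 301056)
print(tensor_size([1, 20], "INT64"))          # input_ids: (20, 160)
print(tensor_size([1, 512], "FP16"))          # image_embeds / text_embeds: (512, 1024)
```

These values match the `n_elems` and `size` fields reported for `pixel_values`, `input_ids`, and the two 512-dimensional embedding outputs.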
Python API
Activate the Virtual Environment
Device
conda activate rknn
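Run the Example
Copy the required files to the board, then run the command below. TARGET_PLATFORM was exported on the PC, so export it on the board as well or pass the platform name (for example rk3588) directly.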
Device
python clip.py --img_model ../model/clip_images.rknn --text_model ../model/clip_text.rknn --target ${TARGET_PLATFORM}
$ python clip.py --img_model ../model/clip_images.rknn --text_model ../model/clip_text.rknn --target rk3588
/home/radxa/miniforge3/envs/rknn/lib/python3.12/site-packages/rknn/api/rknn.py:51: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81.
self.rknn_base = RKNNBase(cur_path, verbose)
I rknn-toolkit2 version: 2.3.2
I target set by user is: rk3588
W inference: The 'data_format' is not set, and its default value is 'nhwc'!
W inference: The 'data_format' is not set, and its default value is 'nhwc'!
I rknn-toolkit2 version: 2.3.2
I target set by user is: rk3588
W inference: The 'data_format' is not set, and its default value is 'nhwc'!
--> rknn clip demo result:
images: ../model/dog_224x224.jpg
text : a photo of dog
score : 0.990
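Both demos report a score for the best-matching text line. The source does not show how that score is computed; the sketch below illustrates the standard CLIP scoring scheme, which is assumed here: L2-normalize the image and text embeddings, take their cosine similarity, multiply by a logit scale (100 is the value commonly used with CLIP and is an assumption), and apply a softmax across the candidate texts. The function names and the toy 2-dimensional vectors are hypothetical; the real embeddings are 512-dimensional.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def clip_scores(image_embed, text_embeds, logit_scale=100.0):
    """Softmax over scaled cosine similarities between one image and N texts."""
    img = l2_normalize(image_embed)
    logits = []
    for text in text_embeds:
        txt = l2_normalize(text)
        logits.append(logit_scale * sum(a * b for a, b in zip(img, txt)))
    # Subtract the maximum logit before exponentiating for numerical stability.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy example: the image embedding points in the same direction as the first text.
scores = clip_scores([0.9, 0.1], [[1.0, 0.0], [0.0, 1.0]])
best = max(range(len(scores)), key=scores.__getitem__)
print(best, round(scores[best], 3))
```

Because of the softmax, the scores over all lines in text.txt sum to 1, so a value near 0.99 means the remaining candidate texts share only about 1% of the probability mass.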