Qwen2.5-VL-3B-Instruct
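This document explains how to run the Qwen2.5-VL-3B-Instruct example application on a host device with a Radxa AICore AX-M1 installed.
Quantization of the precompiled model: w8a16
Download the example application repository
Use huggingface-cli to download the example application repository.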
Host
pip3 install -U "huggingface_hub[cli]"
huggingface-cli download AXERA-TECH/Qwen2.5-VL-3B-Instruct --local-dir ./Qwen2.5-VL-3B-Instruct
cd Qwen2.5-VL-3B-Instruct
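The same download can also be scripted from Python. The sketch below assumes the `huggingface_hub` package installed by the command above and calls its `snapshot_download` function. The wrapper name `download_example` and the two constants are illustrative and are not part of the example repository.

```python
REPO_ID = "AXERA-TECH/Qwen2.5-VL-3B-Instruct"
LOCAL_DIR = "./Qwen2.5-VL-3B-Instruct"


def download_example(repo_id: str = REPO_ID, local_dir: str = LOCAL_DIR) -> str:
    """Download a snapshot of the repository and return its local path."""
    # Imported lazily so that defining this helper does not itself
    # require huggingface_hub to be importable.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id, local_dir=local_dir)
```

Calling `download_example()` should leave the files in `./Qwen2.5-VL-3B-Instruct`, the same directory the `huggingface-cli` command targets.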
Example usage
Install Python dependencies
Host
pip3 install transformers==4.53.3 jinja2==3.1.6
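Since the command pins exact versions, it can be useful to confirm what is actually installed in the active environment before starting the Tokenizer service. The following is a minimal sketch using the standard library's `importlib.metadata`. The helper name `find_mismatches` and the `PINS` table are illustrative.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, Optional

# Versions pinned by the pip3 command above.
PINS = {"transformers": "4.53.3", "jinja2": "3.1.6"}


def find_mismatches(pins: Dict[str, str]) -> Dict[str, Optional[str]]:
    """Return {package: installed_version} for each package that is
    missing (value None) or installed at a version other than the pin."""
    mismatches: Dict[str, Optional[str]] = {}
    for name, wanted in pins.items():
        try:
            installed: Optional[str] = version(name)
        except PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[name] = installed
    return mismatches
```

An empty result from `find_mismatches(PINS)` means both packages match the pinned versions.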
Start the Tokenizer service
- image
- video
Host
python3 qwen2_tokenizer_image_448.py --port 12345 > /dev/null 2>&1 &
(.venv) rock@rock-5b-plus:~/ssd/axera/Qwen2.5-VL-3B-Instruct$ python3 qwen2_tokenizer_image_448.py --port 12345
None None 151645 <|im_end|>
[151644, 8948, 198, 2610, 525, 264, 10950, 17847, 13, 151645, 198, 151644, 872, 198, 151652, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 
151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151655, 151653, 74785, 419, 2168, 13, 151645, 198, 151644, 77091, 198]
281
[151644, 8948, 198, 2610, 525, 264, 10950, 17847, 13, 151645, 198, 151644, 872, 198, 14990, 1879, 151645, 198, 151644, 77091, 198]
21
http://localhost:12345
Host
python3 qwen2_tokenizer_video_308.py --port 12345 > /dev/null 2>&1 &
(.venv) rock@rock-5b-plus:~/ssd/axera/Qwen2.5-VL-3B-Instruct$ python3 qwen2_tokenizer_video_308.py --port 12345
None None 151645 <|im_end|>
[151644, 8948, 198, 2610, 525, 264, 10950, 17847, 13, 151645, 198, 151644, 872, 198, 151652, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 
151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151656, 151653, 53481, 100158, 99487, 
87140, 104597, 151645, 198, 151644, 77091, 198]
510
[151644, 8948, 198, 2610, 525, 264, 10950, 17847, 13, 151645, 198, 151644, 872, 198, 14990, 1879, 151645, 198, 151644, 77091, 198]
21
http://localhost:12345
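Tip
To stop the Tokenizer service running in the background, run jobs to list the background job numbers, then run kill %N to end the background process, where %N is the job number reported by jobs.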
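Because the service is started in the background with its output redirected to `/dev/null`, a failed start is easy to miss. A minimal sketch of a reachability check follows. It only tests whether something accepts TCP connections on the port used above (12345) and does not verify that the listener is really the Tokenizer service. The function name `tokenizer_port_open` is illustrative.

```python
import socket


def tokenizer_port_open(host: str = "127.0.0.1", port: int = 12345,
                        timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False
```

If `tokenizer_port_open()` returns `False` right after launching the script, running the script in the foreground (without the redirection) shows its startup output.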
Model inference
- image
- video
Host
chmod +x main_axcl_aarch64
bash run_qwen2_5_vl_image_axcl_aarch64.sh
(.venv) rock@rock-5b-plus:~/ssd/axera/Qwen2.5-VL-3B-Instruct$ bash run_qwen2_5_vl_image_axcl_aarch64.sh
[I][ Init][ 162]: LLM init start
[I][ Init][ 34]: connect http://127.0.0.1:12345 ok
bos_id: -1, eos_id: 151645
img_start_token: 151652
img_context_token: 151655
2% | █ | 1 / 39 [0.00s<0.16s, 250.00 count/s] tokenizer init ok[I][ Init][ 45]: LLaMaEmbedSelector use mmap
5% | ██ | 2 / 39 [0.00s<0.08s, 500.00 count/s] embed_selector init ok
[I][ run][ 30]: AXCLWorker start with devid 0
12% | █████ | 4 / 39 [11.15s<86.99s, 0.45 count/s] init 24 axmodel ok,devid(0) remain_cmm(-1 MB) 100% | ████████████████████████████████ | 39 / 39 [54.60s<57.55s, 0.68 count/s] init post axmodel ok,remain_cmm(3220 MB)545 MB))
input size: 1
name: hidden_states [unknown] [unknown]
1 x 1024 x 392 x 3 size:1204224
output size: 1
name: hidden_states_out
256 x 2048 size:2097152
[I][ Init][ 267]: IMAGE_CONTEXT_TOKEN: 151655, IMAGE_START_TOKEN: 151652
[I][ Init][ 328]: image encoder output float32
[I][ Init][ 340]: max_token_len : 1023
[I][ Init][ 343]: kv_cache_size : 256, kv_cache_num: 1023
[I][ Init][ 351]: prefill_token_num : 128
[I][ Init][ 355]: grp: 1, prefill_max_token_num : 1
[I][ Init][ 355]: grp: 2, prefill_max_token_num : 128
[I][ Init][ 355]: grp: 3, prefill_max_token_num : 256
[I][ Init][ 355]: grp: 4, prefill_max_token_num : 384
[I][ Init][ 355]: grp: 5, prefill_max_token_num : 512
[I][ Init][ 359]: prefill_max_token_num : 512
________________________
| ID| remain cmm(MB)|
========================
| 0| 2286|
¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯
[E][ load_config][ 278]: config file(post_config.json) open failed
[W][ Init][ 452]: load postprocess config(post_config.json) failed
[I][ Init][ 456]: LLM init ok
Type "q" to exit, Ctrl+c to stop current running
prompt >> 描述下图片
image >> image/ssd_car.jpg
[I][ Encode][ 539]: image encode time : 787.804993 ms, size : 524288
[I][ Run][ 625]: input token num : 280, prefill_split_num : 3
[I][ Run][ 659]: input_num_token:128
[I][ Run][ 659]: input_num_token:128
[I][ Run][ 659]: input_num_token:24
[I][ Run][ 796]: ttft: 1964.38 ms
这张图片展示了一条繁忙的城市街道。前景中,一名女子站在人行道上,她穿着黑色外套,面带微笑。她旁边是一辆红色的双层巴士,巴士上有一个广告,上面写着“THINGS GET MORE EXITING WHEN YOU SAY ‘YES’”。巴士的车牌号是“L15”。巴士旁边停着一辆黑色的小型货车。背景中可以看到一些商店和行人,街道两旁的建筑物是现代的玻璃幕墙建筑。整体氛围显得繁忙而充满活力。
[N][ Run][ 949]: hit eos,avg 4.95 token/s
Host
chmod +x main_axcl_aarch64
bash run_qwen2_5_vl_video_axcl_aarch64.sh
(.venv) rock@rock-5b-plus:~/ssd/axera/Qwen2.5-VL-3B-Instruct$ bash run_qwen2_5_vl_video_axcl_aarch64.sh
[I][ Init][ 162]: LLM init start
[I][ Init][ 34]: connect http://127.0.0.1:12345 ok
bos_id: -1, eos_id: 151645
img_start_token: 151652
img_context_token: 151656
2% | █ | 1 / 39 [0.00s<0.12s, 333.33 count/s] tokenizer init ok[I][ Init][ 45]: LLaMaEmbedSelector use mmap
5% | ██ | 2 / 39 [0.00s<0.06s, 666.67 count/s] embed_selector init ok
[I][ run][ 30]: AXCLWorker start with devid 0
92% | ████████████████████████████████████████████████ █ ████ | 35 / 39 [49.00s<54.60s, 0.71 count/s] init 4 axmodel ok,devid(0) remain100% | ████████████████████████████████ | 39 / 39 [54.74s<56.18s, 0.69 count/s] init post axmodel ok,remain_cmm(3220 MB)3545 MB)
input size: 1
name: hidden_states [unknown] [unknown]
1 x 484 x 392 x 3 size:569184
output size: 1
name: hidden_states_out
121 x 2048 size:991232
[I][ Init][ 267]: IMAGE_CONTEXT_TOKEN: 151656, IMAGE_START_TOKEN: 151652
[I][ Init][ 328]: image encoder output float32
[I][ Init][ 340]: max_token_len : 1023
[I][ Init][ 343]: kv_cache_size : 256, kv_cache_num: 1023
[I][ Init][ 351]: prefill_token_num : 128
[I][ Init][ 355]: grp: 1, prefill_max_token_num : 1
[I][ Init][ 355]: grp: 2, prefill_max_token_num : 128
[I][ Init][ 355]: grp: 3, prefill_max_token_num : 256
[I][ Init][ 355]: grp: 4, prefill_max_token_num : 384
[I][ Init][ 355]: grp: 5, prefill_max_token_num : 512
[I][ Init][ 359]: prefill_max_token_num : 512
________________________
| ID| remain cmm(MB)|
========================
| 0| 2464|
¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯
[E][ load_config][ 278]: config file(post_config.json) open failed
[W][ Init][ 452]: load postprocess config(post_config.json) failed
[I][ Init][ 456]: LLM init ok
Type "q" to exit, Ctrl+c to stop current running
prompt >> 描述这个视频的内容
image >> video
video/frame_0000.jpg
video/frame_0008.jpg
video/frame_0016.jpg
video/frame_0024.jpg
video/frame_0032.jpg
video/frame_0040.jpg
video/frame_0048.jpg
video/frame_0056.jpg
[I][ Encode][ 539]: image encode time : 1484.723999 ms, size : 991232
[I][ Run][ 625]: input token num : 509, prefill_split_num : 4
[I][ Run][ 659]: input_num_token:128
[I][ Run][ 659]: input_num_token:128
[I][ Run][ 659]: input_num_token:128
[I][ Run][ 659]: input_num_token:125
[I][ Run][ 796]: ttft: 2931.55 ms
视频展示了两只松鼠在户外的场景。背景是模糊的山脉和蓝天,前景中有松鼠在互动。松鼠的毛色是棕色和灰色的混合,它们的爪子是橙色的。松鼠似乎在互相玩耍或争抢,它们的爪子和嘴巴都伸向对方。整个场景显得非常自然和生动。
[N][ Run][ 949]: hit eos,avg 4.89 token/s
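The `prefill_split_num` and `input_num_token` lines in the logs above are consistent with the prompt being prefilled in chunks of `prefill_token_num` (128) tokens, with a shorter final chunk. The sketch below reproduces that arithmetic only. It is inferred from the log output and is not taken from the runtime's source code, and the function name `split_prefill` is illustrative.

```python
from typing import List


def split_prefill(num_tokens: int, chunk_size: int = 128) -> List[int]:
    """Split num_tokens into full chunks of chunk_size plus a remainder."""
    if num_tokens < 0 or chunk_size <= 0:
        raise ValueError("num_tokens must be >= 0 and chunk_size > 0")
    full, rest = divmod(num_tokens, chunk_size)
    return [chunk_size] * full + ([rest] if rest else [])


print(split_prefill(280))  # image run: [128, 128, 24]
print(split_prefill(509))  # video run: [128, 128, 128, 125]
```

The chunk counts (3 and 4) match the `prefill_split_num` values reported for the image and video runs.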
Tip
Check that the port set for tokenizer_model in the run_xxx.sh script matches the port the Tokenizer service is listening on.
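This check can be automated with a few lines of Python. The sketch below assumes that the run script contains a line mentioning `tokenizer_model` together with an `http://host:port` URL. That layout is an assumption about the script, so adjust the pattern if your copy differs. The function name `tokenizer_port_from_script` is illustrative.

```python
import re
from typing import Optional
from urllib.parse import urlparse


def tokenizer_port_from_script(script_text: str) -> Optional[int]:
    """Return the port of the first URL found on a line that mentions
    'tokenizer_model', or None if no such line or port is present."""
    for line in script_text.splitlines():
        if "tokenizer_model" not in line:
            continue
        # Assumed layout: the URL appears on the same line as the option.
        match = re.search(r"https?://[^\s\"']+", line)
        if match:
            return urlparse(match.group(0)).port
    return None
```

Comparing the returned value with the `--port` argument passed to the tokenizer script (12345 in this document) shows whether the two sides agree.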
Performance reference
Model | Quantization | Host device | token/s |
---|---|---|---|
Qwen2.5-VL-3B-Instruct | w8a16 | ROCK 5B+ | 4.95 |
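To turn these figures into a rough response time, the time to first token (`ttft` in the logs) can be added to the generation time implied by the token/s rate. The sketch below assumes that the reported average token/s covers only the generation phase and that `ttft` is counted separately. The logs do not state this, so treat the result as an estimate. The function name is illustrative.

```python
def estimate_response_seconds(output_tokens: int, ttft_ms: float,
                              tokens_per_second: float) -> float:
    """Estimate total response time as ttft plus generation time."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return ttft_ms / 1000.0 + output_tokens / tokens_per_second


# Image run from the log: ttft 1964.38 ms, 4.95 token/s; 99 output tokens
# is an arbitrary example length, not a measured value.
print(round(estimate_response_seconds(99, 1964.38, 4.95), 2))
```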