PP-OCRv4

PP-OCR is an open-source, general-purpose OCR model family developed by Baidu. Its end-to-end pipeline consists of three core modules: text detection, direction classification, and text recognition. The goal is robust text extraction across a wide range of complex environments.

  • Key features: Supports high-accuracy multilingual text extraction and recognition, with strong background-noise suppression and robustness to skewed or blurry text. It is widely used in document digitization, industrial inspection, license-plate recognition, and autonomous-driving scenarios.
  • Version notes: This example uses PP-OCRv4, an advanced version in the series. It introduces a lighter yet stronger detection architecture and distillation for the recognition model, improving accuracy on small text and rare characters without additional compute overhead. It is a common lightweight choice that balances accuracy and inference speed for real-time text analysis on mobile devices.

Environment setup

You need to set up the environment in advance.

Quick start

Download model files

O6 / O6N
cd ai_model_hub_25_Q3/models/ComputeVision/OCR/onnx_PP_OCRv4
wget https://www.modelscope.cn/models/cix/ai_model_hub_25_Q3/resolve/master/models/ComputeVision/OCR/onnx_PP_OCRv4/cls.cix
wget https://www.modelscope.cn/models/cix/ai_model_hub_25_Q3/resolve/master/models/ComputeVision/OCR/onnx_PP_OCRv4/PP-OCRv4_det.cix
wget https://www.modelscope.cn/models/cix/ai_model_hub_25_Q3/resolve/master/models/ComputeVision/OCR/onnx_PP_OCRv4/rec.cix

Test the model

Note: Activate the virtual environment before running the script.

O6 / O6N
python3 inference_npu.py

Full conversion workflow

Download model files

Linux PC
cd ai_model_hub_25_Q3/models/ComputeVision/OCR/onnx_PP_OCRv4/model
wget https://www.modelscope.cn/models/cix/ai_model_hub_25_Q3/resolve/master/models/ComputeVision/OCR/onnx_PP_OCRv4/model/cls.onnx
wget https://www.modelscope.cn/models/cix/ai_model_hub_25_Q3/resolve/master/models/ComputeVision/OCR/onnx_PP_OCRv4/model/PP-OCRv4_det.onnx
wget https://www.modelscope.cn/models/cix/ai_model_hub_25_Q3/resolve/master/models/ComputeVision/OCR/onnx_PP_OCRv4/model/rec.onnx

Project structure

├── cfg
├── cls.cix
├── datasets
├── inference_npu.py
├── inference_onnx.py
├── model
├── ppocr_keys_v1.txt
├── pp_ocr.py
├── PP-OCRv4_det.cix
├── ReadMe.md
├── rec.cix
├── simfang.ttf
└── test_data
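
Before running inference, it can help to confirm that the files listed above are actually in place. The following is a minimal sketch, not part of the project: the helper name `missing_files` is hypothetical, and the choice of which files count as required (the three .cix models, the character dictionary, and the NPU script) is an assumption based on the project structure, not on what `inference_npu.py` is known to read.

```python
from pathlib import Path

# File names taken from the project structure above.
# Which of them the inference script really needs is an assumption.
REQUIRED_FILES = [
    "cls.cix",
    "PP-OCRv4_det.cix",
    "rec.cix",
    "ppocr_keys_v1.txt",
    "inference_npu.py",
]


def missing_files(project_dir, required=REQUIRED_FILES):
    """Return the required file names that are absent from project_dir."""
    root = Path(project_dir)
    return [name for name in required if not (root / name).is_file()]


if __name__ == "__main__":
    # Run from the onnx_PP_OCRv4 directory.
    missing = missing_files(".")
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All required files are present.")
```

Run it from the onnx_PP_OCRv4 directory; an empty result means every listed file was found.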

Quantize and convert the model

Convert the detection module

Linux PC
cd ..
cixbuild cfg/detbuild.cfg

Convert the classification module

Linux PC
cixbuild cfg/clsbuild.cfg

Convert the recognition module

Linux PC
cixbuild cfg/recbuild.cfg

Copy to device

After conversion, copy the .cix model files to the device.
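
The three conversions and the copy step can be collected into one script. The sketch below only prints the commands (a dry run). The config paths and .cix names come from this page; using `scp` for the copy, and the device address and target path, are assumptions to replace with your own setup.

```python
import shlex

# Config paths taken from the conversion steps above.
BUILD_CONFIGS = ["cfg/detbuild.cfg", "cfg/clsbuild.cfg", "cfg/recbuild.cfg"]
# Output file names taken from the project structure above.
CIX_FILES = ["PP-OCRv4_det.cix", "cls.cix", "rec.cix"]


def build_commands(configs=BUILD_CONFIGS):
    """Return one cixbuild invocation per module config."""
    return [["cixbuild", cfg] for cfg in configs]


def copy_command(target, files=CIX_FILES):
    """Return a single scp call that copies all .cix files to target."""
    return ["scp", *files, target]


if __name__ == "__main__":
    # Hypothetical device address and path; replace with your own.
    target = "user@device:/path/to/onnx_PP_OCRv4/"
    for cmd in build_commands() + [copy_command(target)]:
        print(shlex.join(cmd))
```

To execute the commands instead of printing them, each list can be passed to `subprocess.run(cmd, check=True)` from the project root on the Linux PC.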

Test inference on the host

Run the inference script

Linux PC
python3 inference_onnx.py

Inference output

Linux PC
$ python3 inference_onnx.py
[[[36.0, 409.0], [486.0, 386.0], [489.0, 434.0], [38.0, 457.0]], ('<sample_text_1>', 0.9942322969436646)]
[[[183.0, 453.0], [401.0, 444.0], [403.0, 485.0], [185.0, 494.0]], ('<sample_text_2>', 0.9480939507484436)]
[[[14.0, 501.0], [519.0, 483.0], [521.0, 537.0], [15.0, 555.0]], ('<sample_text_3>', 0.9961597919464111)]
[[[73.0, 550.0], [451.0, 539.0], [452.0, 576.0], [74.0, 587.0]], ('<sample_text_4>', 0.9754183292388916)]
[[[292.0, 295.0], [335.0, 294.0], [350.0, 852.0], [307.0, 853.0]], ('<sample_text_5>', 0.9570525288581848)]
[[[343.0, 298.0], [380.0, 297.0], [389.0, 665.0], [352.0, 666.0]], ('<sample_text_6>', 0.9861757755279541)]
[[[34.0, 79.0], [440.0, 82.0], [439.0, 174.0], [33.0, 171.0]], ('<sample_text_7>', 0.9949513673782349)]
[[[31.0, 183.0], [253.0, 183.0], [253.0, 243.0], [31.0, 243.0]], ('<sample_text_8>', 0.9937998652458191)]
[[[39.0, 258.0], [469.0, 258.0], [469.0, 309.0], [39.0, 309.0]], ('<sample_text_9>', 0.9810954928398132)]
[[[35.0, 325.0], [410.0, 327.0], [409.0, 382.0], [34.0, 380.0]], ('<sample_text_10>', 0.999457061290741)]
[[[34.0, 406.0], [435.0, 406.0], [435.0, 454.0], [34.0, 454.0]], ('<sample_text_11>', 0.9994476437568665)]
[[[32.0, 477.0], [341.0, 474.0], [341.0, 526.0], [32.0, 528.0]], ('<sample_text_12>', 0.9984829425811768)]
[[[32.0, 549.0], [353.0, 549.0], [353.0, 600.0], [32.0, 600.0]], ('<sample_text_13>', 0.9997670650482178)]
[[[30.0, 621.0], [263.0, 617.0], [264.0, 668.0], [31.0, 672.0]], ('<sample_text_14>', 0.9565265774726868)]
[[[33.0, 692.0], [365.0, 695.0], [364.0, 743.0], [33.0, 740.0]], ('<sample_text_15>', 0.9993946552276611)]
[[[32.0, 763.0], [499.0, 766.0], [498.0, 816.0], [32.0, 813.0]], ('<sample_text_16>', 0.9533663392066956)]
[[[38.0, 840.0], [407.0, 840.0], [407.0, 884.0], [38.0, 884.0]], ('<sample_text_17>', 0.9451590776443481)]
[[[525.0, 842.0], [690.0, 842.0], [690.0, 898.0], [525.0, 898.0]], ('<sample_text_18>', 0.9980840682983398)]
[[[34.0, 910.0], [522.0, 910.0], [522.0, 957.0], [34.0, 957.0]], ('<sample_text_19>', 0.9985333681106567)]
[[[39.0, 983.0], [536.0, 983.0], [536.0, 1027.0], [39.0, 1027.0]], ('<sample_text_20>', 0.9993751645088196)]
[[[32.0, 1051.0], [201.0, 1048.0], [202.0, 1104.0], [33.0, 1107.0]], ('<sample_text_21>', 0.9753393530845642)]
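
Each output line above has the shape `[box, (text, score)]`, where `box` holds four `[x, y]` corner points. As a sketch of post-processing, the code below parses such printed lines, drops low-confidence results, and orders the rest from top to bottom. It assumes the lines are valid Python literals, as they appear here; the function names and the 0.95 threshold are illustrative, not part of the project.

```python
import ast


def parse_line(line):
    """Parse one printed result line into (box, text, score)."""
    box, (text, score) = ast.literal_eval(line)
    return box, text, score


def filter_and_sort(lines, min_score=0.95):
    """Keep results with score >= min_score, ordered by the box's smallest y."""
    # Skip anything that is not a result line (for example log messages).
    results = [parse_line(line) for line in lines if line.startswith("[[")]
    kept = [r for r in results if r[2] >= min_score]
    return sorted(kept, key=lambda r: min(y for _, y in r[0]))


if __name__ == "__main__":
    # Three lines copied from the host output above.
    sample = [
        "[[[36.0, 409.0], [486.0, 386.0], [489.0, 434.0], [38.0, 457.0]], ('<sample_text_1>', 0.9942322969436646)]",
        "[[[183.0, 453.0], [401.0, 444.0], [403.0, 485.0], [185.0, 494.0]], ('<sample_text_2>', 0.9480939507484436)]",
        "[[[34.0, 79.0], [440.0, 82.0], [439.0, 174.0], [33.0, 171.0]], ('<sample_text_7>', 0.9949513673782349)]",
    ]
    for box, text, score in filter_and_sort(sample):
        print(f"{score:.4f}  {text}")
```

Sorting by the smallest y coordinate is a simple approximation of reading order; it does not handle vertical text or multi-column layouts.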

Deploy on NPU

Run the inference script

O6 / O6N
python3 inference_npu.py

Runtime output

O6 / O6N
$ python3 inference_npu.py
npu: noe_init_context success
npu: noe_load_graph success
Input tensor count is 1.
Output tensor count is 1.
npu: noe_create_job success
npu: noe_init_context success
npu: noe_load_graph success
Input tensor count is 1.
Output tensor count is 1.
npu: noe_create_job success
npu: noe_init_context success
npu: noe_load_graph success
Input tensor count is 1.
Output tensor count is 1.
npu: noe_create_job success
[[[36.0, 409.0], [486.0, 386.0], [489.0, 434.0], [38.0, 457.0]], ('<sample_text_1>', 0.9929969906806946)]
[[[141.0, 456.0], [403.0, 444.0], [404.0, 483.0], [143.0, 495.0]], ('<sample_text_2>', 0.862202525138855)]
[[[17.0, 505.0], [519.0, 484.0], [521.0, 535.0], [19.0, 555.0]], ('<sample_text_3>', 0.9960622787475586)]
[[[67.0, 550.0], [418.0, 539.0], [420.0, 578.0], [68.0, 590.0]], ('<sample_text_4>', 0.9729113578796387)]
[[[34.0, 78.0], [442.0, 80.0], [441.0, 174.0], [33.0, 171.0]], ('<sample_text_5>', 0.9860424399375916)]
[[[30.0, 181.0], [255.0, 181.0], [255.0, 244.0], [30.0, 244.0]], ('<sample_text_6>', 0.949313759803772)]
[[[39.0, 258.0], [478.0, 258.0], [478.0, 309.0], [39.0, 309.0]], ('<sample_text_7>', 0.9828777313232422)]
[[[36.0, 321.0], [411.0, 325.0], [411.0, 384.0], [35.0, 380.0]], ('<sample_text_8>', 0.9913207292556763)]
[[[37.0, 406.0], [432.0, 406.0], [432.0, 450.0], [37.0, 450.0]], ('<sample_text_9>', 0.9849441051483154)]
[[[31.0, 475.0], [342.0, 472.0], [342.0, 527.0], [31.0, 530.0]], ('<sample_text_10>', 0.9962107539176941)]
[[[593.0, 539.0], [623.0, 539.0], [623.0, 700.0], [593.0, 700.0]], ('ODM OEM', 0.9357462525367737)]
[[[31.0, 549.0], [353.0, 546.0], [353.0, 599.0], [31.0, 601.0]], ('<sample_text_11>', 0.9970366358757019)]
[[[29.0, 620.0], [264.0, 617.0], [264.0, 668.0], [30.0, 671.0]], ('<sample_text_12>', 0.9971547722816467)]
[[[33.0, 691.0], [367.0, 694.0], [367.0, 742.0], [33.0, 739.0]], ('<sample_text_13>', 0.9611490964889526)]
[[[33.0, 764.0], [497.0, 767.0], [497.0, 813.0], [33.0, 811.0]], ('<sample_text_14>', 0.9434943795204163)]
[[[37.0, 839.0], [409.0, 839.0], [409.0, 886.0], [37.0, 886.0]], ('<sample_text_15>', 0.9171066880226135)]
[[[526.0, 843.0], [689.0, 843.0], [689.0, 896.0], [526.0, 896.0]], ('<sample_text_16>', 0.8261211514472961)]
[[[33.0, 908.0], [522.0, 910.0], [522.0, 957.0], [33.0, 955.0]], ('<sample_text_17>', 0.9950319528579712)]
[[[39.0, 983.0], [536.0, 983.0], [536.0, 1027.0], [39.0, 1027.0]], ('<sample_text_18>', 0.9946616291999817)]
[[[34.0, 1051.0], [201.0, 1051.0], [201.0, 1103.0], [34.0, 1103.0]], ('<sample_text_19>', 0.9353836178779602)]
[[[292.0, 297.0], [335.0, 295.0], [350.0, 850.0], [307.0, 851.0]], ('<sample_text_20>', 0.976573646068573)]
[[[344.0, 299.0], [381.0, 298.0], [387.0, 662.0], [351.0, 663.0]], ('<sample_text_21>', 0.9912211298942566)]
npu: noe_clean_job success
npu: noe_unload_graph success
npu: noe_deinit_context success
npu: noe_clean_job success
npu: noe_unload_graph success
npu: noe_deinit_context success
npu: noe_clean_job success
npu: noe_unload_graph success
npu: noe_deinit_context success
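
The NPU run lists its results in a different order than the host run and reports one more text region, so the two outputs cannot be compared line by line. One way to compare them is to pair boxes by position and look at the confidence difference. The sketch below does this with a nearest-center match; the function names and the 30-pixel distance limit are assumptions chosen for illustration, and the matching is greedy, so two host boxes may map to the same NPU box.

```python
def center(box):
    """Return the mean of the four corner points of a quadrilateral box."""
    xs = [p[0] for p in box]
    ys = [p[1] for p in box]
    return sum(xs) / len(xs), sum(ys) / len(ys)


def match_results(host, npu, max_dist=30.0):
    """Pair each host result with the nearest NPU result by box center.

    host and npu are lists of (box, text, score). Returns a list of
    (host_index, npu_index or None, score_delta or None), where
    score_delta is the NPU score minus the host score.
    """
    pairs = []
    for i, (hbox, _, hscore) in enumerate(host):
        hx, hy = center(hbox)
        best, best_dist = None, max_dist
        for j, (nbox, _, _) in enumerate(npu):
            nx, ny = center(nbox)
            dist = ((hx - nx) ** 2 + (hy - ny) ** 2) ** 0.5
            if dist <= best_dist:
                best, best_dist = j, dist
        delta = None if best is None else npu[best][2] - hscore
        pairs.append((i, best, delta))
    return pairs


if __name__ == "__main__":
    # First two results from the host and NPU outputs on this page.
    host = [
        ([[36.0, 409.0], [486.0, 386.0], [489.0, 434.0], [38.0, 457.0]], "<sample_text_1>", 0.9942322969436646),
        ([[183.0, 453.0], [401.0, 444.0], [403.0, 485.0], [185.0, 494.0]], "<sample_text_2>", 0.9480939507484436),
    ]
    npu = [
        ([[36.0, 409.0], [486.0, 386.0], [489.0, 434.0], [38.0, 457.0]], "<sample_text_1>", 0.9929969906806946),
        ([[141.0, 456.0], [403.0, 444.0], [404.0, 483.0], [143.0, 495.0]], "<sample_text_2>", 0.862202525138855),
    ]
    for i, j, delta in match_results(host, npu):
        print(i, j, None if delta is None else round(delta, 4))
```

A host result with no NPU box within the distance limit is reported with `None`, which makes regions that one run missed easy to spot.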

    Radxa-docs © 2026 by Radxa Computer (Shenzhen) Co.,Ltd. is licensed under CC BY 4.0