PP-OCRv5 PyTorch Model Zoo
PyTorch weights (safetensors format) for the full PP-OCRv5 family, converted bit-exactly from the official PaddlePaddle .pdparams dynamic-graph weights. Inference outputs are identical to the original PaddleOCR down to float32 precision.
- Text Detection: 2 models (mobile / server)
- Text Recognition (base): 2 models covering Simplified Chinese / Traditional Chinese / English / Japanese
- Text Recognition (multilingual): 11 models covering 100+ languages (Korean, French, German, Russian, Arabic, Devanagari, Thai, Greek, Tamil, Telugu, etc.)
This repo contains weights + configs + dictionaries only, not inference code. For inference, use PaddleOCR2Pytorch, or follow the "Custom Python Inference" section below.
Also available: README_zh.md (Chinese version).
Repository Layout
.
├── README.md / README_zh.md
├── LICENSE                                  # Apache 2.0
├── config.json                              # Repo metadata + model index
│
├── ptocr_v5_*.safetensors                   # 15 PP-OCRv5 weights at root (stable URLs)
├── ptocr_v5_server_{det,rec}.pth            # Legacy pth copies of V5 server (kept)
│
├── configs/
│   ├── det/PP-OCRv5/
│   │   ├── PP-OCRv5_mobile_det.yml
│   │   └── PP-OCRv5_server_det.yml
│   └── rec/PP-OCRv5/
│       ├── PP-OCRv5_mobile_rec.yml          # zh / zh-Hant / en / ja
│       ├── PP-OCRv5_server_rec.yml
│       └── multi_language/                  # 11 multilingual rec yamls
│           ├── en_PP-OCRv5_mobile_rec.yaml
│           ├── korean_PP-OCRv5_mobile_rec.yml
│           ├── latin_PP-OCRv5_mobile_rec.yml        # French / German / Spanish / ... (40+ Latin-script)
│           ├── eslav_PP-OCRv5_mobile_rec.yml        # Russian / Belarusian / Ukrainian
│           ├── cyrillic_PP-OCRv5_mobile_rec.yaml    # 33 Cyrillic-script languages
│           ├── arabic_PP-OCRv5_mobile_rec.yaml      # Arabic / Persian / Uyghur / Urdu / ...
│           ├── devanagari_PP-OCRv5_mobile_rec.yaml  # Hindi / Marathi / Nepali / Sanskrit / ...
│           ├── th_PP-OCRv5_mobile_rec.yaml          # Thai
│           ├── el_PP-OCRv5_mobile_rec.yaml          # Greek
│           ├── ta_PP-OCRv5_mobile_rec.yaml          # Tamil
│           └── te_PP-OCRv5_mobile_rec.yaml          # Telugu
│
├── dicts/                                   # Character set dictionaries (required for rec)
│   ├── ppocrv5_dict.txt                     # base (zh / zh-Hant / en / ja)
│   └── ppocrv5_<lang>_dict.txt              # 11 multilingual dicts
│
└── legacy/                                  # Older PP-OCR v2/v3/v4 weights (kept for back-compat)
    ├── ch_ptocr_mobile_v2.0_cls_infer.pth
    ├── ch_ptocr_v4_det_infer.pth
    ├── ch_ptocr_v4_rec_infer.pth
    ├── en_ptocr_v3_det_infer.pth
    └── en_ptocr_v4_rec_infer.pth
All rec yamls use a relative `character_dict_path: ./dicts/...`. After `git clone` or `snapshot_download`, these paths resolve correctly relative to the repository root with no modification required.
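Because the dictionary path is relative, it resolves against the current working directory unless it is joined to the repo root explicitly. A minimal sketch of that join, assuming the YAML has already been parsed into a dict; `resolve_dict_path` is a hypothetical helper name, not part of PaddleOCR2Pytorch:

```python
from pathlib import Path


def resolve_dict_path(repo_dir, cfg):
    """Return the absolute dictionary path for a parsed rec config.

    repo_dir: local repo root (for example the value returned by snapshot_download).
    cfg: the YAML content as a dict, with Global.character_dict_path set.
    """
    rel = cfg["Global"]["character_dict_path"]  # e.g. './dicts/ppocrv5_korean_dict.txt'
    # Joining onto the repo root makes the result independent of the current directory.
    return (Path(repo_dir) / rel).resolve()


cfg = {"Global": {"character_dict_path": "./dicts/ppocrv5_korean_dict.txt"}}
print(resolve_dict_path("/path/to/hf_repo", cfg))
```

With this, inference scripts can be launched from any directory rather than only from the repo root.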
Model Catalog
Text Detection
| Weight | Config | Use case | Size |
|---|---|---|---|
| `ptocr_v5_mobile_det.safetensors` | `configs/det/PP-OCRv5/PP-OCRv5_mobile_det.yml` | Mobile / CPU-friendly | ~14 MB |
| `ptocr_v5_server_det.safetensors` | `configs/det/PP-OCRv5/PP-OCRv5_server_det.yml` | Server / high-accuracy | ~101 MB |
Text Recognition (Base)
| Weight | Config | Languages | Size |
|---|---|---|---|
| `ptocr_v5_mobile_rec.safetensors` | `configs/rec/PP-OCRv5/PP-OCRv5_mobile_rec.yml` | Simplified / Traditional Chinese, English, Japanese | ~31 MB |
| `ptocr_v5_server_rec.safetensors` | `configs/rec/PP-OCRv5/PP-OCRv5_server_rec.yml` | Same as above, higher accuracy | ~128 MB |
Text Recognition (Multilingual)
All multilingual rec models share the same architecture (SVTR_LCNet + PPLCNetV3); they differ only by character dictionary. File size: 23–28 MB each.
| Weight | Supported languages |
|---|---|
| `ptocr_v5_en_mobile_rec.safetensors` | English (dedicated model optimized for English-only scenarios) |
| `ptocr_v5_korean_mobile_rec.safetensors` | Korean, English |
| `ptocr_v5_latin_mobile_rec.safetensors` | French, German, Spanish, Italian, Portuguese, Dutch, Swedish, Danish, Norwegian, Finnish, Polish, Czech, Turkish, Vietnamese, ... (40+ Latin-script) |
| `ptocr_v5_eslav_mobile_rec.safetensors` | Russian, Belarusian, Ukrainian, English |
| `ptocr_v5_cyrillic_mobile_rec.safetensors` | 33 Cyrillic-script languages (Russian, Serbian-Cyrillic, Bulgarian, Mongolian, Kazakh, ...) |
| `ptocr_v5_arabic_mobile_rec.safetensors` | Arabic, Persian, Uyghur, Urdu, Pashto, Sindhi, ... |
| `ptocr_v5_devanagari_mobile_rec.safetensors` | 14 Devanagari-script languages (Hindi, Marathi, Nepali, Sanskrit, ...) |
| `ptocr_v5_th_mobile_rec.safetensors` | Thai, English |
| `ptocr_v5_el_mobile_rec.safetensors` | Greek, English |
| `ptocr_v5_ta_mobile_rec.safetensors` | Tamil, English |
| `ptocr_v5_te_mobile_rec.safetensors` | Telugu, English |
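The multilingual file names follow a regular pattern, except that some configs end in `.yml` and others in `.yaml`. The lookup below is a sketch built from the names in the repository layout; `multilingual_rec_files` is a hypothetical helper, and it assumes that the `<lang>` part of each dictionary name equals the model key (confirmed here only for `korean`):

```python
# Config file extensions as they appear in the repository layout (.yml vs .yaml).
MULTILINGUAL_CONFIG_EXT = {
    "en": "yaml", "korean": "yml", "latin": "yml", "eslav": "yml",
    "cyrillic": "yaml", "arabic": "yaml", "devanagari": "yaml",
    "th": "yaml", "el": "yaml", "ta": "yaml", "te": "yaml",
}


def multilingual_rec_files(lang):
    """Return repo-relative weight, config and dictionary paths for one model."""
    if lang not in MULTILINGUAL_CONFIG_EXT:
        raise KeyError(f"unknown multilingual rec model: {lang!r}")
    ext = MULTILINGUAL_CONFIG_EXT[lang]
    return {
        "weight": f"ptocr_v5_{lang}_mobile_rec.safetensors",
        "config": f"configs/rec/PP-OCRv5/multi_language/{lang}_PP-OCRv5_mobile_rec.{ext}",
        # Assumption: the <lang> part of the dictionary name equals the model key.
        "dict": f"dicts/ppocrv5_{lang}_dict.txt",
    }


print(multilingual_rec_files("korean"))
```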
Quick Start
Download Weights
from huggingface_hub import snapshot_download, hf_hub_download

# Option 1: download the whole repo (weights + configs + dicts + README)
repo_dir = snapshot_download(repo_id="JoyCN/PaddleOCR-Pytorch")
print("downloaded to:", repo_dir)

# Option 2: fetch a single weight file
weight_path = hf_hub_download(
    repo_id="JoyCN/PaddleOCR-Pytorch",
    filename="ptocr_v5_korean_mobile_rec.safetensors",
)
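A middle ground between the two options is to download only the files one model needs. The sketch below uses the `allow_patterns` argument of `snapshot_download`; the helper name `download_files` is hypothetical, and the call itself is left commented out because it needs network access:

```python
# Weight, config and dictionary for the Korean recognition model.
KOREAN_REC_FILES = [
    "ptocr_v5_korean_mobile_rec.safetensors",
    "configs/rec/PP-OCRv5/multi_language/korean_PP-OCRv5_mobile_rec.yml",
    "dicts/ppocrv5_korean_dict.txt",
]


def download_files(patterns, repo_id="JoyCN/PaddleOCR-Pytorch"):
    """Download only the files matching `patterns`; return the local snapshot directory."""
    # Imported lazily so the file list can be inspected without huggingface_hub installed.
    from huggingface_hub import snapshot_download
    return snapshot_download(repo_id=repo_id, allow_patterns=patterns)


print(KOREAN_REC_FILES)
# repo_dir = download_files(KOREAN_REC_FILES)  # uncomment to download
```

The snapshot keeps the repository's directory layout, so the relative `./dicts/...` path in the config still points at the downloaded dictionary.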
Inference via PaddleOCR2Pytorch (Recommended)
# 1. Clone the inference code repo
git clone https://github.com/frotms/PaddleOCR2Pytorch
cd PaddleOCR2Pytorch
pip install torch safetensors pyyaml shapely pyclipper opencv-python pillow scikit-image

# 2. Assume you ran snapshot_download above into /path/to/hf_repo
python tools/infer/predict_rec.py \
    --image_dir doc/imgs_words/korean/1.jpg \
    --rec_algorithm SVTR_LCNet \
    --rec_model_path /path/to/hf_repo/ptocr_v5_korean_mobile_rec.safetensors \
    --rec_yaml_path /path/to/hf_repo/configs/rec/PP-OCRv5/multi_language/korean_PP-OCRv5_mobile_rec.yml \
    --rec_image_shape "3,48,320" \
    --rec_char_dict_path /path/to/hf_repo/dicts/ppocrv5_korean_dict.txt \
    --use_gpu False
PaddleOCR2Pytorch's `base_ocr_v20.py` auto-detects `.safetensors` vs `.pth` by extension (backward compatible).
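Extension-based loading of this kind can be sketched as follows. This is an illustration of the idea, not the actual code in `base_ocr_v20.py`; the function name `load_state_dict_by_extension` is hypothetical:

```python
import os


def load_state_dict_by_extension(path):
    """Load a state dict from a .safetensors or .pth file, chosen by extension."""
    ext = os.path.splitext(path)[1].lower()
    if ext == ".safetensors":
        from safetensors.torch import load_file
        return load_file(path, device="cpu")
    if ext == ".pth":
        import torch
        # .pth files are pickle-based, so only load files from a trusted source.
        return torch.load(path, map_location="cpu")
    raise ValueError(f"unsupported weight format: {ext!r}")
```

The libraries are imported inside each branch so that a `.safetensors`-only setup never touches the pickle-based loader.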
Custom Python Inference
A minimal skeleton showing how to load the weights and run a forward pass. You still need the network definitions from the PaddleOCR2Pytorch pytorchocr/modeling/ package.
import os
import sys

import cv2
import numpy as np
import torch
import yaml
from safetensors.torch import load_file

# Requires https://github.com/frotms/PaddleOCR2Pytorch on PYTHONPATH
sys.path.insert(0, "/path/to/PaddleOCR2Pytorch")
from pytorchocr.modeling.architectures.base_model import BaseModel
from pytorchocr.postprocess import build_post_process

HF_REPO = "/path/to/hf_repo"  # the path returned by snapshot_download
yml_path = f"{HF_REPO}/configs/rec/PP-OCRv5/multi_language/korean_PP-OCRv5_mobile_rec.yml"
weight_path = f"{HF_REPO}/ptocr_v5_korean_mobile_rec.safetensors"

# 1. load config + dictionary
with open(yml_path, encoding="utf-8") as f:
    cfg = yaml.safe_load(f)
dict_path = cfg["Global"]["character_dict_path"]  # './dicts/ppocrv5_korean_dict.txt'
# Resolve the relative path against the repo root (str.lstrip('./') strips characters, not a prefix)
dict_abs = os.path.normpath(os.path.join(HF_REPO, dict_path))
with open(dict_abs, encoding="utf-8") as f:
    chars = [line.strip("\n\r") for line in f]
n_char = len(chars) + 2  # +1 blank, +1 space (if use_space_char)

# 2. build network + load weights (safetensors = zero-code-exec, mmap-fast)
cfg["Architecture"]["Head"]["out_channels_list"] = {
    "CTCLabelDecode": n_char,
    "SARLabelDecode": n_char + 2,
    "NRTRLabelDecode": n_char + 3,
}
net = BaseModel(cfg["Architecture"], out_channels=n_char)
net.load_state_dict(load_file(weight_path, device="cpu"))
net.eval()
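
# 3. preprocess (resize to height 48, normalize to [-1, 1], pad width to 320)
img = cv2.imread("input_word.jpg")
if img is None:
    raise FileNotFoundError("input_word.jpg could not be read")
h, w = img.shape[:2]
ratio = w / h
tw = max(1, min(int(48 * ratio), 320))
img = cv2.resize(img, (tw, 48))
x = img.astype(np.float32).transpose(2, 0, 1) / 255.0
x = (x - 0.5) / 0.5
# Pad after normalizing so the padding value is 0, as PaddleOCR's rec preprocessing does
canvas = np.zeros((3, 48, 320), dtype=np.float32)
canvas[:, :, :tw] = x
x = torch.from_numpy(canvas).unsqueeze(0)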

# 4. forward + CTC decode
with torch.no_grad():
    logits = net(x)
post_op = build_post_process({
    "name": "CTCLabelDecode",
    "character_dict_path": dict_abs,
    "use_space_char": True,
})
result = post_op(logits)
print("prediction:", result)  # e.g. [('바탕으로', 0.9998)]
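For readers unfamiliar with the decode step, the standard greedy CTC rule is: take the argmax at each timestep, collapse consecutive repeats, then drop blanks. The sketch below shows that rule in plain Python. It is an illustration, not the library's `CTCLabelDecode` code; it assumes index 0 is the blank, followed by the dictionary characters and then the space, which matches the `len(chars) + 2` class count used above:

```python
def ctc_greedy_decode(probs, charset, blank=0):
    """Greedy CTC decode for one sequence.

    probs: per-timestep class probabilities, shape [T][num_classes].
    charset: list mapping class index -> character; charset[blank] is unused.
    Returns (text, mean probability of the kept timesteps).
    """
    text, confs = [], []
    prev = None
    for step in probs:
        idx = max(range(len(step)), key=step.__getitem__)  # argmax over classes
        # Keep a symbol only if it is not blank and not a repeat of the previous step.
        if idx != blank and idx != prev:
            text.append(charset[idx])
            confs.append(step[idx])
        prev = idx
    score = sum(confs) / len(confs) if confs else 0.0
    return "".join(text), score


# Index 0 is the CTC blank; dictionary characters follow; the space comes last.
charset = ["<blank>", "a", "b", " "]
probs = [
    [0.1, 0.8, 0.1, 0.0],  # a
    [0.1, 0.7, 0.2, 0.0],  # a again (repeat, collapsed)
    [0.9, 0.1, 0.0, 0.0],  # blank
    [0.2, 0.6, 0.2, 0.0],  # a (kept, because a blank separates it from the first a)
    [0.0, 0.1, 0.9, 0.0],  # b
]
print(ctc_greedy_decode(probs, charset))
```

The blank is what lets the model emit a doubled letter: without it, the two runs of `a` would collapse into one.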
Runtime Dependencies
torch >= 1.13
safetensors >= 0.4
numpy, pillow, opencv-python
pyyaml, shapely, pyclipper
scikit-image # required by det post-processing
Conversion & Verification
- Source weights: official PaddlePaddle `.pdparams` from `https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/`
- Converter: PaddleOCR2Pytorch, scripts `converter/ppocr_v5_det_converter.py` / `converter/ppocr_v5_rec_converter.py`
- Verification: end-to-end inference was run on macOS Apple Silicon (M-series) CPU; multilingual rec outputs are bit-exact with the original PaddleOCR `.pdparams` (float32 values match to 8 decimal places).
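A check of this kind can be sketched as below. This is not the script used for the verification above; it is a standard-library illustration that compares two flattened output sequences by float32 bit pattern and by maximum absolute difference, and the function names are hypothetical:

```python
import struct


def float32_bits(x):
    """Return the IEEE-754 float32 bit pattern of x as an integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def compare_outputs(a, b):
    """Compare two equal-length flat sequences of floats.

    Returns (bit_exact, max_abs_diff), where bit_exact means every pair
    has the same float32 bit pattern.
    """
    if len(a) != len(b):
        raise ValueError("outputs differ in length")
    bit_exact = all(float32_bits(x) == float32_bits(y) for x, y in zip(a, b))
    max_abs_diff = max((abs(x - y) for x, y in zip(a, b)), default=0.0)
    return bit_exact, max_abs_diff


print(compare_outputs([0.5, 0.25], [0.5, 0.25]))
```

To compare model outputs, flatten each array first, for example with `array.ravel().tolist()` on a NumPy array.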
Sample inference results (CPU, < 0.7 s / image):
| Sample | Prediction | Confidence |
|---|---|---|
| Chinese `word_1.jpg` | 韩国小馆 | 0.99797755 |
| Korean `korean/1.jpg` | 바탕으로 | 0.99977183 |
| French `french/1.jpg` | de l'amendement, | 0.99656343 |
| Arabic `arabic/ar_1.jpg` | Ψ§ΩΩΩΨ΅ΩΨ§ΩΩ | 0.68281130 |
Legacy Files (legacy/)
Older PP-OCR (v2 / v3 / v4) checkpoints previously at the repo root have been moved into legacy/ for clarity. They are still present and continue to work; just add the legacy/ prefix to your path.
If you previously used any of these files at the repo root, they are now at:
legacy/ch_ptocr_mobile_v2.0_cls_infer.pth
legacy/ch_ptocr_v4_det_infer.pth
legacy/ch_ptocr_v4_rec_infer.pth
legacy/en_ptocr_v3_det_infer.pth
legacy/en_ptocr_v4_rec_infer.pth
The 15 PP-OCRv5 safetensors files remain at the repo root; their URLs did not change.
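For scripts that still reference the old root-level names, the move can be handled with a small mapping. A sketch, with `current_path` as a hypothetical helper name:

```python
# Files that moved from the repo root into legacy/.
LEGACY_FILES = {
    "ch_ptocr_mobile_v2.0_cls_infer.pth",
    "ch_ptocr_v4_det_infer.pth",
    "ch_ptocr_v4_rec_infer.pth",
    "en_ptocr_v3_det_infer.pth",
    "en_ptocr_v4_rec_infer.pth",
}


def current_path(filename):
    """Return the current repo-relative path for a file name formerly at the root."""
    return f"legacy/{filename}" if filename in LEGACY_FILES else filename


print(current_path("ch_ptocr_v4_det_infer.pth"))
print(current_path("ptocr_v5_mobile_det.safetensors"))
```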
License & Credits
- License: Apache License 2.0
- Weights originate from PaddleOCR by the PaddlePaddle team (Apache 2.0).
- Converted with PaddleOCR2Pytorch (Apache 2.0).
If this repo helps you, please also star both of those original projects.
Citation
@misc{pp_ocrv5_pytorch_joycn_2025,
title = {PP-OCRv5 PyTorch Model Zoo},
author = {JoyCN},
howpublished = {\url{https://huggingface.co/JoyCN/PaddleOCR-Pytorch}},
year = {2025}
}