PP-OCRv5 PyTorch Model Zoo

PyTorch weights (safetensors format) for the full PP-OCRv5 family, converted bit-exactly from the official PaddlePaddle .pdparams dynamic-graph weights. Inference outputs are identical to the original PaddleOCR down to float32 precision.

  • Text Detection: 2 models (mobile / server)
  • Text Recognition (base): 2 models covering Simplified Chinese / Traditional Chinese / English / Japanese
  • Text Recognition (multilingual): 11 models covering 100+ languages (Korean, French, German, Russian, Arabic, Devanagari, Thai, Greek, Tamil, Telugu, etc.)

This repo contains weights + configs + dictionaries only, not inference code. For inference, use PaddleOCR2Pytorch, or follow the "Custom Python Inference" section below.

Also available: README_zh.md (Chinese version, 中文版).


Repository Layout

.
├── README.md / README_zh.md
├── LICENSE                                                 # Apache 2.0
├── config.json                                             # Repo metadata + model index
│
├── ptocr_v5_*.safetensors                                  # 15 PP-OCRv5 weights at root (stable URLs)
├── ptocr_v5_server_{det,rec}.pth                           # Legacy pth copies of V5 server (kept)
│
├── configs/
│   ├── det/PP-OCRv5/
│   │   ├── PP-OCRv5_mobile_det.yml
│   │   └── PP-OCRv5_server_det.yml
│   └── rec/PP-OCRv5/
│       ├── PP-OCRv5_mobile_rec.yml                         # zh / zh-Hant / en / ja
│       ├── PP-OCRv5_server_rec.yml
│       └── multi_language/                                 # 11 multilingual rec yamls
│           ├── en_PP-OCRv5_mobile_rec.yaml
│           ├── korean_PP-OCRv5_mobile_rec.yml
│           ├── latin_PP-OCRv5_mobile_rec.yml               # French / German / Spanish / ...  (40+ Latin-script)
│           ├── eslav_PP-OCRv5_mobile_rec.yml               # Russian / Belarusian / Ukrainian
│           ├── cyrillic_PP-OCRv5_mobile_rec.yaml           # 33 Cyrillic-script languages
│           ├── arabic_PP-OCRv5_mobile_rec.yaml             # Arabic / Persian / Uyghur / Urdu / ...
│           ├── devanagari_PP-OCRv5_mobile_rec.yaml         # Hindi / Marathi / Nepali / Sanskrit / ...
│           ├── th_PP-OCRv5_mobile_rec.yaml                 # Thai
│           ├── el_PP-OCRv5_mobile_rec.yaml                 # Greek
│           ├── ta_PP-OCRv5_mobile_rec.yaml                 # Tamil
│           └── te_PP-OCRv5_mobile_rec.yaml                 # Telugu
│
├── dicts/                                                  # Character set dictionaries (required for rec)
│   ├── ppocrv5_dict.txt                                    # base (zh / zh-Hant / en / ja)
│   └── ppocrv5_<lang>_dict.txt                             # 11 multilingual dicts
│
└── legacy/                                                 # Older PP-OCR v2/v3/v4 weights (kept for back-compat)
    ├── ch_ptocr_mobile_v2.0_cls_infer.pth
    ├── ch_ptocr_v4_det_infer.pth
    ├── ch_ptocr_v4_rec_infer.pth
    ├── en_ptocr_v3_det_infer.pth
    └── en_ptocr_v4_rec_infer.pth

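All rec yamls use a relative character_dict_path (./dicts/...). After git clone or snapshot_download, these paths work without modification as long as they are resolved against the repo root (for example, by running from that directory or joining them to the download path).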
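A relative path is only meaningful once it is joined to a base directory. The following is a minimal sketch of that join, assuming the config has already been parsed into a dict with a Global.character_dict_path entry; resolve_dict_path is a hypothetical helper name, not part of this repo or of any library.

```python
from pathlib import Path

def resolve_dict_path(repo_dir, cfg):
    """Join the config's relative character_dict_path onto the repo root.

    Hypothetical helper for illustration; absolute paths are returned unchanged.
    """
    rel = Path(cfg["Global"]["character_dict_path"])  # e.g. ./dicts/ppocrv5_korean_dict.txt
    if rel.is_absolute():
        return rel
    # pathlib drops the leading "./" component when joining
    return Path(repo_dir) / rel

cfg = {"Global": {"character_dict_path": "./dicts/ppocrv5_korean_dict.txt"}}
print(resolve_dict_path("/path/to/hf_repo", cfg))
```

Passing the directory returned by snapshot_download as repo_dir keeps the result independent of the current working directory.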


Model Catalog

Text Detection

| Weight | Config | Use case | Size |
| --- | --- | --- | --- |
| ptocr_v5_mobile_det.safetensors | configs/det/PP-OCRv5/PP-OCRv5_mobile_det.yml | Mobile / CPU-friendly | ~14 MB |
| ptocr_v5_server_det.safetensors | configs/det/PP-OCRv5/PP-OCRv5_server_det.yml | Server / high-accuracy | ~101 MB |

Text Recognition (Base)

| Weight | Config | Languages | Size |
| --- | --- | --- | --- |
| ptocr_v5_mobile_rec.safetensors | configs/rec/PP-OCRv5/PP-OCRv5_mobile_rec.yml | Simplified / Traditional Chinese, English, Japanese | ~31 MB |
| ptocr_v5_server_rec.safetensors | configs/rec/PP-OCRv5/PP-OCRv5_server_rec.yml | Same as above, higher accuracy | ~128 MB |

Text Recognition (Multilingual)

All multilingual rec models share the same architecture (SVTR_LCNet + PPLCNetV3); they differ only in their character dictionary, and therefore in the number of output classes. Each file is 23–28 MB.

| Weight | Supported languages |
| --- | --- |
| ptocr_v5_en_mobile_rec.safetensors | English (dedicated model optimized for English-only scenarios) |
| ptocr_v5_korean_mobile_rec.safetensors | Korean, English |
| ptocr_v5_latin_mobile_rec.safetensors | French, German, Spanish, Italian, Portuguese, Dutch, Swedish, Danish, Norwegian, Finnish, Polish, Czech, Turkish, Vietnamese, ... (40+ Latin-script) |
| ptocr_v5_eslav_mobile_rec.safetensors | Russian, Belarusian, Ukrainian, English |
| ptocr_v5_cyrillic_mobile_rec.safetensors | 33 Cyrillic-script languages (Russian, Serbian-Cyrillic, Bulgarian, Mongolian, Kazakh, ...) |
| ptocr_v5_arabic_mobile_rec.safetensors | Arabic, Persian, Uyghur, Urdu, Pashto, Sindhi, ... |
| ptocr_v5_devanagari_mobile_rec.safetensors | 14 Devanagari-script languages (Hindi, Marathi, Nepali, Sanskrit, ...) |
| ptocr_v5_th_mobile_rec.safetensors | Thai, English |
| ptocr_v5_el_mobile_rec.safetensors | Greek, English |
| ptocr_v5_ta_mobile_rec.safetensors | Tamil, English |
| ptocr_v5_te_mobile_rec.safetensors | Telugu, English |

Quick Start

Download Weights

from huggingface_hub import snapshot_download, hf_hub_download

# Option 1: download the whole repo (weights + configs + dicts + README)
repo_dir = snapshot_download(repo_id="JoyCN/PaddleOCR-Pytorch")
print("downloaded to:", repo_dir)

# Option 2: fetch a single weight file
weight_path = hf_hub_download(
    repo_id="JoyCN/PaddleOCR-Pytorch",
    filename="ptocr_v5_korean_mobile_rec.safetensors",
)
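
Between the two options there is a middle ground: fetching only the weight, config, and dictionary for one language. The sketch below assumes the file-naming scheme shown in the repository layout holds for every multilingual model; patterns_for and download_language are hypothetical helper names, while allow_patterns is an existing snapshot_download parameter.

```python
def patterns_for(lang):
    """Glob patterns for one multilingual rec model (weight + config + dict).

    Hypothetical helper; the naming scheme is inferred from the repo layout.
    """
    return [
        f"ptocr_v5_{lang}_mobile_rec.safetensors",
        # configs use either .yml or .yaml, so the pattern matches both
        f"configs/rec/PP-OCRv5/multi_language/{lang}_PP-OCRv5_mobile_rec.y*ml",
        f"dicts/ppocrv5_{lang}_dict.txt",
    ]

def download_language(lang, repo_id="JoyCN/PaddleOCR-Pytorch"):
    """Download only the files for `lang`; returns the local snapshot directory."""
    from huggingface_hub import snapshot_download  # imported lazily
    return snapshot_download(repo_id=repo_id, allow_patterns=patterns_for(lang))
```

For example, download_language("korean") should leave the relative directory structure intact, so the dictionary path in the config still lines up with the downloaded dicts/ folder.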

Inference via PaddleOCR2Pytorch (Recommended)

# 1. clone the inference code repo
git clone https://github.com/frotms/PaddleOCR2Pytorch
cd PaddleOCR2Pytorch
pip install torch safetensors pyyaml shapely pyclipper opencv-python pillow scikit-image

# 2. Assume you ran snapshot_download above into /path/to/hf_repo
python tools/infer/predict_rec.py \
  --image_dir doc/imgs_words/korean/1.jpg \
  --rec_algorithm SVTR_LCNet \
  --rec_model_path /path/to/hf_repo/ptocr_v5_korean_mobile_rec.safetensors \
  --rec_yaml_path  /path/to/hf_repo/configs/rec/PP-OCRv5/multi_language/korean_PP-OCRv5_mobile_rec.yml \
  --rec_image_shape "3,48,320" \
  --rec_char_dict_path /path/to/hf_repo/dicts/ppocrv5_korean_dict.txt \
  --use_gpu False

In PaddleOCR2Pytorch, base_ocr_v20.py picks the loader from the file extension (.safetensors or .pth), so existing .pth checkpoints still load.
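
Extension-based loading can be pictured roughly as follows. This is an illustrative sketch only, not the code in base_ocr_v20.py; weight_format and load_state_dict_any are hypothetical names, and the heavy imports are deferred so the dispatch logic can be read on its own.

```python
from pathlib import Path

def weight_format(path):
    """Classify a checkpoint by its file extension."""
    suffix = Path(path).suffix.lower()
    if suffix == ".safetensors":
        return "safetensors"
    if suffix == ".pth":
        return "torch"
    raise ValueError(f"unsupported weight file: {path}")

def load_state_dict_any(path):
    """Return a state dict from either format, always on CPU."""
    if weight_format(path) == "safetensors":
        from safetensors.torch import load_file
        return load_file(path, device="cpu")
    import torch
    return torch.load(path, map_location="cpu")
```

Either branch yields a mapping that can be passed to load_state_dict, which is why the same inference command works for both file types.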

Custom Python Inference

A minimal skeleton showing how to load the weights and run a forward pass. You still need the network definitions from the PaddleOCR2Pytorch pytorchocr/modeling/ package.

import sys, numpy as np, cv2, torch, yaml
from safetensors.torch import load_file

# Requires https://github.com/frotms/PaddleOCR2Pytorch on PYTHONPATH
sys.path.insert(0, "/path/to/PaddleOCR2Pytorch")
from pytorchocr.modeling.architectures.base_model import BaseModel
from pytorchocr.postprocess import build_post_process

HF_REPO = "/path/to/hf_repo"   # the path returned by snapshot_download
yml_path    = f"{HF_REPO}/configs/rec/PP-OCRv5/multi_language/korean_PP-OCRv5_mobile_rec.yml"
weight_path = f"{HF_REPO}/ptocr_v5_korean_mobile_rec.safetensors"

# 1. load config + dictionary
with open(yml_path, encoding="utf-8") as f:
    cfg = yaml.safe_load(f)
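dict_path = cfg["Global"]["character_dict_path"]          # './dicts/ppocrv5_korean_dict.txt'
# drop only a leading "./"; str.lstrip("./") would strip any run of '.' and '/' characters
rel_path  = dict_path[2:] if dict_path.startswith("./") else dict_path
dict_abs  = f"{HF_REPO}/{rel_path}"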
with open(dict_abs, encoding="utf-8") as f:
    chars = [l.strip("\n\r") for l in f]
n_char = len(chars) + 2                                   # +1 blank, +1 space (if use_space_char)

# 2. build network + load weights (safetensors loads without executing pickled code and is memory-mapped)
cfg["Architecture"]["Head"]["out_channels_list"] = {
    "CTCLabelDecode": n_char,
    "SARLabelDecode": n_char + 2,
    "NRTRLabelDecode": n_char + 3,
}
net = BaseModel(cfg["Architecture"], out_channels=n_char)
net.load_state_dict(load_file(weight_path, device="cpu"))
net.eval()

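# 3. preprocess (resize to height 48, normalize to [-1, 1], pad width to 320)
img = cv2.imread("input_word.jpg")                        # BGR, uint8
if img is None:
    raise FileNotFoundError("could not read input_word.jpg")
h, w = img.shape[:2]
tw = max(1, min(int(np.ceil(48 * w / h)), 320))           # keep aspect ratio, cap at 320
img = cv2.resize(img, (tw, 48))
x = img.astype(np.float32).transpose(2, 0, 1) / 255.0     # HWC -> CHW, [0, 1]
x = (x - 0.5) / 0.5                                       # [-1, 1]
# pad after normalizing so the padding value is 0, as in PaddleOCR's resize_norm_img
canvas = np.zeros((3, 48, 320), dtype=np.float32)
canvas[:, :, :tw] = x
x = torch.from_numpy(canvas).unsqueeze(0)                 # [1, 3, 48, 320]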

# 4. forward + CTC decode
with torch.no_grad():
    logits = net(x)
post_op = build_post_process({
    "name": "CTCLabelDecode",
    "character_dict_path": dict_abs,
    "use_space_char": True,
})
result = post_op(logits)
print("prediction:", result)          # e.g. [('바탕으로', 0.9998)]
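
To make the decode step less opaque, here is a dependency-free sketch of greedy CTC decoding: take the argmax per timestep, collapse consecutive repeats, and drop the blank. It assumes the blank sits at class index 0 and that the confidence is the mean score of the kept characters; ctc_greedy_decode is a hypothetical name for illustration and is not the CTCLabelDecode implementation.

```python
def ctc_greedy_decode(probs, charset, blank=0):
    """Greedy CTC decoding for one sequence.

    probs:   per-timestep score lists, shape [T][num_classes]
    charset: characters for class indices 1..N (index 0 is the blank here)
    Returns (text, mean score of the kept characters).
    """
    text, scores = [], []
    prev = blank
    for step in probs:
        idx = max(range(len(step)), key=step.__getitem__)  # argmax over classes
        # keep a class only if it is not blank and not a repeat of the previous step
        if idx != blank and idx != prev:
            text.append(charset[idx - 1])
            scores.append(step[idx])
        prev = idx
    conf = sum(scores) / len(scores) if scores else 0.0
    return "".join(text), conf

# Toy example with 3 classes: blank, "a", "b"
probs = [
    [0.1, 0.8, 0.1],    # a
    [0.1, 0.7, 0.2],    # a again (repeat, collapsed)
    [0.9, 0.05, 0.05],  # blank
    [0.2, 0.7, 0.1],    # a (kept: a blank separated it from the previous a)
    [0.1, 0.1, 0.8],    # b
]
text, conf = ctc_greedy_decode(probs, ["a", "b"])
print(text, round(conf, 4))
```

The blank between the second and third "a" is what lets CTC emit a doubled character; without it the two would merge into one.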

Runtime Dependencies

torch        >= 1.13
safetensors  >= 0.4
numpy, pillow, opencv-python
pyyaml, shapely, pyclipper
scikit-image                # required by det post-processing

Conversion & Verification

  • Source weights: official PaddlePaddle .pdparams from
    https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/
  • Converter: PaddleOCR2Pytorch, scripts converter/ppocr_v5_det_converter.py and converter/ppocr_v5_rec_converter.py
  • Verification: end-to-end inference was run on macOS Apple Silicon (M-series) CPU; multilingual rec outputs are bit-exact with the original PaddleOCR .pdparams (float32 values match to 8 decimal places).
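
A bit-exactness check of this kind can be sketched without any framework. The snippet below is an assumption about how such a comparison might be done, not the script used for the verification above; it takes two flat sequences of output values, rounds each to float32, and compares the raw bit patterns. float32_bits and compare_outputs are hypothetical names.

```python
import struct

def float32_bits(x):
    """Raw 32-bit pattern of x after rounding to float32."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def compare_outputs(ref, out):
    """Compare two equal-length flat sequences of floats.

    Returns (bit_exact, max_abs_diff).
    """
    if len(ref) != len(out):
        raise ValueError("length mismatch")
    bit_exact = all(float32_bits(a) == float32_bits(b) for a, b in zip(ref, out))
    max_abs_diff = max((abs(a - b) for a, b in zip(ref, out)), default=0.0)
    return bit_exact, max_abs_diff

print(compare_outputs([1.0, 0.25], [1.0, 0.25]))
print(compare_outputs([0.25], [0.2500001])[0])
```

With NumPy arrays, np.array_equal gives a similar equality check directly, while the maximum absolute difference is the more informative number when the outputs are close but not identical.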

Sample inference results (CPU, < 0.7 s / image):

| Sample | Prediction | Confidence |
| --- | --- | --- |
| Chinese word_1.jpg | 韩国小馆 | 0.99797755 |
| Korean korean/1.jpg | 바탕으로 | 0.99977183 |
| French french/1.jpg | de l'amendement, | 0.99656343 |
| Arabic arabic/ar_1.jpg | الكيصياوي | 0.68281130 |

Legacy Files (legacy/)

Older PP-OCR (v2 / v3 / v4) checkpoints previously at the repo root have been moved into legacy/ for clarity. They are still present and continue to work; just add the legacy/ prefix to your path.

If you previously used any of these files at the repo root, they are now at:

legacy/ch_ptocr_mobile_v2.0_cls_infer.pth
legacy/ch_ptocr_v4_det_infer.pth
legacy/ch_ptocr_v4_rec_infer.pth
legacy/en_ptocr_v3_det_infer.pth
legacy/en_ptocr_v4_rec_infer.pth

The 15 PP-OCRv5 safetensors files remain at the repo root; their URLs did not change.


License & Credits

  • License: Apache License 2.0
  • Weights originate from PaddleOCR by the PaddlePaddle team (Apache 2.0).
  • Converted with PaddleOCR2Pytorch (Apache 2.0).

If this repo helps you, please also star both of those original projects.


Citation

@misc{pp_ocrv5_pytorch_joycn_2025,
  title        = {PP-OCRv5 PyTorch Model Zoo},
  author       = {JoyCN},
  howpublished = {\url{https://huggingface.co/JoyCN/PaddleOCR-Pytorch}},
  year         = {2025}
}