CRAFT: Character-Region Awareness For Text detection

November 20, 2024

Burn implementation of CRAFT text detector | Paper | Pretrained Model | Supplementary

Youngmin Baek, Bado Lee, Dongyoon Han, Sangdoo Yun, Hwalsuk Lee.

Clova AI Research, NAVER Corp.

Adapted by Genna Wingert

Overview

Burn implementation of the CRAFT text detector, which detects text areas by exploring each character region and the affinity between characters. Text bounding boxes are obtained by finding minimum bounding rectangles on the binary map produced by thresholding the character region and affinity scores.
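The thresholding and box-finding step can be sketched in plain Rust. This is a minimal illustration, not the code used in this repository: the names `BBox` and `detect_boxes` are hypothetical, the score maps are assumed to be row-major `f32` slices, and axis-aligned boxes over 4-connected components stand in for the minimum-area (possibly rotated) rectangles described above.

```rust
/// Axis-aligned bounding box in pixel coordinates (inclusive bounds).
/// Hypothetical type used only for this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct BBox {
    pub x_min: usize,
    pub y_min: usize,
    pub x_max: usize,
    pub y_max: usize,
}

/// Threshold the region and affinity score maps, merge them into one
/// binary map, and return one box per 4-connected component.
/// Both maps are row-major slices of length `width * height`.
pub fn detect_boxes(
    region: &[f32],
    affinity: &[f32],
    width: usize,
    height: usize,
    region_thresh: f32,
    affinity_thresh: f32,
) -> Vec<BBox> {
    assert_eq!(region.len(), width * height);
    assert_eq!(affinity.len(), width * height);

    // A pixel belongs to text if either score exceeds its threshold.
    let binary: Vec<bool> = region
        .iter()
        .zip(affinity)
        .map(|(&r, &a)| r > region_thresh || a > affinity_thresh)
        .collect();

    let mut visited = vec![false; width * height];
    let mut boxes = Vec::new();

    for start in 0..width * height {
        if !binary[start] || visited[start] {
            continue;
        }
        // Flood-fill one component with an explicit stack,
        // growing its bounding box as pixels are visited.
        let (sx, sy) = (start % width, start / width);
        let mut bbox = BBox { x_min: sx, y_min: sy, x_max: sx, y_max: sy };
        let mut stack = vec![start];
        visited[start] = true;

        while let Some(idx) = stack.pop() {
            let (x, y) = (idx % width, idx / width);
            bbox.x_min = bbox.x_min.min(x);
            bbox.x_max = bbox.x_max.max(x);
            bbox.y_min = bbox.y_min.min(y);
            bbox.y_max = bbox.y_max.max(y);

            // Collect the in-bounds 4-neighbours of the current pixel.
            let mut neighbours = Vec::with_capacity(4);
            if x > 0 { neighbours.push(idx - 1); }
            if x + 1 < width { neighbours.push(idx + 1); }
            if y > 0 { neighbours.push(idx - width); }
            if y + 1 < height { neighbours.push(idx + width); }

            for n in neighbours {
                if binary[n] && !visited[n] {
                    visited[n] = true;
                    stack.push(n);
                }
            }
        }
        boxes.push(bbox);
    }
    boxes
}
```

Because affinity pixels bridge neighbouring characters, raising `affinity_thresh` tends to split a word into separate character boxes, while lowering it merges characters into word-level boxes.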

Adapted from CRAFT-pytorch

Polygon processing is not yet implemented.

Getting started

Training

The code for training is not included in this repository, as the original authors cannot release the full training code for IP reasons.

Test instructions using a pretrained model

  • Download the trained models (converted because originals use a legacy format)

    Model name  | Used datasets         | Languages | Purpose                     | Model Link
    General     | SynthText, IC13, IC17 | Eng + MLT | For general purpose         | Click
    IC15        | SynthText, IC15       | Eng       | For IC15 only               | Click
    LinkRefiner | CTW1500               | -         | Used with the General Model | Click
  • Run with pretrained model

cargo run --example test-craft --release -- --trained_model=[weightfile] --test_image=[path to test image]

The result image and score maps will be saved to ./result by default.

Arguments

  • --trained_model: pretrained model
  • --text_threshold: text confidence threshold
  • --low_text: text low-bound score
  • --link_threshold: link confidence threshold
  • --backend: backend to use for inference (default: wgpu)
  • --max_size: max image size for inference
  • --mag_ratio: image magnification ratio
  • --test_file: file path to input image
  • --refine: use link refiner for sentence-level dataset
  • --refiner_model: pretrained refiner model
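
To illustrate how --mag_ratio and --max_size might interact, here is a small sketch of the input-size computation. It assumes this port mirrors the preprocessing in the upstream CRAFT-pytorch code (scale the longer side by the magnification ratio, cap it at the maximum size, then pad each side up to a multiple of 32); that assumption has not been checked against this repository, and `ResizePlan` and `plan_resize` are hypothetical names.

```rust
/// Sizes produced by the resize step. Hypothetical type for this sketch.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct ResizePlan {
    /// Scale factor applied to both image sides.
    pub ratio: f32,
    /// Image size after scaling, before padding.
    pub scaled_h: usize,
    pub scaled_w: usize,
    /// Canvas size after padding each side up to a multiple of 32.
    pub padded_h: usize,
    pub padded_w: usize,
}

/// Compute the network input size for an image of `height` x `width`.
/// ASSUMPTION: follows the upstream CRAFT-pytorch resize logic.
pub fn plan_resize(height: usize, width: usize, mag_ratio: f32, max_size: usize) -> ResizePlan {
    assert!(height > 0 && width > 0, "image must be non-empty");

    // Magnify the longer side, but never beyond max_size.
    let longest = height.max(width) as f32;
    let target = (mag_ratio * longest).min(max_size as f32);
    let ratio = target / longest;

    // Scale both sides by the same ratio to keep the aspect ratio.
    let scaled_h = (height as f32 * ratio) as usize;
    let scaled_w = (width as f32 * ratio) as usize;

    // Round each side up to the next multiple of 32.
    let pad32 = |v: usize| (v + 31) / 32 * 32;

    ResizePlan {
        ratio,
        scaled_h,
        scaled_w,
        padded_h: pad32(scaled_h),
        padded_w: pad32(scaled_w),
    }
}
```

Under these assumptions, a 720x1280 image with a magnification ratio of 1.5 and a maximum size of 1280 is not enlarged at all, because the cap takes effect before the magnification does; only the height is padded, from 720 to 736.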