April 7, 2023

News

  • PSENet is included in MMOCR.
  • We have upgraded PSENet from Python 2 to Python 3. You can find the old version here.
  • We have implemented PSENet using Paddle. Visit it here.
  • You can find the code of PAN here.
  • Another group has also implemented PSENet using Paddle. You can visit it here, or try it online with the environment already set up here.

Introduction

Official PyTorch implementation of PSENet [1].

[1] W. Wang, E. Xie, X. Li, W. Hou, T. Lu, G. Yu, and S. Shao. Shape robust text detection with progressive scale expansion network. In Proc. IEEE Conf. Comp. Vis. Patt. Recogn., pages 9336–9345, 2019.

Recommended environment

Python 3.6+
PyTorch 1.1.0
torchvision 0.3
mmcv 0.2.12
editdistance
Polygon3
pyclipper
opencv-python 3.4.2.17
Cython
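
To see how an existing environment compares with the list above, a small report script can help. The following is a minimal sketch, not part of this repository: the function name `check_environment` is hypothetical, the pip distribution names (for example `torch` for PyTorch) are assumptions, and `importlib.metadata` requires Python 3.8 or newer.

```python
from importlib import metadata

# Versions copied from the list above; None means no version is pinned there.
# The keys are assumed pip distribution names.
RECOMMENDED = {
    "torch": "1.1.0",
    "torchvision": "0.3",
    "mmcv": "0.2.12",
    "editdistance": None,
    "Polygon3": None,
    "pyclipper": None,
    "opencv-python": "3.4.2.17",
    "Cython": None,
}


def check_environment(recommended=RECOMMENDED):
    """Return {package: (installed_version_or_None, recommended_version_or_None)}."""
    report = {}
    for name, wanted in recommended.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (installed, wanted)
    return report


if __name__ == "__main__":
    for name, (installed, wanted) in check_environment().items():
        print(f"{name}: installed={installed or 'missing'}, recommended={wanted or 'any'}")
```

The script only reports versions and does not enforce them, since newer releases of some packages may or may not work with this code.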

Install

pip install -r requirement.txt
./compile.sh

Training

CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py ${CONFIG_FILE}

For example:

CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py config/psenet/psenet_r50_ic15_736.py

Test

python test.py ${CONFIG_FILE} ${CHECKPOINT_FILE}

For example:

python test.py config/psenet/psenet_r50_ic15_736.py checkpoints/psenet_r50_ic15_736/checkpoint.pth.tar

Speed

python test.py ${CONFIG_FILE} ${CHECKPOINT_FILE} --report_speed

For example:

python test.py config/psenet/psenet_r50_ic15_736.py checkpoints/psenet_r50_ic15_736/checkpoint.pth.tar --report_speed
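
When scripting several test or speed runs, the command line can be assembled programmatically. The sketch below is illustrative only: `checkpoint_for` and `build_test_command` are hypothetical helpers, and the `checkpoints/<config name>/checkpoint.pth.tar` layout is an assumption inferred from the examples above rather than a documented guarantee.

```python
from pathlib import PurePosixPath


def checkpoint_for(config_file):
    """Guess the checkpoint path from a config path (assumed layout)."""
    stem = PurePosixPath(config_file).stem  # e.g. "psenet_r50_ic15_736"
    return str(PurePosixPath("checkpoints") / stem / "checkpoint.pth.tar")


def build_test_command(config_file, report_speed=False):
    """Build the argument list for test.py, optionally with --report_speed."""
    cmd = ["python", "test.py", config_file, checkpoint_for(config_file)]
    if report_speed:
        cmd.append("--report_speed")
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(" ".join(build_test_command("config/psenet/psenet_r50_ic15_736.py", report_speed=True)))
```

The returned list can be passed to `subprocess.run` from the repository root if the checkpoint exists at the guessed location.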

Evaluation

Introduction

Evaluation scripts are provided for the ICDAR 2015 (IC15), Total-Text (TT), and CTW1500 (CTW) datasets.

ICDAR 2015

Text detection

./eval_ic15.sh

Total-Text

Text detection

./eval_tt.sh

CTW1500

Text detection

./eval_ctw.sh

Benchmark

Results

ICDAR 2015

| Method | Backbone | Fine-tuning | Scale | Config | Precision (%) | Recall (%) | F-measure (%) | Model |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| PSENet | ResNet50 | N | Shorter Side: 736 | psenet_r50_ic15_736.py | 83.6 | 74.0 | 78.5 | Releases |
| PSENet | ResNet50 | N | Shorter Side: 1024 | psenet_r50_ic15_1024.py | 84.4 | 76.3 | 80.2 | Releases |
| PSENet (paper) | ResNet50 | N | Longer Side: 2240 | - | 81.5 | 79.7 | 80.6 | - |
| PSENet | ResNet50 | Y | Shorter Side: 736 | psenet_r50_ic15_736_finetune.py | 85.3 | 76.8 | 80.9 | Releases |
| PSENet | ResNet50 | Y | Shorter Side: 1024 | psenet_r50_ic15_1024_finetune.py | 86.2 | 79.4 | 82.7 | Releases |
| PSENet (paper) | ResNet50 | Y | Longer Side: 2240 | - | 86.9 | 84.5 | 85.7 | - |

CTW1500

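| Method | Backbone | Fine-tuning | Config | Precision (%) | Recall (%) | F-measure (%) | Model |
| --- | --- | --- | --- | --- | --- | --- | --- |
| PSENet | ResNet50 | N | psenet_r50_ctw.py | 82.6 | 76.4 | 79.4 | Releases |
| PSENet (paper) | ResNet50 | N | - | 80.6 | 75.6 | 78 | - |
| PSENet | ResNet50 | Y | psenet_r50_ctw_finetune.py | 84.5 | 79.2 | 81.8 | Releases |
| PSENet (paper) | ResNet50 | Y | - | 84.8 | 79.7 | 82.2 | - |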

Total-Text

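| Method | Backbone | Fine-tuning | Config | Precision (%) | Recall (%) | F-measure (%) | Model |
| --- | --- | --- | --- | --- | --- | --- | --- |
| PSENet | ResNet50 | N | psenet_r50_tt.py | 87.3 | 77.9 | 82.3 | Releases |
| PSENet (paper) | ResNet50 | N | - | 81.8 | 75.1 | 78.3 | - |
| PSENet | ResNet50 | Y | psenet_r50_tt_finetune.py | 89.3 | 79.6 | 84.2 | Releases |
| PSENet (paper) | ResNet50 | Y | - | 84.0 | 78.0 | 80.9 | - |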
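
The F-measure values reported for these benchmarks are consistent with the usual harmonic mean of precision and recall, F = 2PR / (P + R). The snippet below is a small sketch for sanity-checking a row; the function name `f_measure` is hypothetical, and it is not the repository's evaluation code. Because the reported precision and recall are already rounded to one decimal, a recomputed value can differ from the reported one by about 0.1.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (both in percent)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when nothing is detected
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # (precision, recall, reported F-measure) taken from the results above.
    rows = [
        (83.6, 74.0, 78.5),  # ICDAR 2015, shorter side 736, no fine-tuning
        (82.6, 76.4, 79.4),  # CTW1500, no fine-tuning
        (87.3, 77.9, 82.3),  # Total-Text, no fine-tuning
    ]
    for p, r, reported in rows:
        print(f"P={p} R={r} computed={f_measure(p, r):.2f} reported={reported}")
```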

Citation

@inproceedings{wang2019shape,
  title={Shape robust text detection with progressive scale expansion network},
  author={Wang, Wenhai and Xie, Enze and Li, Xiang and Hou, Wenbo and Lu, Tong and Yu, Gang and Shao, Shuai},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  pages={9336--9345},
  year={2019}
}

License

This project is developed and maintained by IMAGINE Lab@National Key Laboratory for Novel Software Technology, Nanjing University.

This project is released under the Apache 2.0 license.