README.md

April 6, 2022

Revisiting Document Image Dewarping by Grid Regularization

This repository contains the source code for our paper:

Revisiting Document Image Dewarping by Grid Regularization

CVPR 2022

Required Data

To train or evaluate our model, download the required data and arrange it in the following layout:

├── data
    ├── crop
├── result
    ├── grid
    ├── tfi
    ├── tps
    ├── text_line
    ├── vertical_line
├── datasets
    ├── doc3d
        ├── img
        ├── bm
        ├── uv
        ├── data.txt
    ├── dtd
        ├── images
    ├── textline
        ├── publaynet
            ├── train
            ├── mask        
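
Before training or evaluating, it can help to confirm that this layout is in place. The following is a minimal sketch and not part of the repository: the helper name `missing_paths` and the `EXPECTED` list are illustrative assumptions, with the paths copied from the tree above.

```python
from pathlib import Path

# Paths copied from the directory tree above. The list name and the helper
# below are illustrative and not part of the repository.
EXPECTED = [
    "data/crop",
    "result/grid",
    "result/tfi",
    "result/tps",
    "result/text_line",
    "result/vertical_line",
    "datasets/doc3d/img",
    "datasets/doc3d/bm",
    "datasets/doc3d/uv",
    "datasets/doc3d/data.txt",
    "datasets/dtd/images",
    "datasets/textline/publaynet/train",
    "datasets/textline/publaynet/mask",
]


def missing_paths(root="."):
    """Return the expected paths that do not exist under `root`."""
    base = Path(root)
    return [p for p in EXPECTED if not (base / p).exists()]
```

Calling `missing_paths()` from the repository root returns an empty list when every expected file and directory is present.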

Inference

Download the pretrained models from OneDrive and put them in pkl/. You can then produce a result with predict.py:

python predict.py --crop data/crop --method grid --docunet pkl/docunet.pth --unet pkl/unet.pth
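
To run this command from a script, one option is to check that the weight files are present and then call predict.py through `subprocess`. This is a sketch under assumptions: the helper names `build_predict_command` and `run_predict` are hypothetical, and the flags and default paths are exactly those in the command above. No other options of predict.py are assumed.

```python
import subprocess
import sys
from pathlib import Path


def build_predict_command(crop="data/crop", method="grid",
                          docunet="pkl/docunet.pth", unet="pkl/unet.pth"):
    """Build the argument list for the predict.py call shown above."""
    return [
        sys.executable, "predict.py",
        "--crop", crop,
        "--method", method,
        "--docunet", docunet,
        "--unet", unet,
    ]


def run_predict(**kwargs):
    """Check that both weight files exist, then run predict.py."""
    cmd = build_predict_command(**kwargs)
    # The values following --docunet and --unet are the weight file paths.
    for flag in ("--docunet", "--unet"):
        weights = Path(cmd[cmd.index(flag) + 1])
        if not weights.is_file():
            raise FileNotFoundError(f"Pretrained model not found: {weights}")
    # check=True raises CalledProcessError if predict.py exits with an error.
    subprocess.run(cmd, check=True)
```

Calling `run_predict()` from the repository root reproduces the command above and fails early with a clear message if a model file has not been downloaded yet.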

Evaluation

  • For MS-SSIM (multi-scale SSIM) and LD (Local Distortion), we use the same evaluation code as the DocUNet Benchmark dataset, run on Matlab 2018b (details in test.m).

  • For CER (Character Error Rate) and ED (Edit Distance), we use the same evaluation code as DewarpNet.

    cd result; python test.py
    
  • For OCR, we use Tesseract (v4.0.0-beta.1) in its default configuration, called through PyTesseract (v0.3.8).
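
For reference, ED is commonly computed as the Levenshtein distance between the OCR output and the reference text, and CER as that distance divided by the number of characters in the reference. The sketch below illustrates these definitions only. It is not the code in test.py, and the function names are hypothetical.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance: the minimum number of single-character
    substitutions, insertions, and deletions needed to turn
    `reference` into `hypothesis`."""
    # prev[j] holds the distance between the processed prefix of
    # `reference` and the first j characters of `hypothesis`.
    prev = list(range(len(hypothesis) + 1))
    for i, r in enumerate(reference, start=1):
        curr = [i]
        for j, h in enumerate(hypothesis, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_error_rate(reference, hypothesis):
    """Edit distance normalized by the length of the reference text."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)
```

For example, `edit_distance("kitten", "sitting")` is 3. In an OCR evaluation, the hypothesis string would be the text recognized from the dewarped image, for instance via `pytesseract.image_to_string`.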

Training

  • Train the DocUNet network to regress the boundary points of the document: python train_b.py
  • Train the UNet network to segment the text lines in the document: python train_t.py
  • The final result depends on how accurately these geometric elements are detected.