June 1, 2026

D2Dewarp

The code for "D2Dewarp: Dual Dimensions Geometric Representation Learning Based Document Image Dewarping", CVPR 2026.

Training Dataset

We introduce a training set of distorted document images annotated with horizontal and vertical lines. More details and the download link are available at DocDewarpHV.

Evaluation Dataset

We evaluate on three datasets: DocUNet (130 images), DIR300 (300 images), and DocReal (200 images).
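Before running inference, it can help to confirm that each downloaded benchmark contains the number of images listed above. The following is a minimal sketch, not part of this repository: the layout `root/<benchmark name>/<images>` and the set of image extensions are assumptions, and `check_benchmarks` is a hypothetical helper name.

```python
from pathlib import Path

# Image counts as stated in this README.
EXPECTED_COUNTS = {"DocUNet": 130, "DIR300": 300, "DocReal": 200}

# Assumed image extensions; adjust to match the actual benchmark files.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def count_images(folder):
    """Count files directly inside `folder` that have an image extension."""
    return sum(
        1
        for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def check_benchmarks(root):
    """Return {name: (found, expected)} for each benchmark under `root`.

    Assumes a hypothetical layout root/<benchmark name>/<images>.
    A missing folder is reported as 0 images found.
    """
    report = {}
    for name, expected in EXPECTED_COUNTS.items():
        folder = Path(root) / name
        found = count_images(folder) if folder.is_dir() else 0
        report[name] = (found, expected)
    return report
```

Calling `check_benchmarks("/BENCHMARK/ROOT")` returns one `(found, expected)` pair per dataset, so a mismatch points to an incomplete download or a different folder layout.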

Training

Modify the data_path and related hyperparameters, and then execute the Python file:

python train.py --data_path /DATA/PATH --save_path /OUTPUT/DIR

Alternatively, run the script: bash train.sh /OUTPUT/DIR

Inference

Please download the pre-trained model from Google Drive or Baidu Cloud. Then execute:

python predict.py --model_path /MODEL/PATH --img_path /BENCHMARK/DIR --save_path /SAVE/PATH
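To run the command above once per benchmark, a small wrapper can build the calls. This is a sketch under assumptions: it uses only the three flags shown above, the per-benchmark subfolder layout is hypothetical, and `build_predict_cmd` and `run_benchmarks` are illustrative names that do not exist in this repository. Whether `predict.py` creates the output folder itself is not stated here, so you may need to create it first.

```python
import subprocess
import sys
from pathlib import Path


def build_predict_cmd(model_path, img_path, save_path):
    """Build the predict.py command using the three flags from this README."""
    return [
        sys.executable, "predict.py",
        "--model_path", str(model_path),
        "--img_path", str(img_path),
        "--save_path", str(save_path),
    ]


def run_benchmarks(model_path, benchmark_root, output_root, names, dry_run=True):
    """Build one predict.py call per benchmark folder and return the commands.

    With dry_run=True (the default) nothing is executed; set dry_run=False
    to actually run each command from the repository root.
    """
    commands = []
    for name in names:
        cmd = build_predict_cmd(
            model_path,
            Path(benchmark_root) / name,  # assumed layout: one folder per benchmark
            Path(output_root) / name,
        )
        commands.append(cmd)
        if not dry_run:
            subprocess.run(cmd, check=True)  # raises if predict.py exits non-zero
    return commands
```

For example, `run_benchmarks("/MODEL/PATH", "/BENCHMARK/ROOT", "/SAVE/ROOT", ["DocUNet", "DIR300", "DocReal"])` returns the three commands without executing them, which makes it easy to inspect the paths first.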

Evaluation

We follow the evaluation environment and code in DocUNet and DocGeoNet.

The CER and ED metrics are computed with:

Tesseract==5.0.1.20220118 (Windows)
pytesseract==0.3.8

For DocReal (Chinese benchmark), we used PaddleOCR's text detection model DBNet (“detv4_teacher_inference”) and text recognition model SVTR_LCNet (“OCRv4_rec_server_infer”).
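For readers unfamiliar with the two OCR metrics, the sketch below shows how they are commonly defined. It assumes ED is the Levenshtein distance between the text recognized from the dewarped image and the reference text, and that CER is ED divided by the number of reference characters. It is an illustration only; the DocUNet and DocGeoNet evaluation code referenced above remains authoritative for the reported numbers.

```python
def edit_distance(pred, ref):
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    # prev[j] is the distance between the pred prefix processed so far and ref[:j].
    prev = list(range(len(ref) + 1))
    for i, pred_char in enumerate(pred, start=1):
        curr = [i]  # turning pred[:i] into an empty string needs i deletions
        for j, ref_char in enumerate(ref, start=1):
            cost = 0 if pred_char == ref_char else 1
            curr.append(min(
                prev[j] + 1,         # delete pred_char
                curr[j - 1] + 1,     # insert ref_char
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


def character_error_rate(pred, ref):
    """CER, assumed here to be edit distance divided by reference length."""
    if not ref:
        raise ValueError("reference text must not be empty")
    return edit_distance(pred, ref) / len(ref)
```

In practice `pred` would be the OCR output for a dewarped image (for example from `pytesseract.image_to_string`) and `ref` the OCR output for the corresponding flat scan.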

The dewarped images from our method can be downloaded from Google Drive or Baidu Cloud.

License

This work is released under the CC BY-NC-ND 4.0 License for non-commercial research purposes.

Contact

If you have any questions about this work, you can always contact hengli.lh@outlook.com.

Thanks to Doc3D for open-sourcing the code. We also thank cddod, CDLA, M6Doc, and PubLayNet for their outstanding work in open-sourcing the original document images. Thanks to DocTr, DocScanner, RDGR, DocRes, and others for their excellent work and open-source code.

Citation

If our method and code are helpful to you, please cite our work using the following BibTeX entries:

@inproceedings{li2026d2dewarp,
  title={D2Dewarp: Dual Dimensions Geometric Representation Learning Based Document Image Dewarping},
  author={Li, Heng and Wu, Xiangping and Chen, Qingcai},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={34734--34744},
  year={2026}
}

@article{li2025dual,
  title={Dual Dimensions Geometric Representation Learning Based Document Dewarping},
  author={Li, Heng and Chen, Qingcai and Wu, Xiangping},
  journal={arXiv preprint arXiv:2507.08492},
  year={2025}
}