README.md

March 10, 2026

DvD: Unleashing a Generative Paradigm for Document Dewarping via Coordinates-based Diffusion Model (SIGGRAPH Asia 2025)

We present DvD, the first diffusion model for document dewarping. Unlike existing paradigms, DvD produces a precise and faithful dewarped document through a novel mapping generation paradigm: it performs coordinate-level denoising to generate coordinate mappings. We further introduce a large-scale, fine-grained benchmark, AnyPhotoDoc6300, which enables more comprehensive evaluation.

https://github.com/user-attachments/assets/a2c5bca3-1393-410f-b870-8e69142bc635

🍉 Online Demo

You can try DvD quickly through our Online Demo, deployed on HuggingFace.

❤ Quick Start

Before running the script, install the following dependencies:

pip install -r requirements.txt

How to play

To run the DvD model as shown above:

Download link for the pretrained DvD model

https://drive.google.com/drive/folders/1RBt9t_5igAlN1BlQAkVLwJ_rZXITy_pN?usp=sharing

All weight files should be stored in the folder "./checkpoints", following the tree structure below:

checkpoints
|-- model1852000.pt
|-- line_model2.pth
|-- seg.pth
|-- seg_model.pth
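
Before running inference, it can help to confirm that the weights landed where the scripts expect them. The following is a minimal sketch, not part of the DvD codebase: the helper name `missing_checkpoints` is hypothetical, and the file names are simply copied from the tree above.

```python
from pathlib import Path

# File names copied from the tree above; the helper itself is hypothetical
# and is not provided by the DvD repository.
EXPECTED_CHECKPOINTS = (
    "model1852000.pt",
    "line_model2.pth",
    "seg.pth",
    "seg_model.pth",
)


def missing_checkpoints(checkpoint_dir="./checkpoints", expected=EXPECTED_CHECKPOINTS):
    """Return the expected weight files that are absent from checkpoint_dir."""
    root = Path(checkpoint_dir)
    return [name for name in expected if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_checkpoints()
    if missing:
        print("Missing weight files:", ", ".join(missing))
    else:
        print("All expected weight files found.")
```

Running this from the repository root lists any weight file that still needs to be downloaded.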

Inference code

python run_sampling.py \
  --train_module 'dvd' \
  --train_name 'val_TDiff' --name "save_filename"

Training code

mpiexec -n 1 python run_training.py \
  --train_module 'dvd' \
  --train_name 'train_TDiff'

A training sample format is shown below for reference. You need to organize your data into this format for training.

|-- 000000_1
| |-- img.png
| |-- wc.exr
| |-- uv.exr
| |-- recon.png
| |-- alb.png
| |-- dmap.exr
| |-- norm.exr
| `-- bm.mat
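
Since every sample folder must contain the same set of files, a quick scan of the dataset can catch incomplete samples before training starts. This is a rough sketch under assumptions: the function name `incomplete_samples` and the idea of a single dataset root holding all sample folders are hypothetical; only the file names come from the layout above.

```python
from pathlib import Path

# File names copied from the sample layout above.
SAMPLE_FILES = (
    "img.png",
    "wc.exr",
    "uv.exr",
    "recon.png",
    "alb.png",
    "dmap.exr",
    "norm.exr",
    "bm.mat",
)


def incomplete_samples(dataset_root, required=SAMPLE_FILES):
    """Map each sample folder name to the required files it lacks.

    Assumes (hypothetically) that dataset_root directly contains one
    sub-folder per sample, such as 000000_1.
    """
    report = {}
    for sample in sorted(p for p in Path(dataset_root).iterdir() if p.is_dir()):
        missing = [name for name in required if not (sample / name).is_file()]
        if missing:
            report[sample.name] = missing
    return report
```

An empty dictionary means every sample folder under the given root has all eight files; otherwise the keys name the folders that need attention.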

📝 Download link for inference results on the DocUNet and DIR300 benchmarks

Google Drive

https://drive.google.com/drive/folders/1WNMcXx4OApIy789F-iO3X2dJmmDUK_QI?usp=drive_link

📝 Download link for the AnyPhotoDoc6300 benchmark dataset

HuggingFace

https://drive.google.com/drive/folders/1NJZtaTh4erKFcXDcs1t1JyF2hc1EBOA3?usp=drive_link

📄 Usage Guide: detailed description of the benchmark

🚀 Citation

If you use this model in your work, please cite the following paper:

@inproceedings{zhang2025dvd,
author = {Weiguang Zhang and Huangcheng Lu and Maizhen Ning and Xiaowei Huang and Wei Wang and Kaizhu Huang and Qiufeng Wang},
title = {DvD: Unleashing a Generative Paradigm for Document Dewarping via Coordinates-based Diffusion Model},
year = {2025},
publisher = {Association for Computing Machinery},
doi = {10.1145/3757377.3763913},
booktitle = {SIGGRAPH Asia 2025 Conference Papers},
series = {SA '25}
}

❤❤ Acknowledgements

We sincerely thank the following projects, on which our code is largely based: inv3d, Doc3D, FTA, DocTr, DewarpNet, and DiT.