
June 1, 2026


📖 Recommendations of Document Image Processing

This repository collects papers on document image processing methods, including appearance enhancement, deshadowing, dewarping, deblurring, and binarization.

🔥 Contents

1. Registration

Document registration (also known as document alignment) aims to densely map two document images with the same content (such as a scanned and photographed version of the same document). It has important applications in automated data annotation and template-based dewarping tasks.
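As a rough illustration of what a dense mapping means here, the sketch below resamples one image through a per-pixel backward map using nearest-neighbour lookup. The function name `warp_with_backward_map`, the `(row, col)` map layout, and the toy data are assumptions made for this example; they are not taken from any of the papers listed below, which may use different map conventions and interpolation schemes.

```python
def warp_with_backward_map(image, backward_map, fill=0):
    """Resample `image` through a dense backward map.

    backward_map[y][x] holds the (row, col) position in `image` that
    should be copied to output pixel (y, x). Positions that fall outside
    the source image are filled with `fill`.
    """
    src_h, src_w = len(image), len(image[0])
    out = []
    for map_row in backward_map:
        out_row = []
        for sy, sx in map_row:
            iy, ix = int(round(sy)), int(round(sx))  # nearest neighbour
            inside = 0 <= iy < src_h and 0 <= ix < src_w
            out_row.append(image[iy][ix] if inside else fill)
        out.append(out_row)
    return out


# Toy example: the "photo" is a horizontally mirrored copy of the "scan",
# so the dense map sends output column x to source column 2 - x.
photo = [[1, 2, 3],
         [4, 5, 6]]
mirror_map = [[(y, 2 - x) for x in range(3)] for y in range(2)]
aligned = warp_with_backward_map(photo, mirror_map)
```

In practice the map would be predicted by a registration model and applied with bilinear interpolation; the nearest-neighbour lookup is used here only to keep the sketch dependency-free.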

1.1 Papers

| Year | Venue | Title | Repo |
| --- | --- | --- | --- |
| 2023 | IJDAR | Inv3D: a high-resolution 3D invoice dataset for template-guided single-image document unwarping | Code |
| 2023 | Arxiv | DocAligner: Annotating real-world photographic document images by simply taking pictures | Code |
| 2024 | ACM MM | Document Registration: Towards Automated Labeling of Pixel-Level Alignment Between Warped-Flat Documents | |
| 2024 | ICDAR | Coarse-to-Fine Document Image Registration for Dewarping | Code |

1.2 Datasets

| Dataset | Num. (train/test) | Type | Example | Download |
| --- | --- | --- | --- | --- |
| DocAlign12K | 12K (10K/2K) | Synth | Example | Link |

1.3 SOTA

| Venue | Method | DocUNet (130) MS-SSIM↑ | DocUNet (130) AD↓ |
| --- | --- | --- | --- |
| Arxiv'23 | DocAligner | 0.8232 | 0.0445 |

2. All in One

This line of research aims to tackle multiple document enhancement tasks simultaneously using a single unified model.

2.1 Papers

| Year | Venue | Title | Repo |
| --- | --- | --- | --- |
| 2024 | CVPR | DocRes: A Generalist Model Toward Unifying Document Image Restoration Tasks | Code |
| 2025 | ACM MM | Uni-DocDiff: A Unified Document Restoration Model Based on Diffusion | |
| 2026 | CVPR | MMDIR: Multimodal Instruction-Driven Framework for Mixed-Degradation Document Image Restoration | Code |

2.2 Datasets

| Dataset | Num. (train/test) | Type | Example | Download |
| --- | --- | --- | --- | --- |
| MixedDoc | 1837 (0/1837) | Synth | | Link |

3. Appearance Enhancement

Appearance enhancement (also known as illumination correction) is not limited to a specific degradation type. It aims to restore a clean appearance similar to that of a scanned document or a born-digital PDF file.

3.1 Papers

| Year | Venue | Title | Repo |
| --- | --- | --- | --- |
| 2019 | ACM TOG | Document Rectification and Illumination Correction using a Patch-based CNN | Code |
| 2019 | ICCV | DewarpNet: Single-Image Document Unwarping With Stacked 3D and 2D Regression Networks | Code |
| 2020 | BMVC | Intrinsic Decomposition of Document Images In-the-wild | Code |
| 2021 | ACM MM | DocTr: Document Image Transformer for Geometric Unwarping and Illumination Correction | Code |
| 2022 | CVPR | Fourier Document Restoration for Robust Document Dewarping and Recognition | |
| 2022 | ACM MM | UDoc-GAN: Unpaired Document Illumination Correction with Background Light Prior | Code |
| 2023 | TAI | Appearance Enhancement for Camera-captured Document Images in the Wild | Code |
| 2023 | ICCVW | Template-guided Illumination Correction for Document Images with Imperfect Geometric Reconstruction | Code |
| 2023 | Arxiv | DocStormer: Revitalizing Multi-Degraded Colored Document Images to Pristine PDF Versions | |
| 2024 | ICASSP | Efficient Joint Rectification of Photometric and Geometric Distortions in Document Images | |
| 2024 | CVPR | DocRes: A Generalist Model Toward Unifying Document Image Restoration Tasks | Code |
| 2025 | Arxiv | GL-PGENet: A Parameterized Generation Framework for Robust Document Image Enhancement | |

3.2 Datasets

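| Dataset | Num. (train/test) | Type | Example | Download |
| --- | --- | --- | --- | --- |
| Doc3DShade | 90K | Synth | Example | Link |
| DocProj | 2450 | Synth | Example | Link |
| DocUNet from DocAligner | 130 | Real | Example | Link |
| RealDAE | 600 (450/150) | Real | Example | Link |
| Inv3D | 25K | Synth | Example | Link |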

3.3 SOTA

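| Venue | Methods | Training data | DocUNet from DocAligner (130) SSIM | DocUNet from DocAligner (130) PSNR | RealDAE (150) SSIM | RealDAE (150) PSNR |
| --- | --- | --- | --- | --- | --- | --- |
| - | - | - | 0.7195 | 13.09 | 0.8264 | 12.26 |
| TOG'19 | DocProj | DocProj | 0.7098 | 14.71 | 0.8684 | 19.35 |
| BMVC'20 | Das et al. | Doc3DShade | 0.7276 | 16.42 | 0.8633 | 19.87 |
| MM'21 | DocTr | DocProj | 0.7067 | 15.78 | 0.7925 | 18.62 |
| MM'22 | UDoc-GAN | DocProj | 0.6833 | 14.29 | 0.7558 | 16.43 |
| TAI'23 | GCDRNet | RealDAE | 0.7658 | 17.09 | 0.9423 | 24.42 |
| CVPR'24 | DocRes | | 0.7598 | 17.60 | 0.9219 | 24.65 |
| ACM MM'25 | Uni-DocDiff | | 0.7682 | 18.22 | 0.9485 | 24.97 |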
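For readers unfamiliar with the PSNR column, the following is a minimal sketch of the standard peak signal-to-noise ratio definition on single-channel images. It assumes 8-bit intensities (peak value 255) and images given as nested lists. The exact evaluation protocol behind the numbers above (colour handling, value range, resizing) is not stated in this README, so this sketch should not be expected to reproduce them.

```python
import math


def psnr(pred, target, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two same-sized images.

    Both images are 2D nested lists of pixel intensities. `max_value`
    is the assumed peak intensity (255 for 8-bit images).
    """
    flat_pred = [p for row in pred for p in row]
    flat_target = [t for row in target for t in row]
    if len(flat_pred) != len(flat_target):
        raise ValueError("images must have the same number of pixels")
    # Mean squared error over all pixels.
    mse = sum((p - t) ** 2 for p, t in zip(flat_pred, flat_target)) / len(flat_pred)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy example: every pixel of the prediction is off by 10 grey levels.
enhanced = [[100, 110], [120, 130]]
reference = [[110, 120], [130, 140]]
score = psnr(enhanced, reference)
```

Higher PSNR means the enhanced image is closer to the reference in a pixel-wise sense. SSIM is typically reported alongside it because it compares local structure rather than raw pixel differences.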

4. Deshadow

Deshadowing aims to eliminate shadows that are mainly caused by occlusion to obtain shadow-free document images.

4.1 Papers

| Year | Venue | Title | Repo |
| --- | --- | --- | --- |
| 2018 | CVPR | Document Enhancement Using Visibility Detection | Code |
| 2020 | CVPR | BEDSR-Net: A Deep Shadow Removal Network from a Single Document Image | Code* |
| 2022 | ICPR | Document Shadow Removal with Foreground Detection Learning From Fully Synth Images | Code |
| 2022 | MERCon | Shadow Removal for Documents with Reflective Textured Surface | |
| 2023 | ICASSP | ShaDocNet: Learning Spatial-Aware Tokens in Transformer for Document Shadow Removal | Code |
| 2023 | ICASSP | Shadow Removal of Text Document Images Using Background Estimation and Adaptive Text Enhancement | |
| 2023 | ICASSP | LP-IOANet: Efficient High Resolution Document Shadow Removal | |
| 2023 | Optical Review | Shadow removal from document image based on background estimation employing selective median filter and black-top-hat transform | |
| 2023 | CVPR | Document Image Shadow Removal Guided by Color-Aware Background | Code |
| 2023 | Arxiv | ShaDocFormer: A Shadow-attentive Threshold Detector with Cascaded Fusion Refiner for document shadow removal | |
| 2023 | ICCV | High-Resolution Document Shadow Removal via A Large-Scale Real-World Dataset and A Frequency-Aware Shadow Erasing Net | Code |
| 2023 | Sensors | Synthetic Document Images with Diverse Shadows for Deep Shadow Removal Networks | Code |
| 2024 | AAAI | DocNLC: A Document Image Enhancement Framework with Normalized and Latent Contrastive Representation for Multiple Degradations | Code |
| 2024 | CVPR | DocRes: A Generalist Model Toward Unifying Document Image Restoration Tasks | Code |
| 2024 | IJDAR | Am I readable? Transfer learning based document image rectification | |
| 2025 | ACM MM | Uni-DocDiff: A Unified Document Restoration Model Based on Diffusion | |

* indicates that the implementation is unofficial.

4.2 Datasets

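| Dataset | Num. (train/test) | Type | Example | Download |
| --- | --- | --- | --- | --- |
| RDD | 4916 (4371/545) | Real | Example | Link |
| Kligler et al. | 300 | Real | Example | Link |
| FSDSRD | 14200 | Synth | Example | Link |
| Jung et al. | 87 | Real | Example | Link |
| OSR | 237 | Real | Example | Link |
| WEZUT OCR | 176 | Real | Example | Link |
| SD7K | 7620 (6479/760) | Real | Example | Link |
| SynDocDS | 50K (40K/5K) | Synth | | Link |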

4.3 SOTA

Venue Method Training data Kligler et al. (300) Jung et al. (87) OSR (237) RDD (545) SD7K (760)
RMSE↓ PSNR↑ SSIM↑ RMSE↓ PSNR↑ SSIM↑ RMSE↓ PSNR↑ SSIM↑ RMSE↓ PSNR↑ SSIM↑ RMSE↓ PSNR↑ SSIM↑
CVPR'23 BGShadowNet RDD 5.377 29.17 0.948 2.219 37.58 0.983
ICCV'23 FSENet SD7K 10.60 28.98 0.93 17.56 23.60 0.85 10.00 28.67 0.96
CVPR'24 DocRes 27.14 0.900 23.02 0.908 21.64 0.937
ACM MM'25 Uni-DocDiff 28.56 0.9382 23.93 0.9156 21.48 0.9532

5. Dewarping

Dewarping, also referred to as geometric rectification, aims to rectify document images that suffer from curves, folds, crumples, perspective/affine deformation and other geometric distortions.

5.1 Papers

| Year | Venue | Title | Repo |
| --- | --- | --- | --- |
| 2018 | CVPR | DocUNet: Document Image Unwarping via A Stacked U-Net | |
| 2019 | TOG | Document Rectification and Illumination Correction using a Patch-based CNN | Code |
| 2019 | ICCV | DewarpNet: Single-Image Document Unwarping With Stacked 3D and 2D Regression Networks | Code |
| 2020 | PR | Geometric Rectification of Document Images using Adversarial Gated Unwarping Network | |
| 2020 | ECCV | Can You Read Me Now? Content Aware Rectification using Angle Supervision | |
| 2020 | DAS | Dewarping Document Image by Displacement Flow Estimation with Fully Convolutional Network | Code |
| 2021 | ACM MM | DocTr: Document Image Transformer for Geometric Unwarping and Illumination Correction | Code |
| 2021 | ICCV | End-to-end Piece-wise Unwarping of Document Images | Code |
| 2021 | ICDAR | Document Dewarping with Control Points | Code |
| 2022 | CVPR | Fourier Document Restoration for Robust Document Dewarping and Recognition | |
| 2022 | CVPR | Revisiting Document Image Dewarping by Grid Regularization | Code |
| 2022 | ACM MM | Marior: Margin Removal and Iterative Content Rectification for Document Dewarping in the Wild | |
| 2022 | SIGGRAPH | Learning From Documents in the Wild to Improve Document Unwarping | Code |
| 2022 | ECCV | Geometric Representation Learning for Document Image Rectification | Code |
| 2022 | ECCV | Learning an Isometric Surface Parameterization for Texture Unwrapping | Code |
| 2022 | Arxiv | DocScanner: Robust Document Image Rectification with Progressive Learning | Code |
| 2022 | ICPR | Document Image Rectification in Complex Scene Using Stacked Siamese Networks | |
| 2023 | Arxiv | Geometric Rectification of Creased Document Images based on Isometric Mapping | |
| 2023 | IJDAR | Adaptive Dewarping of Severely Warped Camera-captured Document Images Based on Document Map Generation | |
| 2023 | TMM | Deep Unrestricted Document Image Rectification | Code |
| 2023 | Arxiv | Neural Document Unwarping using Coupled Grids | |
| 2023 | IJDAR | Inv3D: A High-resolution 3D Invoice Dataset for Template-guided Single-image Document Unwarping | Code |
| 2023 | Arxiv | MataDoc: Margin and Text Aware Document Dewarping for Arbitrary Boundary | |
| 2023 | ICCVW | Template-guided Illumination Correction for Document Images with Imperfect Geometric Reconstruction | Code |
| 2023 | ICCV | Foreground and Text-lines Aware Document Image Rectification | Code |
| 2023 | ACM TOG | Layout-Aware Single-Image Document Flattening | Code |
| 2023 | WACV | DocReal: Robust Document Dewarping of Real-Life Images via Attention-Enhanced Control Point Prediction | Code |
| 2023 | TCSVT | Rethinking Supervision in Document Unwarping: A Self-consistent Flow-free Approach | |
| 2023 | SIGGRAPH Asia | UVDoc: Neural Grid-based Document Unwarping | Code |
| 2023 | Arxiv | Polar-Doc: One-Stage Document Dewarping with Multi-Scope Constraints under Polar Representation | |
| 2024 | ICASSP | Efficient Joint Rectification of Photometric and Geometric Distortions in Document Images | |
| 2024 | ICDAR | Coarse-to-Fine Document Image Registration for Dewarping | Code |
| 2024 | CVPR | DocRes: A Generalist Model Toward Unifying Document Image Restoration Tasks | Code |
| 2024 | IJDAR | Am I readable? Transfer learning based document image rectification | |
| 2024 | ACM MM | Document Registration: Towards Automated Labeling of Pixel-Level Alignment Between Warped-Flat Documents | |
| 2025 | ICMM | DocMamba: Robust Document Image Dewarping via Selective State Space Sequence Modeling | |
| 2025 | ICCASP | Vision Mamba-Based Approach for Incomplete Boundary Document Image Rectification | |
| 2025 | Arxiv | Document Image Rectification Bases on Self-Adaptive Multitask Fusion | |
| 2025 | SIGGRAPH Asia | DvD: Unleashing a Generative Paradigm for Document Dewarping via Coordinates-based Diffusion Model | |
| 2025 | CVPRW | Document Image Rectification using Stable Diffusion Transformer | |
| 2025 | ICCV | ForCenNet: Foreground-Centric Network for Document Image Rectification | Code |
| 2025 | ACM MM | Uni-DocDiff: A Unified Document Restoration Model Based on Diffusion | |
| 2025 | Arxiv | TADoc: Robust Time-Aware Document Image Dewarping | |
| 2026 | Arxiv | BookNet: Book Image Rectification via Cross-Page Attention Network | |
| 2026 | AAAI | Axis-Aligned Document Dewarping | Code |
| 2026 | PR | AFH-Net: An adaptive feature harmonization network for document image De-warping | |
| 2026 | TMM | Cascaded Robust Rectification for Arbitrary Document Images | Code |
| 2026 | CVPR | D2Dewarp: Dual Dimensions Geometric Representation Learning Based Document Image Dewarping | Code |
| 2026 | Arxiv | TextFlow: Textline-guided Generic Document Image Unwarping | Code |

5.2 Datasets

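| Dataset | Num. | Type | Example | Download/Codes |
| --- | --- | --- | --- | --- |
| DocUNet | 130 | Real | Example | Link |
| Doc3D | 100K | Synth | - | Link |
| DIW | 5K | Real | Example | Link |
| WarpDoc | 1020 | Real | Example | Link |
| DIR300 | 300 | Real | Example | Link |
| Inv3D | 25K | Synth | Example | Link |
| Inv3DReal | 360 | Real | Example | Link |
| DICP | - | Synth | - | Link |
| DIF | - | Synth | - | Link |
| Simulated Paper | 90K | Synth | - | Link |
| DocReal | 200 | Real | Example | Link |
| UVDoc | 20K | Synth | Example | Link |
| WarpDoc-R | 840 | Real | | |
| UDIR | 195 | Real | Example | Link |
| Dataset from DocAligner | 4568 | Real | Example | Link |
| Book3D | 56000 | Synth | | |
| Book100 | 100 | Real | | |
| DocDewarpHV | 110K | Synth | | Link |
| TextFlow1M | 1M | Synth | | Link |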

5.3 SOTA

Venue Method DocUNet (130) DIR300 (300) DocReal (200) UVDoc (50)
MS-SSIM↑ LD↓ AD↓ MS-SSIM↑ LD↓ AD↓ MS-SSIM↑ LD↓ AD↓ MS-SSIM↑ LD↓ AD↓
ICCV'19 DewarpNet 0.474 8.39 0.426 0.492 13.94 0.331 0.589 0.193
DAS'20 FCN-based 0.448 7.84 0.434 0.503 9.75 0.331
ICCV'21 Piece-Wise 0.492 8.64 0.468
ICDAR'21 DDCP 0.473 8.99 0.453 0.552 10.95 0.357 0.46 16.04 0.585 0.290
MM'21 DocTr 0.511 7.76 0.396 0.616 7.21 0.254 0.55 12.66 0.697 0.160
CVPR'22 RDGR 0.497 8.51 0.461 0.610 0.280
MM'22 Marior 0.478 7.27 0.403
ECCV'22 DocGeoNet 0.504 7.71 0.380 0.638 6.40 0.242 0.55 12.22 0.706 0.168
SIGGRAPH'22 PaperEdge 0.473 7.81 0.392 0.583 8.00 0.255 0.52 11.46
Arxiv'22 DocScanner-L 0.518 7.45 0.334
ICCV'23 Li et al. 0.497 8.43 0.376 0.607 7.68 0.244
WACV'23 DocReal 0.50 7.03 0.56 9.83 0.238
TCSVT'23 DRNet 0.51 7.42
TMM'23 DocTr++ 0.51 7.54 0.45 19.88
Arxiv'23 Polar-Doc 0.605 7.17 0.206
Arxiv'23 MataDoc 0.502 7.42 0.315 0.638 5.75 0.178
SIGGRAPH Asia'23 UVDoc 0.544 6.83 0.315 0.785 0.119
ACM TOG'23 LA-DocFlatten 0.526 6.72 0.300 0.651 5.70 0.195
CVPR'24 DocRes 0.626 6.83 0.241
IJDAR'24 DocTLNet 0.51 6.70 0.658 5.75
ICMM'25 DocMamba 0.5292 7.07 0.3381 0.6319 6.57 0.1941
ACM MM'25 Uni-DocDiff 0.6573 5.30 0.203
Arxiv'25 SalmRec 0.51 7.10 0.310 0.67 5.14 0.178 0.59 8.41 0.229
Arxiv'25 TADoc (average) 0.530 7.11 0.334 0.692 4.33 0.170 0.593 9.47 0.246
AAAI'26 Wang et al. 0.543 6.249 0.278 0.702 4.261 0.131
CVPR'26 D2Dewarp 0.50 7.71 0.349 0.65 5.73 0.186 0.58 8.69 0.227
  • Note that the 127th and 128th distorted images in the DocUNet benchmark are rotated by 180 degrees and therefore do not match the ground-truth documents. The performance reported here is based on the corrected data.
  • Note that the UVDoc results reported in this repository are based on the full UVDoc benchmark dataset (as reported on the official GitHub page). The results in the paper used only half of the UVDoc benchmark.
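The DocUNet correction described in the first note can be sketched as follows. This is a minimal illustration only: it assumes images are held in memory as nested lists keyed by their 1-based position in the benchmark, and the helper names `rotate_180` and `fix_rotated_samples` are hypothetical. How the benchmark files are actually named or loaded is not specified here.

```python
def rotate_180(image):
    """Rotate a 2D image (nested lists) by 180 degrees.

    Reversing the row order and then each row is equivalent to a
    180-degree rotation.
    """
    return [row[::-1] for row in image[::-1]]


def fix_rotated_samples(images_by_index, rotated_indices=(127, 128)):
    """Return a new dict in which the listed samples are rotated back.

    `images_by_index` maps an assumed 1-based benchmark index to an image.
    Samples not listed in `rotated_indices` are passed through unchanged.
    """
    return {
        idx: rotate_180(img) if idx in rotated_indices else img
        for idx, img in images_by_index.items()
    }


# Toy example with 2x2 "images".
benchmark = {
    126: [[1, 2], [3, 4]],
    127: [[1, 2], [3, 4]],
}
corrected = fix_rotated_samples(benchmark)
```

Applying the rotation to the distorted inputs before evaluation keeps them consistent with the ground-truth scans, which is the correction the note refers to.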

6. Deblur

6.1 Papers

| Year | Venue | Title | Repo |
| --- | --- | --- | --- |
| 2019 | NIPS | SVDocNet: Spatially Variant U-Net for Blind Document Deblurring | |
| 2019 | MTA | DeepDeblur: text image recovery from blur to sharp | Code |
| 2020 | TPAMI | DE-GAN: A Conditional Generative Adversarial Network for Document Enhancement | Code |
| 2021 | ICCV | End-to-End Unsupervised Document Image Blind Denoising | |
| 2023 | ACM MM | DocDiff: Document Enhancement via Residual Diffusion Models | Code |
| 2024 | AAAI | DocNLC: A Document Image Enhancement Framework with Normalized and Latent Contrastive Representation for Multiple Degradations | Code |
| 2024 | CVPR | DocRes: A Generalist Model Toward Unifying Document Image Restoration Tasks | Code |
| 2024 | Arxiv | NAF-DPM: A Nonlinear Activation-Free Diffusion Probabilistic Model for Document Enhancement | Code |

6.2 Datasets

| Dataset | Num. (train/test) | Type | Example | Download |
| --- | --- | --- | --- | --- |
| TDD (text deblur dataset) | 67.6K (66K/1.6K) | Synth | Example | Link1, Link2 |

6.3 SOTA

Coming Soon ...

7. Binarization

7.1 Papers

| Year | Venue | Title | Repo |
| --- | --- | --- | --- |
| 2019 | PR | DeepOtsu: Document enhancement and binarization using iterative deep learning | Code |
| 2021 | PR | Complex image processing with less data - Document image binarization by integrating multiple pre-trained U-Net modules | Code |
| 2022 | PR | Two-Stage Generative Adversarial Networks for Binarization of Color Document Images | Code |
| 2023 | PR | GDB: Gated Convolutions-based Document Binarization | Code |
| 2023 | ACM MM | DocDiff: Document Enhancement via Residual Diffusion Models | Code |
| 2023 | ICDAR | ColDBin: Cold Diffusion for Document Image Binarization | Code |
| 2023 | IF | A Novel Degraded Document Binarization Model through Vision Transformer Network | |
| 2023 | Arxiv | DocBinFormer: A Two-Level Transformer Network for Effective Document Image Binarization | |
| 2024 | AAAI | DocNLC: A Document Image Enhancement Framework with Normalized and Latent Contrastive Representation for Multiple Degradations | Code |
| 2024 | CVPR | DocRes: A Generalist Model Toward Unifying Document Image Restoration Tasks | Code |
| 2024 | Arxiv | NAF-DPM: A Nonlinear Activation-Free Diffusion Probabilistic Model for Document Enhancement | Code |

7.2 Datasets

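| Dataset | Num. | Type | Example | Download |
| --- | --- | --- | --- | --- |
| DocEng 2019 | 15 | Real | Example | Link |
| DocEng 2020 | 32 | Real | Example | Link |
| DocEng 2021 | 222 | Real | Example | Link |
| DocEng 2022 | 80 | Real | Example | Link |
| DIBCO 2009 | 10 | Real | Example | Link |
| H-DIBCO 2010 | 10 | Real | Example | Link |
| DIBCO 2011 | 16 | Real | Example | Link |
| H-DIBCO 2012 | 14 | Real | Example | Link |
| DIBCO 2013 | 16 | Real | Example | Link |
| H-DIBCO 2014 | 10 | Real | Example | Link |
| H-DIBCO 2016 | 10 | Real | Example | Link |
| DIBCO 2017 | 20 | Real | Example | Link |
| DIBCO 2018 | 10 | Real | Example | Link |
| DIBCO 2019 | 10 | Real | Example | Link |
| Bickley-diary | 7 | Real | Example | Link |
| Synchromedia Multispectral (MSI) | 240 | Real | Example | Link |
| Persian Heritage Image Binarization (PHIBD) | 15 | Real | Example | Link |
| Palm Leaf | 50 | Real | Example | Link |
| NoisyOffice | 216 | Synth | Example | Link |
| LRDE Document Binarization Dataset | 125 | Real | - | Link |
| Shipping label dataset | 1082 | Real | Example | Link |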

7.3 SOTA

Coming Soon ...

8. Handwritten Text Erasure

This task aims to erase handwritten text from document images.

8.1 Papers

| Year | Venue | Title | Repo |
| --- | --- | --- | --- |
| 2022 | PRCV | CHENet: Image to Image Chinese Handwriting Eraser | |
| 2023 | ICDAR | EnsExam: A Dataset for Handwritten Text Erasure on Examination Papers | Code |
| 2024 | IJDAR | Scene handwritten text erasure based on multi-scale feature fusion | |

8.2 Datasets

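| Dataset | Num. (train/test) | Type | Example | Download |
| --- | --- | --- | --- | --- |
| Baidu Netdisk AI Competition: Handwritten Text Erasure | 1281 (1081/200) | Real | Example | Link |
| CH-dataset | 1623 (1423/200) | Real | | |
| EnsExam | 545 (430/115) | Real | Example | Link |
| SignaTR6K | 6257 (5169/558/530) | Real | Example | Link |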

9. APP & Project & Tool

| APP & Project & Tool | Developer | Platform |
| --- | --- | --- |
| CamScanner (扫描全能王) | INTSIG | iOS, Android |
| Quark (夸克扫描王) | Dongyue | iOS, Android, Web |
| WPS Office | KINGSOFT OFFICE | iOS, Android |
| Adobe Acrobat | Adobe | Windows |
| Adobe Scan | Adobe | iOS, Android |
| Lenovo Smart Scanner (联想扫描王) | Lenovo | iOS, Android |

⭐ Star Rising
