
This repository is a curated collection of papers on document image processing methods, including appearance enhancement, deshadowing, dewarping, deblurring, and binarization.
Document registration (also known as document alignment) aims to densely map two document images with the same content (such as a scanned and photographed version of the same document). It has important applications in automated data annotation and template-based dewarping tasks.

| Venue | Method | DocUNet (130) MS-SSIM↑ | DocUNet (130) AD↓ |
|---|---|---|---|
| Arxiv'23 | DocAligner | 0.8232 | 0.0445 |

This line of research aims to tackle multiple document enhancement tasks simultaneously using a single unified model.
| Dataset | Num. (train/test) | Type | Example | Download |
|---|---|---|---|---|
| MixedDoc | 1837 (0/1837) | Synth | | Link |
Appearance enhancement (also known as illumination correction) is not limited to a specific degradation type. It aims to restore a clean appearance similar to that of a scanned document or a born-digital PDF file.

| Venue | Methods | Training data | DocUNet from DocAligner (130) SSIM | DocUNet from DocAligner (130) PSNR | RealDAE (150) SSIM | RealDAE (150) PSNR |
|---|---|---|---|---|---|---|
| - | - | - | 0.7195 | 13.09 | 0.8264 | 12.26 |
| TOG'19 | DocProj | DocProj | 0.7098 | 14.71 | 0.8684 | 19.35 |
| BMVC'20 | Das et al. | Doc3DShade | 0.7276 | 16.42 | 0.8633 | 19.87 |
| MM'21 | DocTr | DocProj | 0.7067 | 15.78 | 0.7925 | 18.62 |
| MM'22 | UDoc-GAN | DocProj | 0.6833 | 14.29 | 0.7558 | 16.43 |
| TAI'23 | GCDRNet | RealDAE | 0.7658 | 17.09 | 0.9423 | 24.42 |
| CVPR'24 | DocRes | | 0.7598 | 17.60 | 0.9219 | 24.65 |
| ACM MM'25 | Uni-DocDiff | | 0.7682 | 18.22 | 0.9485 | 24.97 |

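
For readers unfamiliar with the PSNR columns above, the following is a minimal sketch of the standard peak signal-to-noise ratio computation. It assumes 8-bit images (peak value 255) already loaded as NumPy arrays of identical shape; the function name `psnr` and the `peak` parameter are illustrative choices, and the exact evaluation protocol behind the reported numbers (resizing, color space, and so on) is not specified here and may differ.

```python
import numpy as np


def psnr(reference, restored, peak=255.0):
    """Peak signal-to-noise ratio in dB between two same-sized images.

    Assumption: both inputs use the same intensity range, with `peak`
    as the maximum possible pixel value (255 for 8-bit images).
    """
    reference = np.asarray(reference, dtype=np.float64)
    restored = np.asarray(restored, dtype=np.float64)
    if reference.shape != restored.shape:
        raise ValueError("images must have the same shape")

    # Mean squared error over all pixels (and channels, if present).
    mse = np.mean((reference - restored) ** 2)
    if mse == 0:
        # Identical images: PSNR is unbounded.
        return float("inf")
    return 10.0 * np.log10(peak ** 2 / mse)
```

A higher PSNR means the restored image is closer to the ground truth in a pixel-wise sense. SSIM, by contrast, compares local structure, so the two metrics can rank methods differently.
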
Deshadowing aims to eliminate shadows that are mainly caused by occlusion to obtain shadow-free document images.
* indicates that the implementation is unofficial.
Dewarping, also referred to as geometric rectification, aims to rectify document images that suffer from curves, folds, crumples, perspective/affine deformation and other geometric distortions.

| Venue | Method | DocUNet (130) MS-SSIM↑ | DocUNet (130) LD↓ | DocUNet (130) AD↓ | DIR300 (300) MS-SSIM↑ | DIR300 (300) LD↓ | DIR300 (300) AD↓ | DocReal (200) MS-SSIM↑ | DocReal (200) LD↓ | DocReal (200) AD↓ | UVDoc (50) MS-SSIM↑ | UVDoc (50) LD↓ | UVDoc (50) AD↓ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ICCV'19 | DewarpNet | 0.474 | 8.39 | 0.426 | 0.492 | 13.94 | 0.331 | | | | 0.589 | | 0.193 |
| DAS'20 | FCN-based | 0.448 | 7.84 | 0.434 | 0.503 | 9.75 | 0.331 | | | | | | |
| ICCV'21 | Piece-Wise | 0.492 | 8.64 | 0.468 | | | | | | | | | |
| ICDAR'21 | DDCP | 0.473 | 8.99 | 0.453 | 0.552 | 10.95 | 0.357 | 0.46 | 16.04 | | 0.585 | | 0.290 |
| MM'21 | DocTr | 0.511 | 7.76 | 0.396 | 0.616 | 7.21 | 0.254 | 0.55 | 12.66 | | 0.697 | | 0.160 |
| CVPR'22 | RDGR | 0.497 | 8.51 | 0.461 | | | | | | | 0.610 | | 0.280 |
| MM'22 | Marior | 0.478 | 7.27 | 0.403 | | | | | | | | | |
| ECCV'22 | DocGeoNet | 0.504 | 7.71 | 0.380 | 0.638 | 6.40 | 0.242 | 0.55 | 12.22 | | 0.706 | | 0.168 |
| SIGGRAPH'22 | PaperEdge | 0.473 | 7.81 | 0.392 | 0.583 | 8.00 | 0.255 | 0.52 | 11.46 | | | | |
| Arxiv'22 | DocScanner-L | 0.518 | 7.45 | 0.334 | | | | | | | | | |
| ICCV'23 | Li et al. | 0.497 | 8.43 | 0.376 | 0.607 | 7.68 | 0.244 | | | | | | |
| WACV'23 | DocReal | 0.50 | 7.03 | | | | | 0.56 | 9.83 | 0.238 | | | |
| TCSVT'23 | DRNet | 0.51 | 7.42 | | | | | | | | | | |
| TMM'23 | DocTr++ | 0.51 | 7.54 | | | | | 0.45 | 19.88 | | | | |
| Arxiv'23 | Polar-Doc | | | | 0.605 | 7.17 | 0.206 | | | | | | |
| Arxiv'23 | MetaDoc | 0.502 | 7.42 | 0.315 | 0.638 | 5.75 | 0.178 | | | | | | |
| SIGGRAPH Asia'23 | UVDoc | 0.544 | 6.83 | 0.315 | | | | | | | 0.785 | | 0.119 |
| ACM TOG'23 | LA-DocFlatten | 0.526 | 6.72 | 0.300 | 0.651 | 5.70 | 0.195 | | | | | | |
| CVPR'24 | DocRes | | | | 0.626 | 6.83 | 0.241 | | | | | | |
| IJDAR'24 | DocTLNet | 0.51 | 6.70 | | 0.658 | 5.75 | | | | | | | |
| ICMM'25 | DocMamba | 0.5292 | 7.07 | 0.3381 | 0.6319 | 6.57 | 0.1941 | | | | | | |
| ACM MM'25 | Uni-DocDiff | | | | 0.6573 | 5.30 | 0.203 | | | | | | |
| Arxiv'25 | SalmRec | 0.51 | 7.10 | 0.310 | 0.67 | 5.14 | 0.178 | 0.59 | 8.41 | 0.229 | | | |
| Arxiv'25 | TADoc (average) | 0.530 | 7.11 | 0.334 | 0.692 | 4.33 | 0.170 | 0.593 | 9.47 | 0.246 | | | |
| AAAI'26 | Wang et al. | 0.543 | 6.249 | 0.278 | 0.702 | 4.261 | 0.131 | | | | | | |
| CVPR'26 | D2Dewarp | 0.50 | 7.71 | 0.349 | 0.65 | 5.73 | 0.186 | 0.58 | 8.69 | 0.227 | | | |

- Note that the 127th and 128th distorted images in the DocUNet benchmark are rotated by 180 degrees and therefore do not match the ground-truth documents. The performance reported here is based on the corrected data.
- Note that the UVDoc results reported in this repository are based on the full UVDoc benchmark dataset (as reported on the official GitHub page). The results in the paper used only half of the UVDoc benchmark.
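
The 180-degree correction mentioned in the first note can be sketched as follows. This is a minimal illustration, assuming each distorted image has already been loaded as a NumPy array of shape (H, W) or (H, W, C); the function name `rotate_180` is hypothetical, and how the benchmark files are named or loaded is deliberately left out because it is not stated here.

```python
import numpy as np


def rotate_180(image):
    """Return a copy of `image` rotated by 180 degrees in the image plane.

    Works for grayscale (H, W) and color (H, W, C) arrays; the channel
    axis, if any, is left untouched.
    """
    # Two successive 90-degree turns over the height and width axes.
    return np.rot90(np.asarray(image), k=2, axes=(0, 1)).copy()
```

Applying `rotate_180` to the two affected distorted images before evaluation brings them back in line with their ground-truth scans. Applying it twice returns the original image, which makes for an easy sanity check.
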
Coming Soon ...
This task aims to erase the handwritten text in the document image.
