ADCD-Net

March 29, 2026


ADCD-Net addresses the challenging problem of document image forgery localization by leveraging adaptive DCT features alongside hierarchical content disentanglement to robustly detect tampered regions even under compression distortions.

[Figure: ADCD-Net model overview]


📊 ForensicHub Benchmark (Doc Protocol)

[Figure: Doc Protocol]

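Evaluation follows the Doc Protocol: train on the DocTamper training set, then evaluate on seven test sets (the DocTamper Testing, FCD, and SCD sets plus the cross-domain T-SROIE, OSTF, TPIC-13, and RTM sets). The DocTamper Testing, FCD, and SCD sets are JPEG-compressed once, using the quality factors (QFs) from the official DocTamper pickle files. Authentic images are skipped.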

For more details, see ForensicHub: Doc Protocol.
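
To make the "authentic images are skipped" rule concrete, here is a minimal sketch of a per-image average that ignores images with no tampered pixels in the ground truth. It assumes binary masks given as flat sequences of 0/1 and uses pixel-level F1 purely as an example metric. The function names are hypothetical, and the actual metrics and data loading are defined by ForensicHub, not by this snippet.

```python
def pixel_f1(pred, gt):
    """Pixel-level F1 between two equal-length binary sequences (0/1)."""
    tp = sum(1 for p, g in zip(pred, gt) if p == 1 and g == 1)
    fp = sum(1 for p, g in zip(pred, gt) if p == 1 and g == 0)
    fn = sum(1 for p, g in zip(pred, gt) if p == 0 and g == 1)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mean_f1_skipping_authentic(samples):
    """Average F1 over tampered images only.

    `samples` is an iterable of (pred_mask, gt_mask) pairs. An image whose
    ground-truth mask contains no tampered pixel is treated as authentic
    and left out of the average (an assumption about how the rule applies).
    """
    scores = [pixel_f1(pred, gt) for pred, gt in samples if any(gt)]
    return sum(scores) / len(scores) if scores else float("nan")
```

With this rule, a false alarm on an authentic image neither lowers nor raises the reported score, so the number reflects localization quality on tampered images only.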


⚙️ Environment Setup

| Dependency | Version |
| --- | --- |
| Python | 3.10.13 |
| PyTorch | 2.3.0+cu121 |
| albumentations | 2.0.8 |
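
A quick way to compare a local environment against these versions is sketched below using only the standard library. It assumes the PyTorch distribution is installed under the name `torch` and does a strict string comparison, so a build without the `+cu121` suffix is reported as a mismatch. The helper names are illustrative and not part of this repository.

```python
import platform
from importlib import metadata

# Versions from the dependency list above. "torch" is the distribution
# name under which PyTorch is installed (an assumption of this sketch).
REQUIRED = {
    "torch": "2.3.0+cu121",
    "albumentations": "2.0.8",
}
REQUIRED_PYTHON = "3.10.13"


def installed_version(dist_name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def check_environment(required=REQUIRED, python_version=REQUIRED_PYTHON):
    """Return a list of human-readable mismatches (empty if everything matches)."""
    problems = []
    current_python = platform.python_version()
    if current_python != python_version:
        problems.append(f"Python: expected {python_version}, found {current_python}")
    for name, expected in required.items():
        found = installed_version(name)
        if found is None:
            problems.append(f"{name}: not installed (expected {expected})")
        elif found != expected:
            problems.append(f"{name}: expected {expected}, found {found}")
    return problems


if __name__ == "__main__":
    for line in check_environment() or ["Environment matches the listed versions."]:
        print(line)
```

A mismatch does not necessarily mean the code will fail; the check only flags differences from the listed setup.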

📂 Data Preparation

1. Download DocTamper Data

Download the DocTamper dataset (Training, Testing, FCD, SCD) from: 👉 DocTamper GitHub

qt_table.pk and pks (JPEG record pickle files) are available in the DocTamper repository.

2. Download ADCD-Net Checkpoints & OCR Masks

Download from Google Drive: 👉 ADCD-Net Data (Google Drive)

The archive contains:

ADCDNet.pth          # ADCD-Net model checkpoint
docres.pkl           # DocRes backbone checkpoint
DocTamperOCR/        # Pre-generated OCR mask directory
├── TrainingSet/     # Training set OCR masks
├── TestingSet/      # Testing set OCR masks
├── FCD/             # FCD dataset OCR masks
└── SCD/             # SCD dataset OCR masks
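
After extracting the archive, the layout can be checked with a short script such as the following sketch. It assumes the checkpoints and the DocTamperOCR directory sit side by side under one extraction root, as the listing above suggests. The function name `missing_entries` is hypothetical.

```python
from pathlib import Path

# Entries taken from the archive listing above.
EXPECTED_FILES = ["ADCDNet.pth", "docres.pkl"]
EXPECTED_DIRS = [
    "DocTamperOCR/TrainingSet",
    "DocTamperOCR/TestingSet",
    "DocTamperOCR/FCD",
    "DocTamperOCR/SCD",
]


def missing_entries(root):
    """Return the expected files and directories that are absent under `root`."""
    root = Path(root)
    missing = [f for f in EXPECTED_FILES if not (root / f).is_file()]
    missing += [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
    return missing
```

Calling `missing_entries("path/to/extracted_archive")` returns an empty list when every entry is present; otherwise it lists what still needs to be downloaded or moved.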

3. Download Doc Protocol Cross-Domain Test Sets

Download the 4 cross-domain test sets (T-SROIE, OSTF, TPIC-13, RTM) from: 👉 Doc Protocol Data (Google Drive), file cutted_datasets_fakes.zip


🔤 Get OCR Masks

OCR character segmentation masks are generated using seg_char.py, which requires PaddleOCR.

Install PaddlePaddle and PaddleOCR by following the official guide: 👉 PaddleOCR Installation

Then run:

python seg_char.py

🚀 Training

ADCD-Net is trained on 4 × NVIDIA GeForce RTX 4090 (24 GB) with:

  • 100k training steps
  • Batch size: 40 (10 per GPU × 4 GPUs with gradient accumulation)
  • Training time: ~27 hours
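
For planning runs on different hardware, the figures above translate into a rough throughput. The arithmetic below is only a back-of-the-envelope sketch that treats the ~27 hours as exact and takes 40 as the effective batch size per step.

```python
TRAIN_STEPS = 100_000   # training steps, from the list above
BATCH_SIZE = 40         # effective batch size per step, from the list above
TRAIN_HOURS = 27        # approximate wall-clock time, from the list above

samples_seen = TRAIN_STEPS * BATCH_SIZE
seconds_per_step = TRAIN_HOURS * 3600 / TRAIN_STEPS

print(f"samples processed: {samples_seen:,}")         # 4,000,000
print(f"seconds per step:  {seconds_per_step:.2f}")   # 0.97
```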

Steps:

  1. Configure paths in cfg.py:
mode = 'train'
root = 'path/to/DocTamper'
docres_ckpt_path = 'path/to/docres.pkl'
  2. Launch training:
python main.py

📈 Evaluation

Reproduce the ForensicHub Doc Protocol results with the following steps:

  1. Generate OCR masks for the 4 cross-domain sets using seg_char.py.
  2. Generate path pickle files for the 4 sets using build_path_pkl.py.
  3. Configure cfg.py for evaluation:
mode = 'val'
all_ds_name = ['TestingSet', 'FCD', 'SCD', 'T-SROIE_test', 'Tampered-IC13_test', 'RealTextManipulation_test', 'OSTF_test']
pkl_dir = 'path/to/path_pkl'
  4. Run evaluation:
python main.py

📝 Citation

If you find this work useful in your research, please consider citing:

@inproceedings{wong2025adcd,
  title={ADCD-Net: Robust Document Image Forgery Localization via Adaptive DCT Feature and Hierarchical Content Disentanglement},
  author={Wong, Kahim and Zhou, Jicheng and Wu, Haiwei and Si, Yain-Whar and Zhou, Jiantao},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  year={2025}
}