May 15, 2025
Effective Diffusion Transformer Architecture for Image Super-Resolution
Yong Guo³, Mingrui Zhu¹, Nannan Wang¹, Xinbo Gao⁴, Jie Hu²
³ CBG, Huawei; ⁴ Chongqing University of Posts and Telecommunications
🔎 Introduction
We propose DiT-SR, an effective diffusion transformer for real-world image super-resolution:
- Effective yet efficient architecture design;
- Adaptive Frequency Modulation (AdaFM) for time-step conditioning.
⚙️ Dependencies and Installation
git clone https://github.com/kunncheng/DiT-SR.git
cd DiT-SR
conda create -n DiT_SR python=3.10 -y
conda activate DiT_SR
pip install -r requirements.txt
🌈 Training
Datasets
The training data comprises LSDIR, DIV2K, DIV8K, OutdoorSceneTraining, Flickr2K, and the first 10K face images from FFHQ. All image paths are saved to txt files. For simplicity, you can also train on the LSDIR dataset alone.
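A minimal sketch of how such a path list could be produced is shown below. It assumes the list format is one image path per line, which is not confirmed here; the function name `write_image_list`, the extension set, and the example paths are hypothetical and should be adapted to whatever the config files expect.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to match your data.
IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".bmp", ".webp"}


def write_image_list(root, out_txt, limit=None):
    """Write one image path per line for every image found under `root`.

    Returns the number of paths written. `limit` keeps only the first N
    paths in sorted order (for example, the first 10K FFHQ faces).
    """
    root = Path(root)
    paths = sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_EXTS
    )
    if limit is not None:
        paths = paths[:limit]
    out_txt = Path(out_txt)
    # Create the output folder if it does not exist yet.
    out_txt.parent.mkdir(parents=True, exist_ok=True)
    out_txt.write_text("".join(f"{p}\n" for p in paths))
    return len(paths)
```

For example, `write_image_list("data/FFHQ", "data/lists/ffhq.txt", limit=10000)` would write at most 10,000 paths; both paths here are placeholders.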
Pre-trained Models
Download the required checkpoints to the weights folder, including the autoencoder and the other pre-trained models used for loss calculation.
Training Scripts
Real-world Image Super-resolution
torchrun --standalone --nproc_per_node=8 --nnodes=1 main.py --cfg_path configs/realsr_DiT.yaml --save_dir ${save_dir}
Blind Face Restoration
torchrun --standalone --nproc_per_node=8 --nnodes=1 main.py --cfg_path configs/faceir_DiT.yaml --save_dir ${save_dir}
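Both training commands expect `${save_dir}` to be set beforehand and launch 8 processes on a single node. The sketch below assembles the same command from Python so the config, output folder, and process count can be varied. The helper `build_train_cmd` and the output path `./experiments/realsr` are hypothetical; the flags themselves are copied from the commands above.

```python
import shlex


def build_train_cmd(cfg_path, save_dir, gpus=8):
    """Return the torchrun argument list for single-node training.

    `gpus` maps to --nproc_per_node, i.e. the number of worker
    processes on this node (typically one per GPU).
    """
    return [
        "torchrun", "--standalone",
        f"--nproc_per_node={gpus}", "--nnodes=1",
        "main.py", "--cfg_path", cfg_path, "--save_dir", save_dir,
    ]


# Hypothetical output folder; 4 processes instead of the default 8.
cmd = build_train_cmd("configs/realsr_DiT.yaml", "./experiments/realsr", gpus=4)
print(shlex.join(cmd))
# To launch from Python instead of printing:
#   import subprocess; subprocess.run(cmd, check=True)
```

Whether other settings in the config (such as batch size) need adjusting when the process count changes is not covered here.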
🚀 Inference and Evaluation
Real-world Image Super-resolution
Real-world datasets: RealSR, RealSet65; Synthetic datasets: LSDIR-Test; Pretrained checkpoints.
bash test_realsr.sh
Blind Face Restoration
Real-world datasets: LFW, WebPhoto, Wider; Synthetic datasets: CelebA-HQ; Pretrained checkpoints.
bash test_faceir.sh
For the synthetic datasets (LSDIR-Test and CelebA-HQ), we are unable to release them due to corporate review restrictions. However, you can generate them yourself using these scripts.
🎓 Citation
If you find our work useful in your research, please consider citing:
@inproceedings{cheng2025effective,
title={Effective diffusion transformer architecture for image super-resolution},
author={Cheng, Kun and Yu, Lei and Tu, Zhijun and He, Xiao and Chen, Liyu and Guo, Yong and Zhu, Mingrui and Wang, Nannan and Gao, Xinbo and Hu, Jie},
booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
volume={39},
number={3},
pages={2455--2463},
year={2025}
}
❤️ Acknowledgement
We sincerely appreciate the code release of the following projects: ResShift, DiT, FFTFormer, SwinIR, SinSR, and BasicSR.