(CVPR 2025, Highlight) ShortDF: Optimizing for the Shortest Path in Denoising Diffusion Model
December 17, 2025
🧭 Description
This repository is the official implementation of ShortDF (CVPR 2025).
ShortDF acts as an "intelligent navigation system" for diffusion models. Instead of blindly following a fixed trajectory, it solves for the optimal path via Implicit Graph Modeling and Shortest-Path Relaxation. This allows a single step to achieve the efficacy of multiple steps.
Core Mechanism
- Path Optimization: We treat diffusion steps as nodes in a graph. If a multi-step path (e.g., $10 \to 2 \to 0$) yields better quality than a direct step ($10 \to 0$), the model optimizes the direct step to match that higher quality.
- Error Propagation: Through iterative training, long paths (e.g., $100 \to 0$) absorb the refined information from intermediate steps, so that sampling with fewer steps reaches quality comparable to the original multi-step process.
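The relaxation idea in the two bullets above can be illustrated on a toy graph. The sketch below is not the repository's training code. It assumes a hypothetical table of per-jump errors, `cost[(i, j)]`, and applies the classical shortest-path relaxation `d(i, j) <- min(d(i, j), d(i, k) + d(k, j))`, which is the graph analogue of making a direct step match a better multi-step path. The function name and the numbers are illustrative only.

```python
def relax_direct_steps(cost, nodes):
    """Relax every direct jump i -> j against two-hop paths i -> k -> j.

    cost:  dict mapping (i, j) with i > j to a hypothetical error for that jump.
    nodes: the timesteps that form the graph, e.g. [10, 2, 0].
    Returns a new dict; the input is left unchanged.
    """
    best = dict(cost)
    # Visit short jumps first so that longer jumps reuse already-relaxed pieces.
    pairs = sorted(best, key=lambda p: p[0] - p[1])
    for i, j in pairs:
        for k in nodes:
            if j < k < i and (i, k) in best and (k, j) in best:
                best[(i, j)] = min(best[(i, j)], best[(i, k)] + best[(k, j)])
    return best


# Hypothetical errors: the direct jump 10 -> 0 is worse than going through 2.
toy_cost = {(10, 2): 1.0, (2, 0): 0.5, (10, 0): 4.0}
relaxed = relax_direct_steps(toy_cost, nodes=[10, 2, 0])
print(relaxed[(10, 0)])  # 1.5
```

In ShortDF the quantities being relaxed are learned residuals rather than fixed numbers, so this only shows the direction of the update: the direct step is pulled toward the better composite path.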
Highlights
- 5× Speedup: Achieves quality comparable to 10-step DDIM on CIFAR-10 in just 2 steps.
- Higher Fidelity: Improves FID by 18.5% on CIFAR-10.
- Robustness: Demonstrates superior performance on CelebA and LSUN-Church datasets across various sampling steps.
Figure 1: Extreme Speed Test on CIFAR-10. ShortDF achieves comparable quality at 2 steps, demonstrating a 5× speedup.
Figure 2: Multi-Dataset Performance and Sampling Trajectory (CelebA, Church). Note ShortDF's clear quality advantage across the sampling sequence (Step i).
For more details, please refer to our CVPR 2025 paper.
🚀 Running the Experiments
Training
Training follows the standard DDPM protocol.
python main.py --config {DATASET}.yml --exp {PROJECT_PATH} --doc {MODEL_NAME} --ni
Loss Design & Strategy
The ShortDF-specific loss is implemented in ./functions/losses.py as shortdf_relax_loss. We recommend the following training strategies:
- Two-stage training (Recommended):
  - Phase 1: Train using standard noise loss (or load a pretrained DDPM checkpoint) to stabilize the model.
  - Phase 2: Fine-tune with shortdf_relax_loss to optimize for shortest-path residuals.
  - Benefit: Reduces training complexity and ensures stable convergence.
- One-stage training (Optional):
  - Train with both standard noise loss and shortdf_relax_loss from scratch.
  - Configuration: Adjust noise_weight and relax_weight in the config file to balance the contributions based on your dataset and model size.
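As a rough illustration of how the two strategies differ, the sketch below combines the two loss terms with `noise_weight` and `relax_weight`. The helper names `combined_loss` and `two_stage_weights`, the default values, and the hard switch between phases are assumptions made for this example; the actual weighting lives in the config file and in `./functions/losses.py`.

```python
def combined_loss(noise_loss, relax_loss, noise_weight=1.0, relax_weight=1.0):
    """Weighted sum of the two loss terms (works for floats or tensors).

    The default weights are placeholders, not values recommended by the authors.
    """
    return noise_weight * noise_loss + relax_weight * relax_loss


def two_stage_weights(step, phase1_steps):
    """Hypothetical two-stage schedule.

    Phase 1 (step < phase1_steps): noise loss only.
    Phase 2 (afterwards):          relaxation loss only.
    Returns (noise_weight, relax_weight).
    """
    if step < phase1_steps:
        return 1.0, 0.0
    return 0.0, 1.0


# One-stage training: both terms are active from the first step.
one_stage = combined_loss(noise_loss=2.0, relax_loss=4.0,
                          noise_weight=0.5, relax_weight=0.25)

# Two-stage training: the weights depend on the current step.
nw, rw = two_stage_weights(step=120, phase1_steps=100)
two_stage = combined_loss(2.0, 4.0, noise_weight=nw, relax_weight=rw)
```

Loading a pretrained DDPM checkpoint corresponds to skipping Phase 1 entirely, that is, calling the schedule with `phase1_steps=0`.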
Sampling
1. Download Pretrained Models
We provide pretrained models for the CIFAR-10, CelebA, and LSUN-Church datasets.
- Download Link: Google Drive
- Setup: After downloading, please place the model file in the following directory structure:
logs/{DATASET}/ckpt.pth
2. General Sampling (FID Evaluation)
To generate samples and evaluate the Fréchet Inception Distance (FID):
python main.py --config {DATASET}.yml --exp {PROJECT_PATH} --doc {MODEL_NAME} --sample --fid --timesteps {STEPS} --eta {ETA} --ni
- --eta: Controls the variance scale (0 for DDIM, 1 for DDPM).
- --timesteps: Specifies the number of diffusion steps used for sampling.
- --doc: Identifies the folder name containing the checkpoint.
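To make the role of `--eta` concrete, the sketch below computes the per-step noise scale from the DDIM formulation, $\sigma_t = \eta \sqrt{(1-\bar\alpha_{t-1})/(1-\bar\alpha_t)} \sqrt{1-\bar\alpha_t/\bar\alpha_{t-1}}$. It assumes the sampler in this repository follows that formulation, which the README does not state explicitly; the function name `ddim_sigma` is made up for the example.

```python
import math


def ddim_sigma(alpha_bar_t, alpha_bar_prev, eta):
    """Noise scale added when stepping from timestep t to the previous one.

    alpha_bar_t, alpha_bar_prev: cumulative products of (1 - beta) at the
    current and the previous (less noisy) timestep, so alpha_bar_prev > alpha_bar_t.
    eta: 0 gives a deterministic DDIM step, 1 gives DDPM-like variance.
    """
    return (
        eta
        * math.sqrt((1.0 - alpha_bar_prev) / (1.0 - alpha_bar_t))
        * math.sqrt(1.0 - alpha_bar_t / alpha_bar_prev)
    )


sigma_ddim = ddim_sigma(0.5, 0.9, eta=0.0)  # no noise is injected
sigma_ddpm = ddim_sigma(0.5, 0.9, eta=1.0)  # stochastic step
```

Intermediate values of `eta` scale the injected noise linearly between these two extremes.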
Example (CIFAR-10):
python main.py --config cifar10.yml --exp ./ --doc cifar10 --sample --fid --timesteps 2 --eta 0 --ni --skip_type quad
python main.py --config cifar10.yml --exp ./ --doc cifar10 --sample --fid --timesteps 10 --eta 1 --ni --skip_type quad
Example (LSUN-Church):
python main.py --config church.yml --exp ./ --doc church --sample --fid --timesteps 20 --eta 1 --ni --skip_type uniform
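The `--skip_type` flag in the examples above selects which timesteps are visited. The sketch below mirrors the schedule used in the public DDIM reference implementation, on which this repository's code structure is based; whether ShortDF keeps it unchanged is an assumption, as are the 1000 training steps and the function name `timestep_sequence`.

```python
import math


def timestep_sequence(num_timesteps, timesteps, skip_type="uniform"):
    """Return the increasing list of timesteps visited during sampling.

    num_timesteps: length of the training schedule (assumed 1000 below).
    timesteps:     value passed to --timesteps.
    skip_type:     "uniform" for even spacing, "quad" for quadratic spacing
                   that places more steps near t = 0.
    """
    if skip_type == "uniform":
        skip = num_timesteps // timesteps
        return list(range(0, num_timesteps, skip))
    if skip_type == "quad":
        if timesteps == 1:
            return [0]
        top = math.sqrt(num_timesteps * 0.8)
        # Evenly spaced points in [0, top], squared and truncated to integers.
        return [int((top * i / (timesteps - 1)) ** 2) for i in range(timesteps)]
    raise ValueError(f"unknown skip_type: {skip_type!r}")


uniform_seq = timestep_sequence(1000, 10, "uniform")
quad_seq = timestep_sequence(1000, 10, "quad")
```

With quadratic spacing the gaps between consecutive timesteps grow with t, so more of the sampling budget is spent at low noise levels.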
Note:
- FID scores are computed using the provided reference statistics in the stats/ directory, and are intended for relative comparison under a unified evaluation setting.
- Increasing the number of steps carries a risk of over-denoising, as with other distillation schemes. In such cases, it is recommended to decrease the parameter to achieve better results.
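For reference, the standard definition of FID compares the mean and covariance of Inception features from reference images, $(\mu_r, \Sigma_r)$, with those from generated images, $(\mu_g, \Sigma_g)$. This is the textbook formula rather than something taken from this repository's code, and the symbols are introduced here only for illustration:

$$
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2 + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
$$

Because the score depends on the reference statistics and on the number of generated samples, values are only comparable when both are held fixed, which is why the note above speaks of relative comparison.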
⚙️ Requirements
- Python ≥ 3.9
- PyTorch ≥ 1.6
- Dependencies: torchvision, numpy, tqdm
📖 References and Acknowledgements
@inproceedings{chen2025optimizing,
title={Optimizing for the Shortest Path in Denoising Diffusion Model},
author={Chen, Ping and Zhang, Xingpeng and Liu, Zhaoxiang and Hu, Huan and Liu, Xiang and Wang, Kai and Wang, Min and Qian, Yanlin and Lian, Shiguo},
booktitle={Proceedings of the Computer Vision and Pattern Recognition Conference},
pages={18021--18030},
year={2025}
}
This implementation is based on / inspired by:
- DDIM PyTorch repo (code structure).
- PyTorch-DDPM repo (accelerated FID evaluation).
🔮 Future Directions
We are extending ShortDF to text-to-image and multi-modal tasks. We encourage the community to explore more efficient training strategies based on this shortest-path paradigm.