# Dynamic Diffusion Transformer
🔥🔥🔥 2025.03.26 update: The enhanced version of our method (including training code) is now available at https://github.com/alibaba-damo-academy/DyDiT/tree/main 🔥🔥🔥
The official implementation of "Dynamic Diffusion Transformer" (ICLR 2025).
Wangbo Zhao1, Yizeng Han2, Jiasheng Tang2,3, Kai Wang1, Yibing Song2,3, Gao Huang4, Fan Wang2, Yang You1
1National University of Singapore, 2DAMO Academy, Alibaba Group, 3Hupan Lab, 4Tsinghua University
https://github.com/user-attachments/assets/44ef5f81-cfe0-4e59-b228-14cc0729f5c6
We compare the generation speed of the original DiT and the proposed DyDiT on an NVIDIA V100 32GB GPU.
Images generated by DyDiT.

Abstract: Diffusion Transformer (DiT), an emerging diffusion model for image generation, has demonstrated superior performance but suffers from substantial computational costs. Our investigations reveal that these costs stem from the static inference paradigm, which inevitably introduces redundant computation in certain diffusion timesteps and spatial regions. To address this inefficiency, we propose Dynamic Diffusion Transformer (DyDiT), an architecture that dynamically adjusts its computation along both timestep and spatial dimensions during generation. Specifically, we introduce a Timestep-wise Dynamic Width (TDW) approach that adapts model width conditioned on the generation timesteps. In addition, we design a Spatial-wise Dynamic Token (SDT) strategy to avoid redundant computation at unnecessary spatial locations. Extensive experiments on various datasets and different-sized models verify the superiority of DyDiT. Notably, with <3% additional fine-tuning iterations, our method reduces the FLOPs of DiT-XL by 51%, accelerates generation by 1.73×, and achieves a competitive FID score of 2.07 on ImageNet.
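As a rough illustration of the Spatial-wise Dynamic Token (SDT) idea, the sketch below lets "easy" tokens bypass a transformer block entirely, computing the block only for tokens whose predicted difficulty exceeds a threshold. All names, the scoring mechanism, and the threshold here are illustrative assumptions, not the paper's actual implementation:

```python
import numpy as np

def sdt_block(tokens, difficulty, block_fn, threshold=0.5):
    """Apply block_fn only to tokens whose difficulty score exceeds
    the threshold; easy tokens bypass the block via the skip path.
    tokens: (N, D) array, difficulty: (N,) scores in [0, 1]."""
    mask = difficulty > threshold        # True -> compute this token
    out = tokens.copy()                  # easy tokens pass through unchanged
    out[mask] = block_fn(tokens[mask])   # heavy computation only where needed
    return out, mask

# Toy "block": a fixed linear transform standing in for an MLP/attention block.
rng = np.random.default_rng(0)
W = rng.standard_normal((8, 8))
block = lambda x: x @ W

x = rng.standard_normal((16, 8))
scores = rng.uniform(size=16)
y, mask = sdt_block(x, scores, block)
print(mask.sum(), "of", len(x), "tokens computed")
```

Since the skipped tokens never enter `block_fn`, the FLOPs of the block scale with the number of "hard" tokens rather than the full sequence length.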
## 🚀 News
- **2025.03.26**: The enhanced version of our method (including training code) is now available at https://github.com/alibaba-damo-academy/DyDiT/tree/main
- **2025.01.23**: DyDiT is accepted by ICLR 2025!!! We will update the code and paper soon.
- **2024.12.19**: We release the code for inference.
- **2024.10.04**: Our paper is released.
## 🎯 TODO
- [x] Release the code for inference.
- [x] Release the code for training.
- [ ] Release the code for applying our method to additional models (e.g., U-ViT, SiT).
- [ ] Release the code for applying our method to text-to-image and text-to-video generation diffusion models.
## 💥 Overview
(a) The loss difference between DiT-S and DiT-XL across all diffusion timesteps (T = 1000). The difference is slight at most timesteps.
(b) Loss maps (normalized to the range [0, 1]) at different timesteps show that the noise in different patches varies in how difficult it is to predict.
(c) Difference in the inference paradigm between the static DiT and the proposed DyDiT.
Overview of the proposed Dynamic Diffusion Transformer (DyDiT). It reduces the computational redundancy in DiT along both the timestep and spatial dimensions.
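To make the Timestep-wise Dynamic Width (TDW) idea concrete, the sketch below shows a linear layer whose output channels are grouped, with a per-timestep mask deciding which groups are computed. The gating shown here is hand-picked for illustration; in DyDiT the activation pattern is learned, and the real model operates on attention heads and MLP channels:

```python
import numpy as np

def tdw_linear(x, W, group_mask):
    """Linear layer whose output-channel groups can be switched off.
    x: (N, D_in), W: (D_in, D_out), group_mask: (G,) booleans over
    G equal-sized groups of output channels."""
    d_out = W.shape[1]
    g = len(group_mask)
    keep = np.repeat(group_mask, d_out // g)  # expand to a per-channel mask
    y = x @ W[:, keep]                        # compute only the active channels
    out = np.zeros((x.shape[0], d_out))
    out[:, keep] = y                          # inactive channels contribute zero
    return out

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 12))
x = rng.standard_normal((4, 8))
# e.g., full width at a hard timestep, half the groups at an easy one
full = tdw_linear(x, W, np.array([True, True, True, True]))
slim = tdw_linear(x, W, np.array([True, False, True, False]))
```

With half the groups disabled, the matrix multiply touches only half of `W`'s columns, which is where the FLOPs saving along the timestep dimension comes from.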
## 🔨 Install
We provide an environment.yml file to help create the Conda environment used in our experiments. Other environments may also work well.
```shell
git clone https://github.com/NUS-HPC-AI-Lab/Dynamic-Diffusion-Transformer.git
conda env create -f environment.yml
conda activate DyDiT
```
## ⚙️ Inference
Currently, we provide a pre-trained checkpoint of DyDiT.
| model | FLOPs (G) | FID | download |
|---|---|---|---|
| DiT | 118.69 | 2.27 | - |
| DyDiT | 84.33 | 2.12 | 🤗 |
| DyDiT | - | - | in progress |
Run sample_0.7.sh to sample images and evaluate the performance.
```shell
bash sample_0.7.sh
```
The sample_ddp.py script samples 50,000 images in parallel. It generates a folder of samples as well as a .npz file that can be used directly with ADM's TensorFlow evaluation suite to compute FID, Inception Score, and other metrics. Please follow its instructions to download the reference batch VIRTUAL_imagenet256_labeled.npz.
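Before handing the sample pack to the evaluation suite, it can be worth sanity-checking its contents. The snippet below builds a tiny stand-in .npz and inspects it; the array key `arr_0` and the (N, 256, 256, 3) uint8 layout are assumptions about the generated file, so check them against your own output:

```python
import numpy as np

# A tiny stand-in for the sampler output: an (N, 256, 256, 3) uint8
# image array stored under the key "arr_0" (assumed key; verify against
# the actual file produced by sample_ddp.py).
samples = np.zeros((4, 256, 256, 3), dtype=np.uint8)
np.savez("samples.npz", arr_0=samples)

data = np.load("samples.npz")
images = data["arr_0"]
print(images.shape, images.dtype)  # sanity-check before computing FID
```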
## 📖 Cite DyDiT
If you find our work useful, please consider citing us:
```bibtex
@article{zhao2024dynamic,
  title={Dynamic diffusion transformer},
  author={Zhao, Wangbo and Han, Yizeng and Tang, Jiasheng and Wang, Kai and Song, Yibing and Huang, Gao and Wang, Fan and You, Yang},
  journal={arXiv preprint arXiv:2410.03456},
  year={2024}
}
```
## ☎️ Contact
If you're interested in collaborating with us, feel free to reach out via email at wangbo.zhao96@gmail.com.