DuCa: Accelerating Diffusion Transformers with Dual Feature Caching

January 8, 2025

🔥 News

  • 2024/12/29 🚀🚀 We release our work DuCa on accelerating diffusion transformers for FREE, which achieves nearly lossless acceleration of 2.50× on OpenSora! 🎉 DuCa also overcomes the limitation of ToCa by fully supporting FlashAttention, enabling broader compatibility and efficiency improvements.
  • 2024/12/20 💥💥 Our ToCa has achieved nearly lossless acceleration of 1.51× on FLUX. Feel free to check the latest version of our paper!
  • 2024/10/16 🤗🤗 Users with autodl accounts can now quickly try OpenSora-ToCa by directly using our publicly available image!
  • 2024/10/12 🚀🚀 We release our work ToCa on accelerating diffusion transformers for FREE, which achieves nearly lossless acceleration of 2.36× on OpenSora!
  • 2024/07/15 🤗🤗 We release an open-source repo, "Awesome-Generation-Acceleration", which collects recent awesome generation acceleration papers! Feel free to contribute your suggestions!

Dependencies

Python>=3.9
CUDA>=11.8
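
The two minimums above can be checked before installing anything else. The snippet below is a minimal sketch, not part of the DuCa repository: the helper names are hypothetical, and the CUDA check assumes PyTorch is already installed. It reads `torch.version.cuda`, which reports the CUDA version PyTorch was built against and may differ from the system toolkit.

```python
import sys

MIN_PYTHON = (3, 9)
MIN_CUDA = (11, 8)


def parse_version(text):
    """Turn a dotted version string such as '11.8' into a tuple of ints."""
    return tuple(int(part) for part in text.split(".")[:2])


def meets_minimum(found, minimum):
    """Return True when the (major, minor) pair `found` is at least `minimum`."""
    return tuple(found[:2]) >= tuple(minimum)


def check_environment():
    """Report whether Python and PyTorch's CUDA build meet the minimums."""
    report = {"python": meets_minimum(sys.version_info, MIN_PYTHON), "cuda": None}
    try:
        import torch  # optional: only needed for the CUDA check
    except ImportError:
        return report  # "cuda" stays None because PyTorch is not installed
    if torch.version.cuda is not None:  # None on builds without CUDA
        report["cuda"] = meets_minimum(parse_version(torch.version.cuda), MIN_CUDA)
    return report


if __name__ == "__main__":
    print(check_environment())
```

A `None` value for `"cuda"` means the check could not be performed, not that the requirement failed.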

🛠 Installation

git clone https://github.com/Shenyi-Z/DuCa.git

Environment Settings

We evaluated DuCa in the same environments as the original models, so you can set up each environment by following the requirements of the corresponding original model.

Links:

Original Models    URLs
DiT                https://github.com/facebookresearch/DiT
PixArt-α           https://github.com/PixArt-alpha/PixArt-alpha
OpenSora           https://github.com/hpcaitech/Open-Sora

From our environment.yaml

In addition, we provide a replica of our own environment for each model:

DiT

cd DuCa-DiT
conda env create -f environment-dit.yml

PixArt-α

cd DuCa-PixArt-alpha
conda env create -f environment-pixart.yml

OpenSora

cd DuCa-Open-Sora
conda env create -f environment-opensora.yml
pip install -v . # for development mode, `pip install -v -e .`

🚀 Run and evaluation

Run DuCa-DiT

sample images for visualization

cd DuCa-DiT
python sample.py --image-size 256 --num-sampling-steps 50 --cache-type attention --fresh-threshold 3 --fresh-ratio 0.05 --ratio-scheduler ToCa  --force-fresh global --soft-fresh-weight 0.25 --ddim-sample

sample images for evaluation (e.g., 50k)

cd DuCa-DiT
torchrun --nnodes=1 --nproc_per_node=6 sample_ddp.py --model DiT-XL/2 --per-proc-batch-size 150 --image-size 256 --cfg-scale 1.5 --num-sampling-steps 50 --cache-type attention --fresh-ratio 0.05 --ratio-scheduler ToCa --force-fresh global --fresh-threshold 3 --ddim-sample --soft-fresh-weight 0.25 --num-fid-samples 50000
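
Both DuCa-DiT commands share the same caching flags, so it can be convenient to build the command from one dictionary when trying different settings. The following is a minimal sketch under that assumption. `build_command` and `DUCA_DIT_FLAGS` are hypothetical names, not part of the repository, and the flag values are copied from the `sample.py` visualization command without interpreting what each flag does.

```python
import shlex

# Flag values copied from the sample.py visualization command.
DUCA_DIT_FLAGS = {
    "--image-size": 256,
    "--num-sampling-steps": 50,
    "--cache-type": "attention",
    "--fresh-threshold": 3,
    "--fresh-ratio": 0.05,
    "--ratio-scheduler": "ToCa",
    "--force-fresh": "global",
    "--soft-fresh-weight": 0.25,
    "--ddim-sample": True,  # switch: emitted without a value
}


def build_command(script, flags, **overrides):
    """Build an argv list; overrides use underscores, e.g. fresh_ratio=0.1."""
    merged = dict(flags)
    for name, value in overrides.items():
        merged["--" + name.replace("_", "-")] = value
    argv = ["python", script]
    for flag, value in merged.items():
        if value is True:
            argv.append(flag)  # bare switch
        elif value is False or value is None:
            continue  # switch turned off or flag removed
        else:
            argv.extend([flag, str(value)])
    return argv


if __name__ == "__main__":
    print(shlex.join(build_command("sample.py", DUCA_DIT_FLAGS)))
    print(shlex.join(build_command("sample.py", DUCA_DIT_FLAGS, fresh_ratio=0.1)))
```

The returned list can be passed to `subprocess.run(argv, check=True)` from inside `DuCa-DiT`. The PixArt-α scripts spell their flags with underscores (for example `--fresh_ratio`), so the hyphen conversion here would need adjusting for them.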

Run DuCa-PixArt-α

sample images for visualization

cd DuCa-PixArt-alpha
python scripts/inference.py --model_path /root/autodl-tmp/pretrained_models/PixArt-XL-2-256x256.pth --image_size 256 --bs 100 --txt_file /root/autodl-tmp/test.txt --fresh_threshold 3 --fresh_ratio 0.75 --cache_type attention --force_fresh global --soft_fresh_weight 0.25 --ratio_scheduler ToCa

sample images for evaluation (e.g., 30k for COCO, 1.6k for PartiPrompts)

cd DuCa-PixArt-alpha
torchrun --nproc_per_node=6 scripts/inference_ddp.py --model_path /root/autodl-tmp/pretrained_models/PixArt-XL-2-256x256.pth --image_size 256 --bs 100 --txt_file /root/autodl-tmp/COCO/COCO_caption_prompts_30k.txt --fresh_threshold 3 --fresh_ratio 0.75 --cache_type attention --force_fresh global --soft_fresh_weight 0.25 --ratio_scheduler ToCa

Run DuCa-OpenSora

sample video for visualization

cd DuCa-Open-Sora
python scripts/inference.py configs/opensora-v1-2/inference/sample.py  --num-frames 2s --resolution 480p --aspect-ratio 9:16   --prompt "a beautiful waterfall"

sample video for VBench evaluation

cd DuCa-Open-Sora
bash eval/vbench/launch.sh /root/autodl-tmp/pretrained_models/hpcai-tech/OpenSora-STDiT-v3/model.safetensors 51 opensora-ToCa 480p 9:16

(Remember to replace "/root/autodl-tmp/pretrained_models/hpcai-tech/OpenSora-STDiT-v3/model.safetensors" with your own path!)
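
One way to avoid editing the hard-coded checkpoint path by hand is to read it from an environment variable and confirm the file exists before launching. This is only a sketch: `vbench_command` and the variable name `DUCA_OPENSORA_CKPT` are hypothetical choices for this example, and the remaining positional arguments are copied verbatim from the command above without interpreting them.

```python
import os
from pathlib import Path

# Positional arguments copied verbatim from the launch command above.
TRAILING_ARGS = ["51", "opensora-ToCa", "480p", "9:16"]


def vbench_command(checkpoint, must_exist=True):
    """Return the argv for eval/vbench/launch.sh with a user-supplied checkpoint."""
    path = Path(checkpoint).expanduser()
    if must_exist and not path.is_file():
        raise FileNotFoundError(f"checkpoint not found: {path}")
    return ["bash", "eval/vbench/launch.sh", str(path), *TRAILING_ARGS]


if __name__ == "__main__":
    # DUCA_OPENSORA_CKPT is a hypothetical variable name used only in this sketch.
    ckpt = os.environ.get("DUCA_OPENSORA_CKPT")
    if ckpt:
        print(" ".join(vbench_command(ckpt)))
    else:
        print("set DUCA_OPENSORA_CKPT to your model.safetensors path")
```

Run it from inside `DuCa-Open-Sora` so that the relative path `eval/vbench/launch.sh` resolves.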

๐Ÿ‘ Acknowledgements

  • Thanks to DiT for their great work and codebase upon which we build DiT-DuCa.
  • Thanks to PixArt-α for their great work and codebase upon which we build PixArt-α-DuCa.
  • Thanks to OpenSora for their great work and codebase upon which we build OpenSora-DuCa.

📌 Citation

@article{zou2024DuCa,
  title={Accelerating Diffusion Transformers with Dual Feature Caching},
  author={Zou, Chang and Zhang, Evelyn and Guo, Runlin and Xu, Haohang and He, Conghui and Hu, Xuming and Zhang, Linfeng},
  journal={arXiv preprint arXiv:2412.18911},
  year={2024}
}

📧 Contact

If you have any questions, please email shenyizou@outlook.com.