Awesome Diffusion Acceleration Cache
November 4, 2025
If you would like to contribute to this repository, feel free to email me at ljc.mytcl@gmail.com!
Repository Description
A curated list of research papers, resources, and advancements on Diffusion Cache and related efficient diffusion model acceleration techniques.
This repository aims to provide a comprehensive and up-to-date collection of academic works focused on Diffusion Cache, a promising approach for accelerating diffusion models by caching intermediate features or latent states. It includes papers on model efficiency, memory optimization, reuse mechanisms, and inference speed-up in diffusion-based generative systems.
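As a rough illustration of the caching idea described above, the sketch below wraps a toy "block" so that its output is recomputed only every few denoising steps and reused in between. It is a minimal sketch under stated assumptions, not the method of any paper listed here: `CachedBlock`, `toy_block`, `run_sampler`, and the fixed `interval` refresh rule are all hypothetical names and choices made for illustration.

```python
from typing import Callable, List, Optional


class CachedBlock:
    """Wraps an expensive block and reuses its last output between refreshes."""

    def __init__(self, block: Callable[[List[float], int], List[float]], interval: int):
        if interval < 1:
            raise ValueError("interval must be >= 1")
        self.block = block        # the expensive computation being wrapped
        self.interval = interval  # hypothetical knob: recompute every `interval` steps
        self.cache: Optional[List[float]] = None
        self.calls = 0            # counts real evaluations of `block`

    def __call__(self, x: List[float], step: int) -> List[float]:
        # Refresh on the first call and on every step divisible by `interval`.
        if self.cache is None or step % self.interval == 0:
            self.cache = self.block(x, step)
            self.calls += 1
        # On all other steps the stale cached feature is returned unchanged.
        return self.cache


def toy_block(x: List[float], step: int) -> List[float]:
    """Stand-in for a costly layer whose output drifts slowly with the step index."""
    return [v * (1.0 + 0.01 * step) for v in x]


def run_sampler(num_steps: int, interval: int) -> int:
    """Runs a toy denoising loop and returns how many times the block really ran."""
    cached = CachedBlock(toy_block, interval)
    x = [1.0, 2.0, 3.0]
    for step in range(num_steps):
        feature = cached(x, step)
        # Toy update rule: blend the latent with the (possibly cached) feature.
        x = [0.9 * xi + 0.1 * fi for xi, fi in zip(x, feature)]
    return cached.calls


if __name__ == "__main__":
    # With 20 steps and a refresh every 4 steps, the block runs 5 times instead of 20.
    print(run_sampler(num_steps=20, interval=4))
```

A fixed refresh interval is the simplest possible policy. Many of the works collected below instead decide what to reuse adaptively (per token, per block, or per timestep) or forecast future features rather than copying stale ones.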
This repository is maintained by EPIC Lab at Shanghai Jiao Tong University.
Update News
-
2025/03/10: We propose TaylorSeer, achieving ~5× acceleration for DiTs with Taylor expansion-based feature forecasting!
-
2024/12/24: Our work DuCa has been accepted by ICLR 2025! Congratulations to all collaborators!
-
2024/10/12: We release our work ToCa, achieving nearly lossless acceleration of 1.51× on FLUX, 1.93× on PixArt-α, and 2.36× on OpenSora!
-
2024/08/24: We release an open-source repo for diffusion model acceleration and caching techniques!
Contents
Keywords
Papers
2023
-
[1] Accelerating Text-to-Image Editing via Cache-Enabled Sparse Diffusion Inference, AAAI 2024.
Yu, Zihao and Li, Haoyang and Fu, Fangcheng and Miao, Xupeng and Cui, Bin.
-
[2] DeepCache: Accelerating Diffusion Models for Free, CVPR 2024.
Ma, Xinyin and Fang, Gongfan and Wang, Xinchao.
-
[3] Cache Me if You Can: Accelerating Diffusion Models through Block Caching, CVPR 2024.
Wimbauer, Felix and Wu, Bichen and Schoenfeld, Edgar and Dai, Xiaoliang and others.
[Paper]
-
[4] Approximate Caching for Efficiently Serving Diffusion Models, NSDI 2024.
Adobe Research Team.
[Paper]
-
[5] Faster Diffusion: Rethinking the Role of the Encoder for Diffusion Model Inference, NeurIPS 2024.
Li, Senmao and Hu, Taihang and Khan, Fahad Shahbaz and Li, Linxuan and Yang, Shiqi and others.
2024
-
[6] Cross-Attention Makes Inference Cumbersome in Text-to-Image Diffusion Models, arXiv 2024.
Liu, Haozhe and Zhang, Wentian and Xie, Jinheng and others.
-
[7] Faster Diffusion via Temporal Attention Decomposition, TMLR 2024.
Liu, Haozhe and Zhang, Wentian and Xie, Jinheng and Faccio, Francesco and others.
-
[8] Learning-to-Cache: Accelerating Diffusion Transformer via Layer Caching, NeurIPS 2024.
Ma, Xinyin and Fang, Gongfan and Mi, Michael Bi and Wang, Xinchao.
-
[9] Δ-DiT: A Training-Free Acceleration Method Tailored for Diffusion Transformers, arXiv 2024.
Chen, Pengtao and Shen, Mingzhu and Ye, Peng and Cao, Jianjian and others.
[Paper]
-
[10] DiTFastAttn: Attention Compression for Diffusion Transformer Models, NeurIPS 2024.
Yuan, Zhihang and Lu, Pu and Zhang, Hanling and Ning, Xuefei and others.
-
[11] Efficient Inference of Vision Instruction-Following Models with Elastic Cache, ECCV 2024.
Liu, Zuyan and others.
-
[12] FORA: Fast-Forward Caching in Diffusion Transformer Acceleration, arXiv 2024.
Selvaraju, Pratheba and Ding, Tianyu and Chen, Tianyi and Zharkov, Ilya and others.
-
[13] Faster Image2Video Generation: A Closer Look at CLIP Image Embedding's Impact on Spatio-Temporal Cross-Attentions, arXiv 2024.
University of Western Australia Research Team.
[Paper]
-
[14] Token Caching for Diffusion Transformer Acceleration, arXiv 2024.
Jinming Lou and Wenyang Luo and Yufan Liu and Bing Li and others.
[Paper]
-
[15] FRDiff: Feature Reuse for Universal Training-free Acceleration of Diffusion Models, ECCV 2024.
Lee, Jungwon and Park, Suhyeon and others.
-
[16] FasterCache: Training-Free Video Diffusion Model Acceleration with High Quality, arXiv 2024.
Zhengyao Lv and Chenyang Si and Junhao Song and Zhenyu Yang and others.
-
[17] ToCa: Accelerating Diffusion Transformers with Token-wise Feature Caching, ICLR 2025.
Chang Zou and Xuyang Liu and Ting Liu and Siteng Huang and Linfeng Zhang.
-
[18] HarmoniCa: Harmonizing Training and Inference for Better Feature Caching in Diffusion Transformer Acceleration, ICML 2025.
SenseTime Research Team.
-
[19] Adaptive Caching for Faster Video Generation with Diffusion Transformers, arXiv 2024.
Kumara Kahatapitiya and Haozhe Liu and Sen He and Ding Liu and others.
-
[20] Timestep Embedding Tells: It's Time to Cache for Video Diffusion Model, CVPR 2025.
Liu, Feng and Zhang, Shiwei and Wang, Xiaofeng and Wei, Yujie and others.
-
[21] LazyDiT: Lazy Learning for the Acceleration of Diffusion Transformers, AAAI 2025.
Rice, Shawn and others.
-
[22] Ca2-VDM: Efficient Autoregressive Video Diffusion Model with Causal Generation and Cache Sharing, ICML 2025.
Gao, Kaifeng and Shi, Jiaxin and Zhang, Hanwang and others.
-
[23] SmoothCache: A Universal Inference Acceleration Technique for Diffusion Transformers, CVPR eLVM 2024.
Roblox Research Team.
-
[24] Accelerating Diffusion Transformers with Dual Feature Caching, ICLR 2025.
Chang Zou and Evelyn Zhang and Runlin Guo and Haohang Xu and Conghui He and others.
2025
-
[25] Fastest HunyuanVideo Inference with Context Parallelism and First Block Cache on NVIDIA L20 GPUs, arXiv 2025.
Zheng, Zeyi and others.
-
[26] FlexCache: Flexible Approximate Cache System for Video Diffusion, arXiv 2025.
University of Waterloo Research Team.
[Paper]
-
[27] Token Pruning for Caching Better: 9 Times Acceleration on Stable Diffusion for Free, arXiv 2025.
Evelyn Zhang and Bang Xiao and Jiayi Tang and Qianli Ma and Chang Zou and others.
[Paper]
-
[28] Accelerating Diffusion Transformer via Error-Optimized Cache, arXiv 2025.
University of Science and Technology of China Research Team.
-
[29] Real-Time Video Generation with Pyramid Attention Broadcast, ICLR 2025.
Xuanlei Zhao and Xiaolong Jin and Kai Wang and Yang You.
-
[30] MoDM: Efficient Serving for Image Generation via Mixture-of-Diffusion Models, arXiv 2025.
University of Michigan Research Team.
-
[31] BlockDance: Reuse Structurally Similar Spatio-Temporal Features to Accelerate Diffusion Transformers, CVPR 2025.
Fudan University Research Team.
[Paper]
-
[32] From Reusing to Forecasting: Accelerating Diffusion Models with TaylorSeers, ICCV 2025.
Chang Zou and others.
-
[33] FEB-Cache: Frequency-Guided Exposure Bias Reduction for Enhancing Diffusion Transformer Caching, arXiv 2025.
Xu, Haohang and Zou, Chang and Liu, Xuyang and Guo, Runlin and He, Conghui and others.
-
[34] CacheQuant: Comprehensively Accelerated Diffusion Models, CVPR 2025.
Zhang, Evelyn and Tang, Jiayi and others.
-
[35] QuantCache: Adaptive Importance-Guided Quantization with Hierarchical Latent and Layer Caching for Video Generation, ICCV 2025.
Wu, Junyi and Zou, Chang and Liu, Xuyang and Guo, Runlin and Xu, Haohang and others.
-
[36] AB-Cache: Training-Free Acceleration of Diffusion Models via Adams-Bashforth Cached Feature Reuse, arXiv 2025.
Xu, Haohang and Zou, Chang and Liu, Xuyang and Guo, Runlin and He, Conghui and others.
[Paper]
-
[37] ProfilingDiT: Model Reveals What to Cache: Profiling-Based Feature Reuse for Video Diffusion Models, arXiv 2025.
Huazhong University of Science and Technology Research Team.
-
[38] Compute only 16 tokens in one timestep: Accelerating Diffusion Transformers with Cluster-Driven Feature Caching, ACM MM 2025.
Zheng, Zhixin and Zou, Chang and Liu, Xuyang and others.
-
[39] SpeCa: Accelerating Diffusion Transformers with Speculative Feature Caching, ACM MM 2025.
Zou, Chang and Liu, Xuyang and Guo, Runlin and others.
-
[40] Accelerate Diffusion Transformers with Feature Momentum, arXiv 2025.
Liu, Xuyang and Zou, Chang and Guo, Runlin and others.
-
[41] Foresight: Adaptive Layer Reuse for Accelerated and High-Quality Text-to-Video Generation, NeurIPS 2025.
The University of British Columbia and d-Matrix Research Team.
-
[42] ParaStep: Communication-Efficient Diffusion Denoising Parallelization via Reuse-then-Predict Mechanism, arXiv 2025.
Shanghai Jiao Tong University Research Team.
[Paper]
-
[43] RainFusion: Adaptive Video Generation Acceleration via Multi-Dimensional Visual Redundancy, arXiv 2025.
Huawei Technologies Co., Ltd Research Team.
[Paper]
-
[44] Accelerating Diffusion Transformer via Increment-Calibrated Caching with Channel-Aware Singular Value Decomposition, CVPR 2025.
Qiu, Jianxin and others.
-
[45] FastCache: Fast Caching for Diffusion Transformer Through Learnable Linear Approximation, CVPR 2025.
Liu, Noak and others.
-
[46] Chipmunk: Training-Free Acceleration of Diffusion, ICML 2025.
University of California Research Team.
-
[47] ECAD: Evolutionary Caching to Accelerate Your Off-the-Shelf Diffusion Model, arXiv 2025.
University of Maryland Research Team.
-
[48] MagCache: Fast Video Generation with Magnitude-Aware Cache, arXiv 2025.
Peking University Research Team.
-
[49] DBPrune: Dynamic Block Prune with Residual Caching, arXiv 2025.
Vipshop Research Team.
-
[50] DBCache: Dual Block Caching for Diffusion Transformers, arXiv 2025.
Vipshop Research Team.
-
[51] Block-wise Adaptive Caching for Accelerating Diffusion Policy, arXiv 2025.
Zhang, Yunbo and Liu, Zhenyu and others.
[Paper]
-
[52] Accelerating Vision Diffusion Transformers with Skip Branches, ICCV 2025.
Chen, Guanjie and Zhao, Xinyu and Zhou, Yucheng and Chen, Tianlong and Yu, Cheng.
-
[53] Forecast then Calibrate: Feature Caching as ODE for Efficient Diffusion Transformers, arXiv 2025.
Zou, Chang and Liu, Xuyang and Guo, Runlin and others.
-
[54] HiCache: Training-free Acceleration of Diffusion Models via Hermite Polynomial-based Feature Caching, arXiv 2025.
Guo, Runlin and Zou, Chang and Liu, Xuyang and others.
-
[55] WaveEx: Accelerating Flow Matching-based Speech Generation via Wavelet-guided Extrapolation, arXiv 2025.
Guo, Runlin and Zou, Chang and Liu, Xuyang and others.
-
[56] EasyCache: Less is Enough: Training-Free Video Diffusion Acceleration via Runtime-Adaptive Caching, arXiv 2025.
Huazhong University of Science and Technology Research Team.
-
[57] Accelerating Diffusion Transformer via Gradient-Optimized Cache, ICCV 2025.
Qiu, Jianxin and others.
-
[58] TaoCache: Structure-Maintained Video Generation Acceleration, arXiv 2025.
Huawei Research Team.
[Paper]
-
[59] DiCache: Let Diffusion Model Determine Its Own Cache, arXiv 2025.
Shanghai Jiao Tong University Research Team.
-
[60] HERO: Hierarchical Extrapolation and Refresh for Efficient World Model, arXiv 2025.
Tsinghua University Research Team.
[Paper]
-
[61] ERTACache: Error Rectification and Timesteps Adjustment for Efficient Diffusion, arXiv 2025.
ByteDance Research Team.
-
[62] OmniCache: A Trajectory-Oriented Global Perspective on Training-Free Cache Reuse for Diffusion Transformer Models, ICCV 2025.
Zhipu AI Research Team.
[Paper]
-
[63] MixCache: Mixture-of-Cache for Video Diffusion Transformer Acceleration, arXiv 2025.
Sun Yat-sen University Research Team.
[Paper]
-
[64] Z-Cache: Accelerating Diffusion Transformers via Self-Reflection, arXiv 2025.
Zou, Chang and Liu, Xuyang and Guo, Runlin and others.
-
[65] SpecDiff: Accelerating Diffusion Model Inference with Self-Speculation, arXiv 2025.
Shanghai Jiao Tong University Research Team.
[Paper]
-
[66] BWCache: Accelerating Video Diffusion Transformers through Block-Wise Caching, arXiv 2025.
Beijing Normal University Research Team.
[Paper]
-
[67] DiTReducio: A Training-Free Acceleration for DiT-Based TTS via Progressive Calibration, arXiv 2025.
Xiamen University, Zhejiang University, Wuhan University Research Team.
[Paper]
-
[68] LightningCP: Lightning Fast Caching-based Parallel Denoising Prediction for Accelerating Talking Head Generation, arXiv 2025.
Nanyang Technological University Research Team.
[Paper]
License
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgments
Thanks to all researchers and contributors who have worked on diffusion model acceleration and caching techniques.