[ICML 2026] dLLM-Cache: Accelerating Diffusion Large Language Models with Adaptive Caching

May 1, 2026 · View on GitHub

Official PyTorch implementation of the paper "dLLM-Cache: Accelerating Diffusion Large Language Models with Adaptive Caching" (dLLM-Cache), accepted to ICML 2026.

:fire: News

  • [2026/05/01] Our dLLM-Cache paper has been accepted to ICML 2026. Thanks!
  • [2025/06/15] Our dLLM-Cache is compatible with MMaDA.
  • [2025/05/31] Our dLLM-Cache is integrated into LLaDA-V.
  • [2025/05/23] The code of our paper has been released.
  • [2025/05/17] Our paper has been released.

✨️ Key Highlights


  • Currently supported models: LLaDA, Dream, LLaDA-V and MMaDA.
  • Speedup: Achieves up to 9.1x speedup over standard dLLM pipelines, with no performance loss on most tasks.
  • Evaluation: Evaluated on LLaDA 8B and Dream 7B.
  • Latency: Approaches the inference speed of autoregressive models (ARMs) in many scenarios.

:rocket: Pipeline

Here's an overview of the process behind our dLLM-Cache method:
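To make the idea concrete, here is a minimal PyTorch sketch of the caching pattern the paper describes: prompt features are refreshed only at long intervals, while response features are refreshed partially, recomputing just the tokens whose value vectors drifted most since the last full update. All class and function names, the interval, and the refresh ratio below are illustrative assumptions, not the repository's actual API.

```python
import torch


class AdaptiveFeatureCache:
    """Toy sketch of dLLM-Cache-style adaptive caching (illustrative only).

    Prompt features change slowly across denoising steps, so they are
    recomputed only every `prompt_interval` steps. Response features are
    refreshed selectively: only the fraction of tokens whose current value
    vectors are least similar to the cached ones get recomputed.
    """

    def __init__(self, prompt_interval=50, refresh_ratio=0.25):
        self.prompt_interval = prompt_interval
        self.refresh_ratio = refresh_ratio
        self.prompt_feats = None   # cached prompt-token features
        self.resp_feats = None     # cached response-token features
        self.resp_values = None    # value vectors used for the drift check

    def step(self, t, compute_prompt, compute_response, new_values):
        # Refresh prompt features only on a long, fixed interval.
        if self.prompt_feats is None or t % self.prompt_interval == 0:
            self.prompt_feats = compute_prompt()

        if self.resp_feats is None:
            # First step: compute everything once.
            self.resp_feats = compute_response(None)
            self.resp_values = new_values.clone()
        else:
            # Cosine similarity between current and cached value vectors,
            # one score per response token.
            sim = torch.nn.functional.cosine_similarity(
                new_values, self.resp_values, dim=-1
            )
            # Recompute only the least-similar (most stale) tokens.
            k = max(1, int(self.refresh_ratio * sim.numel()))
            stale = torch.topk(-sim, k).indices
            self.resp_feats[stale] = compute_response(stale)
            self.resp_values[stale] = new_values[stale]

        return torch.cat([self.prompt_feats, self.resp_feats], dim=0)
```

The saving comes from `compute_response` running on `k` tokens instead of the full response length on most steps; in the real method this drift check and partial recomputation happen inside each transformer layer rather than at the sequence level.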

🛠️ Installation

To get started with dLLM-Cache, follow the installation instructions below.

  1. Clone the Repository:

     ```shell
     git clone https://github.com/maomaocun/dLLM-Cache.git
     cd dLLM-Cache
     ```

  2. Set Up the Environment: Create a Python environment with conda or virtualenv, then install the dependencies:

     ```shell
     bash install.sh
     ```

  3. Run a Demo:

     ```shell
     python demo_{model_name}.py
     ```

  4. Run Experiments: Run experiments using the provided scripts:

     ```shell
     bash eval_scripts/run_{model_name}_{task_name}_base.sh
     ```

:blue_book: Example Usage

  1. GSM8K with LLaDA:

     ```shell
     bash eval_scripts/run_LLaDA_gsm8k_base.sh
     ```

  2. BBH with Dream:

     ```shell
     bash eval_scripts/run_Dream_bbh_base.sh
     ```

:postbox: Contact

If you have any questions, please email yangyicun187@gmail.com.

🎉 Acknowledgements

This repository builds on LLaDA, Dream, LLaDA-V, MMaDA, and lm-evaluation-harness.

:pushpin: Citation

If you find dLLM-Cache useful for your research and applications, please cite using this BibTeX:

```bibtex
@article{liu2025dllm,
  title={dLLM-Cache: Accelerating Diffusion Large Language Models with Adaptive Caching},
  author={Liu, Zhiyuan and Yang, Yicun and Zhang, Yaojie and Chen, Junjie and Zou, Chang and Wei, Qingyuan and Wang, Shaobo and Zhang, Linfeng},
  journal={arXiv preprint arXiv:2506.06295},
  year={2025}
}
```

:star2: Star History

Star History Chart