Accelerating Diffusion Large Language Models with SlowFast Sampling: The Three Golden Principles
July 18, 2025
Official PyTorch implementation of the paper "Accelerating Diffusion Large Language Models with SlowFast Sampling: The Three Golden Principles" (Slow Fast Sampling).
The Three Golden Principles: Certainty · Convergence · Positional

Fig. 1: Throughput and accuracy comparison on GPQA (8-shot, length = 1024) between LLaDA and our proposed methods.
Key Highlights
| Highlight | What makes Slow Fast Sampling special? |
|---|---|
| Three Golden Principles | The Certainty, Convergence, and Positional principles guide exactly when and where to decode. |
| Two-Stage Dance | A cautious Slow phase finds a stable span, then the Fast phase decodes it in parallel in one pass. |
| Plug-and-Play | Drop-in sampler for any masked-diffusion LLM, e.g. LLaDA-8B and Dream-7B. |
| Crazy Speed-ups | 15.6× faster than vanilla diffusion; 34.2× with dLLM-Cache, with minimal accuracy loss. |
| Outruns ARMs | Higher throughput than LLaMA-3 8B while matching accuracy (Table 4, p. 9 of the paper). |
Pipeline at a Glance

Fig. 2: Overview of the Slow Fast Sampling pipeline, from exploratory to accelerated decoding.
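The two-stage idea can be illustrated with a toy sketch. This is not the repository's implementation: `slow_fast_decode`, `toy_predict`, the `threshold` and `stable_steps` parameters, and the rule used to pick the span end are all assumptions made for illustration, and the stand-in "model" is a deterministic function rather than a real diffusion LLM.

```python
from typing import Callable, List, Optional, Tuple

Prediction = Tuple[int, float]  # (token id, confidence in [0, 1])
Tokens = List[Optional[int]]    # None marks a still-masked position


def slow_fast_decode(
    length: int,
    predict: Callable[[Tokens], List[Prediction]],
    threshold: float = 0.9,   # assumed confidence cut-off
    stable_steps: int = 2,    # assumed number of steps the span end must stay fixed
) -> Tuple[Tokens, int]:
    """Toy two-phase decoder. Returns (tokens, number of model calls)."""
    tokens: Tokens = [None] * length
    start, calls = 0, 0
    while start < length:
        # Slow phase: commit one token per step and watch where the
        # contiguous run of decoded or high-confidence positions ends.
        ends: List[int] = []
        while True:
            preds = predict(tokens)
            calls += 1
            masked = [i for i in range(start, length) if tokens[i] is None]
            if masked:
                best = max(masked, key=lambda i: preds[i][1])
                tokens[best] = preds[best][0]
            end = start
            while end < length and (
                tokens[end] is not None or preds[end][1] >= threshold
            ):
                end += 1
            ends.append(end)
            # Leave the slow phase once the span end has stopped moving.
            if end > start and ends[-stable_steps:].count(end) == stable_steps:
                break
        # Fast phase: fill every remaining masked position in the span at once.
        if any(tokens[i] is None for i in range(start, end)):
            preds = predict(tokens)
            calls += 1
            for i in range(start, end):
                if tokens[i] is None:
                    tokens[i] = preds[i][0]
        start = end
    return tokens, calls


BLOCK = 4  # hypothetical width of the region the toy model is confident about


def toy_predict(tokens: Tokens) -> List[Prediction]:
    """Stand-in model: predicts token id == position and is confident only
    inside the block that contains the first masked position."""
    first_masked = next((i for i, t in enumerate(tokens) if t is None), len(tokens))
    lo = (first_masked // BLOCK) * BLOCK
    return [(i, 0.95 if lo <= i < lo + BLOCK else 0.5) for i in range(len(tokens))]


if __name__ == "__main__":
    decoded, n_calls = slow_fast_decode(8, toy_predict)
    print(decoded, n_calls)
```

With this toy model, an 8-token sequence is decoded in 6 model calls instead of the 8 that a one-token-per-step schedule would need. The saving comes entirely from the fast phase filling the rest of each stable span in a single call.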
Installation
# 1. Clone
git clone https://github.com/LiangrunFlora/Slow-Fast-Sampling.git
cd Slow-Fast-Sampling
# 2. Env (Python ≥ 3.10) & Deps
bash install.sh
Quick Start
# GSM8K with LLaDA-8B
bash scripts/run_llada_gsm8k_base.sh
# GPQA with LLaDA-8B
bash scripts/run_llada_gpqa_base.sh
# BBH with Dream-7B
bash scripts/run_dream_bbh_base.sh
Contact
Created and maintained by Qingyan Wei (liangrun@csu.edu.cn). Feel free to open an issue or drop me an email. PRs are welcome!
Acknowledgements
This project stands on the shoulders of LLaDA, Dream, dLLM-Cache and the lm-evaluation-harness. Huge thanks to these amazing communities for paving the way.
Citation
If you find this work useful, please cite our paper:
@article{wei2025accelerating,
title={Accelerating Diffusion Large Language Models with SlowFast Sampling: The Three Golden Principles},
author={Wei, Qingyan and Zhang, Yaojie and Liu, Zhiyuan and Liu, Dongrui and Zhang, Linfeng},
journal={arXiv preprint arXiv:2506.10848},
year={2025}
}