Accelerating Diffusion Large Language Models with SlowFast Sampling: The Three Golden Principles

July 18, 2025

Official PyTorch implementation of the paper "Accelerating Diffusion Large Language Models with SlowFast Sampling: The Three Golden Principles" (Slow Fast Sampling).

The Three Golden Principles: Certainty · Convergence · Positional

Fig. 1 – Throughput and Accuracy Comparison on GPQA (8-shot, Length=1024) with LLaDA and Our Proposed Methods.

✨ Key Highlights

| 💗💗💗 | What makes Slow Fast Sampling special? |
| --- | --- |
| Three Golden Principles 👑 | **Certainty, Convergence, Positional** guide exactly when and where to decode. |
| Two-Stage Dance 🐢→⚡ | A cautious Slow phase finds a stable span, then the Fast phase parallel-decodes it in one swoop. |
| Plug-and-Play 🔌 | Drop-in sampler for masked-diffusion LLMs such as LLaDA-8B and Dream-7B. |
| Crazy Speed-ups ⚡ | 15.6× faster than vanilla diffusion; 34.2× with `dLLM-Cache`, with minimal accuracy loss. |
| Outruns ARMs 🏃 | Beats LLaMA-3 8B in throughput while matching accuracy (Table 4, p. 9). |
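
To make the two-stage idea in the highlights above more concrete, here is a toy sketch in plain Python. It is not the code from this repository: the function names `find_stable_span` and `fast_decode`, the confidence threshold, and the "unchanged over the last few steps" test are all illustrative assumptions about how the three principles might be combined. The slow phase is represented by a short history of per-position predictions, and the fast phase fills every masked position inside the selected span at once.

```python
from typing import List, Optional, Tuple


def find_stable_span(
    pred_history: List[List[int]],
    conf: List[float],
    conf_threshold: float = 0.9,   # assumed value, not taken from the paper
    min_stable_steps: int = 2,     # assumed value, not taken from the paper
) -> Optional[Tuple[int, int]]:
    """Return the longest contiguous [start, end) run of stable positions, or None."""
    if len(pred_history) < min_stable_steps:
        return None  # not enough slow steps observed yet
    recent = pred_history[-min_stable_steps:]
    best: Optional[Tuple[int, int]] = None
    start: Optional[int] = None
    # Iterate one past the end so a run touching the last position is closed.
    for i in range(len(conf) + 1):
        stable = (
            i < len(conf)
            and conf[i] >= conf_threshold                         # certainty
            and all(step[i] == recent[0][i] for step in recent)   # convergence
        )
        if stable and start is None:
            start = i
        elif not stable and start is not None:
            # Positional: only contiguous runs are candidates; keep the longest.
            if best is None or i - start > best[1] - best[0]:
                best = (start, i)
            start = None
    return best


def fast_decode(
    tokens: List[int], preds: List[int], span: Tuple[int, int], mask_id: int
) -> List[int]:
    """Fill every still-masked position inside the span in a single step."""
    start, end = span
    out = list(tokens)
    for i in range(start, end):
        if out[i] == mask_id:
            out[i] = preds[i]
    return out


MASK = -1  # hypothetical mask token id
tokens = [MASK] * 6
history = [[5, 7, 9, 2, 4, 1], [5, 7, 9, 3, 4, 8]]  # predictions from two slow steps
confidence = [0.95, 0.97, 0.92, 0.99, 0.40, 0.91]

span = find_stable_span(history, confidence)
decoded = fast_decode(tokens, history[-1], span, MASK)
print(span, decoded)
```

In this toy input, positions 0 to 2 are both confident and unchanged across the two slow steps, so they form the span that the fast step fills; position 3 is confident but still changing, and position 4 is not confident, so both stay masked.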

🚀 Pipeline at a Glance

Fig. 2 – Overview of the Slow Fast Sampling Pipeline: From Exploratory to Accelerated Decoding.

🛠️ Installation

# 1. Clone
git clone https://github.com/LiangrunFlora/Slow-Fast-Sampling.git
cd Slow-Fast-Sampling

# 2. Env (Python ≥ 3.10) & Deps
bash install.sh
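
Since the environment needs Python ≥ 3.10, a quick interpreter check before running `install.sh` can save a failed install. The helper below is a small sketch and not part of this repository; the name `python_is_supported` is hypothetical.

```python
import sys

MIN_PYTHON = (3, 10)  # minimum version stated in the installation step


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON) -> bool:
    """Return True if the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= tuple(minimum)


if python_is_supported():
    print("Python version OK:", sys.version.split()[0])
else:
    print("Python >= 3.10 is required, found:", sys.version.split()[0])
```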

📘 Quick Start

# GSM8K with LLaDA-8B
bash scripts/run_llada_gsm8k_base.sh

# GPQA with LLaDA-8B
bash scripts/run_llada_gpqa_base.sh

# BBH with Dream-7B
bash scripts/run_dream_bbh_base.sh
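
To run the three benchmarks above back to back, a thin Python wrapper around the same scripts is one option. This is a convenience sketch, not something shipped with the repository: the names `SCRIPTS`, `build_command`, and `run_all` are hypothetical, and only the script paths come from the commands above. It defaults to a dry run that prints the commands. Call it from the repository root so the relative paths resolve.

```python
import subprocess
from typing import Dict, Iterable, List, Optional

# Script paths copied from the Quick Start commands; the keys are made-up labels.
SCRIPTS: Dict[str, str] = {
    "gsm8k-llada": "scripts/run_llada_gsm8k_base.sh",
    "gpqa-llada": "scripts/run_llada_gpqa_base.sh",
    "bbh-dream": "scripts/run_dream_bbh_base.sh",
}


def build_command(name: str) -> List[str]:
    """Build the argv list for one benchmark label."""
    return ["bash", SCRIPTS[name]]


def run_all(names: Optional[Iterable[str]] = None, dry_run: bool = True) -> List[List[str]]:
    """Print (dry run) or execute each benchmark script in order."""
    commands = [build_command(n) for n in (names or SCRIPTS)]
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops at the first script that exits with an error.
            subprocess.run(cmd, check=True)
    return commands


commands = run_all()  # dry run: only prints the three commands
```

Passing `dry_run=False` executes the scripts sequentially, and `names=["gpqa-llada"]` restricts the run to a single benchmark.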

📮 Contact

Created and maintained by Qingyan Wei (liangrun@csu.edu.cn). Feel free to open an issue or drop me an email. PRs are welcome!

🎉 Acknowledgements

This project stands on the shoulders of LLaDA, Dream, dLLM-Cache and the lm-evaluation-harness. Huge thanks to these amazing communities for paving the way.

📌 Citation

If you find this work useful, please cite our paper:

@article{wei2025accelerating,
  title={Accelerating Diffusion Large Language Models with SlowFast Sampling: The Three Golden Principles},
  author={Wei, Qingyan and Zhang, Yaojie and Liu, Zhiyuan and Liu, Dongrui and Zhang, Linfeng},
  journal={arXiv preprint arXiv:2506.10848},
  year={2025}
}