| Accelerating Large Language Model Decoding with Speculative Sampling | arxiv 2023 |  |
| Masked diffusion models are secretly time-agnostic masked models and exploit inaccurate categorical sampling | ICLR 2025 | - |
| A continuous time framework for discrete denoising models | NeurIPS 2022 |  |
| Discrete diffusion modeling by estimating the ratios of the data distribution | ICML 2024 |  |
| Simplified and generalized masked diffusion for discrete data | NeurIPS 2024 |  |
| Seed Diffusion | arxiv 2025 | - |
| Target concrete score matching: A holistic framework for discrete diffusion | ICML 2025 | - |
| Discrete diffusion modeling by estimating the ratios of the data distribution | ICML 2024 |  |
| Score-based continuous-time discrete diffusion models | ICLR 2023 | - |
| Fast-dLLM: Training-free acceleration of diffusion LLM by enabling KV cache and parallel decoding | arxiv 2025 |  |
| Large language diffusion models | ICLR 2025 |  |
| Beyond autoregression: Discrete diffusion for complex reasoning and planning | ICLR 2025 |  |
| A reparameterized discrete diffusion model for text generation | COLM 2024 |  |
| Train for the Worst, Plan for the Best: Understanding Token Ordering in Masked Diffusions | ICML 2025 | - |
| Accelerated Sampling from Masked Diffusion Models via Entropy Bounded Unmasking | arxiv 2025 | - |
| Accelerating Diffusion Large Language Models with SlowFast: The Three Golden Principles | arxiv 2025 |  |
| A continuous time framework for discrete denoising models | NeurIPS 2022 |  |
| Remasking discrete diffusion models with inference-time scaling | ICLR 2025 |  |
| Simplified and generalized masked diffusion for discrete data | NeurIPS 2024 |  |
| Path planning for masked diffusion model sampling | arxiv 2025 |  |
| Think while you generate: Discrete diffusion with planned denoising | ICLR 2025 |  |
| Accelerating Diffusion LLMs via Adaptive Parallel Decoding | arxiv 2025 | - |
| Reviving any-subset autoregressive models with principled parallel sampling and speculative decoding | arxiv 2025 |  |
| dKV-Cache: The cache for diffusion language models | arxiv 2025 |  |
| Accelerating diffusion language model inference via efficient KV caching and guided diffusion | arxiv 2025 | - |
| Esoteric Language Models | arxiv 2025 |  |
| Beyond Autoregression: Fast LLMs via Self-Distillation Through Time | ICLR 2025 | - |
| CLLMs: Consistency large language models | ICML 2024 |  |
| The diffusion duality | ICML 2025 |  |
| d1: Scaling reasoning in diffusion large language models via reinforcement learning | arxiv 2025 |  |
| LLaDA 1.5: Variance-Reduced Preference Optimization for Large Language Diffusion Models | AAAI 2025 |  |
| DiffuCoder: Understanding and Improving Masked Diffusion Models for Code Generation | arxiv 2025 |  |
| Scaling diffusion language models via adaptation from autoregressive models | ICLR 2025 |  |
| Dream 7B | arxiv 2025 |  |
| DIFFPO: Diffusion-styled Preference Optimization for Efficient Inference-Time Alignment of Large Language Models | ACL 2025 |  |
| Wide-In, Narrow-Out: Revokable Decoding for Efficient and Effective DLLMs | arxiv 2025 |  |
| Dream-Coder 7B: An Open Diffusion Language Model for Code | arxiv 2025 |  |
| SPG: Sandwiched policy gradient for masked diffusion language models | arxiv 2025 |  |
| Revolutionizing reinforcement learning framework for diffusion large language models | ICLR 2026 |  |
| Diffusion LLMs can do faster-than-AR inference via discrete diffusion forcing | ICLR 2026 |  |
| WeDLM: Reconciling Diffusion Language Models with Standard Causal Attention for Fast Inference | arxiv 2025 |  |
| d-TreeRPO: Towards More Reliable Policy Optimization for Diffusion Language Models | arxiv 2025 |  |
| d2: Improved Techniques for Training Reasoning Diffusion Language Models | arxiv 2025 | - |
| wd1: Weighted Policy Optimization for Reasoning in Diffusion Language Models | arxiv 2025 |  |
| Step-Aware Policy Optimization for Reasoning in Diffusion Large Language Models | arxiv 2025 |  |
| Improving reasoning for diffusion language models via group diffusion policy optimization | arxiv 2025 |  |
| The Flexibility Trap: Why Arbitrary Order Limits Reasoning Potential in Diffusion Language Models | arxiv 2026 |  |
| Principled RL for diffusion LLMs emerges from a sequence-level perspective | ICLR 2026 |  |
| Diffusion Language Models For Code Infilling Beyond Fixed-size Canvas | arxiv 2026 |  |
| SDAR: A Synergistic Diffusion-AutoRegression Paradigm for Scalable Sequence Generation | arxiv 2025 |  |
| LoPA: Scaling dLLM inference via lookahead parallel decoding | arxiv 2025 |  |
| Fast-dLLM v2: Efficient Block-Diffusion LLM | ICLR 2026 |  |
| d3LLM: Ultra-Fast Diffusion LLM using Pseudo-Trajectory Distillation | arxiv 2026 |  |
| dParallel: Learnable Parallel Decoding for dLLMs | ICLR 2026 |  |
| Diffusion language models know the answer before decoding | ICLR 2026 |  |
| CreditDecoding: Accelerating parallel decoding in diffusion large language models with trace credits | arxiv 2025 | - |
| Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models | ICLR 2025 |  |
| Set Block Decoding is a Language Model Inference Accelerator | arxiv 2025 | - |
| LLaDA-MoE: A Sparse MoE Diffusion Language Model | arxiv 2025 |  |
| dInfer: An Efficient Inference Framework for Diffusion Language Models | arxiv 2025 |  |
| ParallelBench: Understanding the Trade-offs of Parallel Decoding in Diffusion LLMs | ICLR 2026 |  |