dLLM-Var: Diffusion LLM with Native Variable Generation Lengths: Let [EOS] Lead the Way

November 3, 2025 · View on GitHub

dLLM-Var: Diffusion LLM with Native Variable Generation Lengths: Let [EOS] Lead the Way

Yicun Yang^{1 *}, Cong Wang¹, Shaobo Wang¹, Zichen Wen¹, Biqing Qi², Hanlin Xu³, Linfeng Zhang^{1 †}

¹Shanghai Jiao Tong University, ²Shanghai AI Lab, ³Huawei

Key Features

Native Variable-Length Generation: Guided by the [EOS] token, it supports arbitrary length output without fixed hyperparameters.
High Parallelism: Inherits the bidirectional attention of dLLM, supporting blockwise diffusion inference.
KV Cache Compatible: Seamlessly reuses the KV cache, avoiding complex designs and improving efficiency.

Figure 1: The evolution of probabilistic modeling paradigms for text generation. From autoregressive (AR) to diffusion-based methods. dLLM-Var achieves variable-length generation while maintaining high parallelism.

Installation

Requirements

Python 3.12
PyTorch 2.5+ (H-series GPUs support FP8 mixed precision)

Quick Installation

# Clone the repository
git clone https://github.com/maomaocun/dLLM-Var.git
cd dLLM-Var
bash install.sh

Quick Start

Demo

demo_dLLM-var.py

Evaluation

cd ./evaluation
bash run_batch.sh

You need to change the environment variables

Prepare Dataset

cd datset
python transfer_text2token.py --input_dir "path/to/your/input/jsonl/folder" --output_file "path/to/your/output/tokenized.jsonl" --tokenizer_model "path/to/your/LLaDA-8B-Base"

For detailed dataset format, see ./sft_training/data/dataset.py.

Training Script

Training uses DeepSpeed ZeRO-2 and supports multi-GPU. Example command:

cd ./sft_training
bash run_gpus_fp8.sh

For detailed training configuration, see ./sft_training/config/sft/default_config.yaml.

Citation

If you find this work useful, please cite:

@misc{yang2025diffusionllmnativevariable,
      title={Diffusion LLM with Native Variable Generation Lengths: Let [EOS] Lead the Way}, 
      author={Yicun Yang and Cong Wang and Shaobo Wang and Zichen Wen and Biqing Qi and Hanlin Xu and Linfeng Zhang},
      year={2025},
      eprint={2510.24605},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2510.24605}, 
}

License

MIT License.

Contact

Project Lead: yangyicun187@gmail.com
Corresponding Author: zhanglinfeng@sjtu.edu.cn