WINO-DLLM
May 29, 2026
Official implementation of Wide-In, Narrow-Out: Revokable Decoding for Efficient and Effective DLLMs.
This repository provides scripts and instructions to evaluate WINO on LLaDA and MMaDA.
Related Projects
- We are continuing to improve efficient DLLM inference with ReMix: Rejection Mixing: Fast Semantic Propagation of Mask Tokens for Efficient DLLM Inference, accepted to CVPR 2026. ReMix is a training-free decoding method that further explores fast semantic propagation for mask tokens, and it provides unified evaluation scripts for both LLaDA and MMaDA.
- We further extend WINO with WINO+: Diffusion LLMs are Their Own Efficiency Teachers, a journal extension that learns from offline WINO trajectories to further improve DLLM inference efficiency. The WINO+ workflow and released model weights are available at WINO+.
Evaluation of WINO on LLaDA
- Installation
We recommend using uv for dependency and virtual environment management.
pipx install uv # or pip install uv
cd LLaDA
uv venv --python 3.11 dev
source dev/bin/activate
uv pip install -r requirements.txt
- Prepare Model and Datasets
Before running inference or evaluation, please download the following models and datasets from Hugging Face into the specified local directories (e.g., ./LLaDA/models/ and ./LLaDA/data/).
You may use either huggingface-cli or the Python datasets library to complete the download.
| Model Name | Hugging Face Repo | Local Path |
|---|---|---|
| LLaDA-8B-Instruct | GSAI-ML/LLaDA-8B-Instruct | ./LLaDA/models/LLaDA-8B-Instruct/ |
| Dataset Name | Hugging Face Repo | Local Path |
|---|---|---|
| GSM8K | openai/gsm8k | ./LLaDA/data/gsm8k/ |
| MATH-500 | HuggingFaceH4/MATH-500 | ./LLaDA/data/math500/ |
| HumanEval | openai/openai_humaneval | ./LLaDA/data/humaneval/ |
| ai2_arc | allenai/ai2_arc | ./LLaDA/data/ai2_arc/ |
Datasets not listed above are already included in the ./LLaDA/data/ directory.
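The download step can also be scripted. The sketch below copies the repo IDs and local paths from the two tables and fetches anything that is missing with `snapshot_download` from the `huggingface_hub` package. The helper names `missing_downloads` and `download_all` are hypothetical, and whether the evaluation code expects the raw repository layout that `snapshot_download` produces is an assumption to verify against the loaders in this repository.

```python
from pathlib import Path

# (repo_id, repo_type, local_dir) triples copied from the tables above.
DOWNLOADS = [
    ("GSAI-ML/LLaDA-8B-Instruct", "model", "./LLaDA/models/LLaDA-8B-Instruct/"),
    ("openai/gsm8k", "dataset", "./LLaDA/data/gsm8k/"),
    ("HuggingFaceH4/MATH-500", "dataset", "./LLaDA/data/math500/"),
    ("openai/openai_humaneval", "dataset", "./LLaDA/data/humaneval/"),
    ("allenai/ai2_arc", "dataset", "./LLaDA/data/ai2_arc/"),
]


def missing_downloads(root="."):
    """Return the entries whose local directory is absent or empty."""
    missing = []
    for repo_id, repo_type, local_dir in DOWNLOADS:
        target = Path(root) / local_dir
        if not target.is_dir() or not any(target.iterdir()):
            missing.append((repo_id, repo_type, str(target)))
    return missing


def download_all(root="."):
    """Fetch every missing repo into its expected directory (hypothetical helper)."""
    # Imported lazily so missing_downloads() works without the package installed.
    from huggingface_hub import snapshot_download

    for repo_id, repo_type, target in missing_downloads(root):
        snapshot_download(repo_id=repo_id, repo_type=repo_type, local_dir=target)
```

Calling `download_all()` from the repository root would skip any directory that already contains files, so it can be re-run after an interrupted download.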
- Quick Demo
Please make sure to set the correct model path in generate.py.
python generate.py
- Evaluation
To evaluate WINO on a benchmark such as GSM8K, first configure the model and data paths in the corresponding config file, then run:
CUDA_VISIBLE_DEVICES=0 python eval.py --config ./configs/gsm8k.yaml
All available config files can be found in the ./LLaDA/configs/ directory.
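To run every benchmark in one pass, the single-config command can be wrapped in a small loop. This is a minimal sketch, assuming it is launched from the LLaDA directory, that all config files use the `.yaml` extension, and that each one works with `eval.py` unchanged. The function names `build_eval_commands` and `run_all` are hypothetical.

```python
import os
import subprocess
import sys
from pathlib import Path


def build_eval_commands(config_dir="./configs"):
    """Return one eval.py argument list per YAML file, sorted by file name."""
    configs = sorted(Path(config_dir).glob("*.yaml"))
    return [[sys.executable, "eval.py", "--config", str(cfg)] for cfg in configs]


def run_all(config_dir="./configs", gpu="0"):
    """Run each benchmark sequentially on one GPU, stopping at the first failure."""
    # Mirror the CUDA_VISIBLE_DEVICES=0 prefix used in the command above.
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=gpu)
    for cmd in build_eval_commands(config_dir):
        print("Running:", " ".join(cmd))
        subprocess.run(cmd, env=env, check=True)
```

Running the benchmarks sequentially keeps GPU memory use the same as a single run; `check=True` stops the loop as soon as one evaluation exits with an error.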
Evaluation of WINO on MMaDA
We evaluate WINO using lmms-eval.
To run the evaluation, follow these steps:
- Install MMaDA dependencies
cd MMaDA
# pipx install uv
uv venv --python 3.11 dev
source dev/bin/activate
uv pip install -r requirements.txt
A quick inference demo can be performed after this step.
python generate_demo.py
- Install lmms-eval dependencies
cd lmms_eval
uv pip install -e .
- Set the required environment variables
Certain tasks need the following environment variables in order to run.
export OPENAI_API_KEY="<YOUR_API_KEY>"
export HF_HOME="<Path to HF cache>"
export HF_TOKEN="<YOUR_HF_TOKEN>"
export HF_HUB_ENABLE_HF_TRANSFER="1"
Once all dependencies are installed and your API key is set, you can run the evaluation script directly:
cd ..
# Evaluate MMaDA on the six reported multimodal benchmarks
bash scripts/eval_baseline.sh
# Evaluate WINO on the six reported multimodal benchmarks
bash scripts/eval_wino.sh
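Before launching either script, a quick preflight check can catch an environment variable that is missing or was left as a template placeholder. The sketch below is illustrative only: the helper `unset_variables` is hypothetical, and treating all four variables as mandatory is an assumption, since the steps above say only that certain tasks need them.

```python
import os

# Variable names taken from the export lines in the setup step above.
REQUIRED_VARS = ("OPENAI_API_KEY", "HF_HOME", "HF_TOKEN", "HF_HUB_ENABLE_HF_TRANSFER")


def unset_variables(env=None):
    """Return required variables that are missing, empty, or still placeholders."""
    env = os.environ if env is None else env
    problems = []
    for name in REQUIRED_VARS:
        value = env.get(name, "").strip()
        # A value such as "<YOUR_API_KEY>" means the template was never filled in.
        if not value or (value.startswith("<") and value.endswith(">")):
            problems.append(name)
    return problems
```

If `unset_variables()` returns a non-empty list, the named variables should be exported before the evaluation scripts are run.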