WINO-DLLM

May 29, 2026

Official implementation of Wide-In, Narrow-Out: Revokable Decoding for Efficient and Effective DLLMs.

This repository provides scripts and instructions to evaluate WINO on LLaDA and MMaDA.

We continue to improve DLLM inference efficiency with ReMix: Rejection Mixing: Fast Semantic Propagation of Mask Tokens for Efficient DLLM Inference, accepted to CVPR 2026. ReMix is a training-free decoding method that explores fast semantic propagation for mask tokens, and it provides unified evaluation scripts for both LLaDA and MMaDA.
We also extend WINO with WINO+: Diffusion LLMs are Their Own Efficiency Teachers, a journal extension that learns from offline WINO trajectories to further improve DLLM inference efficiency. The WINO+ workflow and released model weights are available at WINO+.

Evaluation of WINO on LLaDA

Installation

We recommend using uv for dependency and virtual environment management.

pipx install uv # or pip install uv
cd LLaDA
uv venv --python 3.11 dev
source dev/bin/activate
uv pip install -r requirements.txt

Prepare Model and Datasets

Before running inference or evaluation, please download the following models and datasets from Hugging Face into the specified local directories (e.g., ./LLaDA/models/ and ./LLaDA/data/).

You may use either huggingface-cli or the Python datasets library to complete the download.

Model Name	Hugging Face Repo	Local Path
LLaDA-8B-Instruct	`GSAI-ML/LLaDA-8B-Instruct`	`./LLaDA/models/LLaDA-8B-Instruct/`

Dataset Name	Hugging Face Repo	Local Path
GSM8K	`openai/gsm8k`	`./LLaDA/data/gsm8k/`
MATH-500	`HuggingFaceH4/MATH-500`	`./LLaDA/data/math500/`
HumanEval	`openai/openai_humaneval`	`./LLaDA/data/humaneval/`
ai2_arc	`allenai/ai2_arc`	`./LLaDA/data/ai2_arc/`

Datasets not listed above are already included in the ./LLaDA/data/ directory.
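The downloads listed in the tables above can also be scripted. The following is a minimal sketch, not part of the repository: it assumes the `huggingface_hub` package is installed, and the names `ASSETS`, `download_plan`, and `download_all` are hypothetical helpers introduced here for illustration. It fetches raw repository snapshots; whether the evaluation scripts expect exactly this on-disk layout is an assumption, so check the paths in the config files before relying on it.

```python
from pathlib import Path

# (repo_id, repo_type, sub-path under the LLaDA directory), copied from the tables above.
ASSETS = [
    ("GSAI-ML/LLaDA-8B-Instruct", "model", "models/LLaDA-8B-Instruct"),
    ("openai/gsm8k", "dataset", "data/gsm8k"),
    ("HuggingFaceH4/MATH-500", "dataset", "data/math500"),
    ("openai/openai_humaneval", "dataset", "data/humaneval"),
    ("allenai/ai2_arc", "dataset", "data/ai2_arc"),
]


def download_plan(root="./LLaDA"):
    """Return (repo_id, repo_type, local_dir) triples without touching the network."""
    return [(repo, kind, str(Path(root) / sub)) for repo, kind, sub in ASSETS]


def download_all(root="./LLaDA"):
    """Download every asset in the plan into its local directory."""
    # Imported lazily so the plan can be inspected without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    for repo, kind, local_dir in download_plan(root):
        snapshot_download(repo_id=repo, repo_type=kind, local_dir=local_dir)
```

Calling `download_plan()` lets you review the target directories first; `download_all()` performs the actual downloads.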

Quick Demo

Please make sure to set the correct model path in generate.py.

python generate.py
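Before running the demo, it can help to confirm that the model path you configured actually points at a downloaded checkpoint. A minimal sketch, assuming only that a Hugging Face model directory contains a `config.json` file; `looks_like_model_dir` is a hypothetical helper, not something provided by the repository:

```python
from pathlib import Path


def looks_like_model_dir(path):
    """Return True if `path` is a directory that contains a config.json file."""
    p = Path(path)
    return p.is_dir() and (p / "config.json").is_file()
```

For example, `looks_like_model_dir("./models/LLaDA-8B-Instruct")` should return `True` when run from the LLaDA directory after the download has completed.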

Evaluation

To evaluate WINO on a benchmark such as GSM8K, first configure the model and data paths in the corresponding config file, then run:

CUDA_VISIBLE_DEVICES=0 python eval.py --config ./configs/gsm8k.yaml

All available config files can be found in the ./LLaDA/configs/ directory.
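To run every benchmark in turn, the single-config command above can be wrapped in a small loop. This is a sketch under stated assumptions: it assumes every config file in the directory uses the `.yaml` extension and that it is run from the LLaDA directory; `eval_commands` and `run_all` are hypothetical helpers, and only the `eval.py --config` invocation comes from this README.

```python
import os
import subprocess
import sys
from pathlib import Path


def eval_commands(config_dir="./configs"):
    """Build one `python eval.py --config <file>` command per YAML config, sorted by name."""
    configs = sorted(Path(config_dir).glob("*.yaml"))
    return [[sys.executable, "eval.py", "--config", str(cfg)] for cfg in configs]


def run_all(config_dir="./configs", gpu="0"):
    """Run the evaluations one after another on a single GPU; stop at the first failure."""
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=gpu)
    for cmd in eval_commands(config_dir):
        print("running:", " ".join(cmd))
        subprocess.run(cmd, env=env, check=True)
```

Running the benchmarks sequentially keeps GPU memory usage to one evaluation at a time; `check=True` aborts the loop as soon as one evaluation exits with an error.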

Evaluation of WINO on MMaDA

We evaluate WINO using lmms-eval.

To run the evaluation, follow these steps:

Install MMaDA dependencies

cd MMaDA
# pipx install uv
uv venv --python 3.11 dev
source dev/bin/activate
uv pip install -r requirements.txt

A quick inference demo can be performed after this step.

python generate_demo.py

Install lmms-eval dependencies

cd lmms_eval
uv pip install -e .

Set the necessary environment variables

Some tasks require the following environment variables.

export OPENAI_API_KEY="<YOUR_API_KEY>"
export HF_HOME="<Path to HF cache>"
export HF_TOKEN="<YOUR_HF_TOKEN>"
export HF_HUB_ENABLE_HF_TRANSFER="1"
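A quick check that none of these variables is unset or still holds a placeholder can save a failed evaluation run. A minimal sketch; `REQUIRED_VARS` and `missing_env_vars` are hypothetical names introduced here, and treating any value that starts with `<` as an unfilled placeholder is an assumption based on the template above:

```python
import os

# The four variables exported in the step above.
REQUIRED_VARS = ("OPENAI_API_KEY", "HF_HOME", "HF_TOKEN", "HF_HUB_ENABLE_HF_TRANSFER")


def missing_env_vars(env=None):
    """Return the required variables that are unset, empty, or still a <placeholder>."""
    env = os.environ if env is None else env
    return [
        name
        for name in REQUIRED_VARS
        if not env.get(name) or env[name].startswith("<")
    ]
```

An empty list from `missing_env_vars()` means all four variables are set in the current environment.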

Once all dependencies are installed and your API key is set, you can run the evaluation script directly:

cd ..
# Evaluate MMaDA on the six reported multimodal benchmarks
bash scripts/eval_baseline.sh
# Evaluate WINO on the six reported multimodal benchmarks
bash scripts/eval_wino.sh

Related Projects
