BiGain: Unified Token Compression for Joint Generation and Classification

March 19, 2026 · View on GitHub

Official implementation of the CVPR 2026 paper BiGain: Unified Token Compression for Joint Generation and Classification.

Framework of our BiGain_TM method. A Laplacian filter is applied to hidden-state tokens to compute local frequency scores. In each spatial stride, the lowest-scoring token is selected as the destination token, while the remaining tokens form the source set. Destination and source tokens are then gathered globally, and a bipartite matching selects the top source-destination pairs.
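The pipeline above can be sketched in a few lines of PyTorch. This is our illustrative reading of the figure caption, not the repository's actual implementation: the function name, the 4-neighbour Laplacian kernel, cosine similarity as the matching score, and the `ratio` parameter are all assumptions.

```python
import torch
import torch.nn.functional as F

def bigain_tm_pairs(x, h, w, stride=2, ratio=0.5):
    """Sketch of BiGain_TM pair selection (hypothetical, not the official code).

    x: (B, N, C) hidden-state tokens laid out on an h x w grid (N == h * w).
    Returns (src_idx, dst_idx): flat indices of the top source->destination pairs.
    """
    B, N, C = x.shape
    assert N == h * w and h % stride == 0 and w % stride == 0

    # 1) Local frequency score: magnitude of a Laplacian filter response,
    #    averaged over channels.
    lap = torch.tensor([[0., 1., 0.], [1., -4., 1.], [0., 1., 0.]],
                       device=x.device).view(1, 1, 3, 3)
    grid = x.transpose(1, 2).reshape(B * C, 1, h, w)
    score = F.conv2d(grid, lap, padding=1).abs().reshape(B, C, N).mean(1)  # (B, N)

    # 2) Group flat token indices by stride x stride window; the lowest-scoring
    #    token in each window is the destination, the rest are sources.
    idx = torch.arange(N, device=x.device).view(h // stride, stride, w // stride, stride)
    idx = idx.permute(0, 2, 1, 3).reshape(-1, stride * stride)        # (n_win, s^2)
    win_scores = score[:, idx]                                        # (B, n_win, s^2)
    dst_pos = win_scores.argmin(-1, keepdim=True)                     # (B, n_win, 1)
    idx_b = idx.unsqueeze(0).expand(B, -1, -1)
    dst_idx = idx_b.gather(2, dst_pos).squeeze(-1)                    # (B, n_win)
    src_mask = torch.ones_like(win_scores, dtype=torch.bool).scatter_(2, dst_pos, False)
    src_idx = idx_b[src_mask].view(B, -1)                             # (B, n_win*(s^2-1))

    # 3) Global bipartite matching: each source picks its most similar
    #    destination; keep the top `ratio` fraction of pairs by similarity.
    xn = F.normalize(x, dim=-1)
    dst = xn.gather(1, dst_idx.unsqueeze(-1).expand(-1, -1, C))
    src = xn.gather(1, src_idx.unsqueeze(-1).expand(-1, -1, C))
    sim, best_dst = (src @ dst.transpose(1, 2)).max(-1)               # (B, n_src)
    top = sim.topk(int(sim.shape[1] * ratio), dim=1).indices          # (B, r)
    return src_idx.gather(1, top), dst_idx.gather(1, best_dst.gather(1, top))
```

The returned pairs would then be merged (e.g. source features averaged into their destinations) before the attention block, as in ToMe-style token merging.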

Environment Setup

  1. Create conda environment:
conda create -n diffusion-exp python=3.9
conda activate diffusion-exp
  2. Install dependencies:
pip install -r requirements.txt
  3. Install ToMe package:
cd tomesd && python setup.py build develop && cd ..
  4. Set dataset path:
export DATASET_ROOT=/path/to/your/datasets
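Once `DATASET_ROOT` is exported, the experiment code can resolve dataset folders from it. A minimal sketch of how such a lookup typically works (the helper name `dataset_dir` is ours, not from this repository):

```python
import os
from pathlib import Path

def dataset_dir(name: str) -> Path:
    """Resolve a dataset folder under DATASET_ROOT (hypothetical helper)."""
    root = os.environ.get("DATASET_ROOT")
    if root is None:
        raise RuntimeError(
            "DATASET_ROOT is not set; run `export DATASET_ROOT=/path/to/your/datasets`")
    return Path(root) / name
```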

Running Experiments

All experiments are configured through shell scripts in the scripts/ directory.

Classification Experiments

# Stable Diffusion zero-shot classification
bash scripts/sd_classification.sh

# DiT (Diffusion Transformer) classification
bash scripts/dit.sh

Generation Experiments

# Stable Diffusion image generation
bash scripts/sd_generation.sh

# DiT image generation
bash scripts/dit_generation.sh

Configuration

To run an experiment:

  1. Open the desired script
  2. Uncomment one of the example configurations
  3. Adjust parameters if needed
  4. Run the script

Available Methods

  • Baseline: No acceleration
  • ToMe: Original Token Merging
  • BiGain_TM (LGTM): Our scoring-based token merging method
  • ToDo: Token downsampling
  • BiGain_TD (IEKVD): Our linear blend token downsampling method
  • SiTo: Similarity-based token pruning
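For intuition on the downsampling family above (ToDo, BiGain_TD): instead of merging individual token pairs, these methods reduce the attention keys/values to a coarser spatial grid, with each coarse token a linear blend of the fine tokens it covers. A minimal sketch assuming bilinear weighting over the token grid — this is our illustration of the general idea, not the repository's BiGain_TD implementation:

```python
import torch
import torch.nn.functional as F

def blend_downsample(x, h, w, factor=2):
    """Downsample (B, N, C) tokens on an h x w grid by a bilinear linear blend.

    Hypothetical helper: each output token is a linear combination of the
    neighbouring fine tokens, as in ToDo-style key/value downsampling.
    """
    B, N, C = x.shape
    assert N == h * w
    grid = x.transpose(1, 2).reshape(B, C, h, w)
    coarse = F.interpolate(grid, size=(h // factor, w // factor),
                           mode="bilinear", align_corners=False)
    return coarse.flatten(2).transpose(1, 2)  # (B, N // factor**2, C)
```

Because only keys and values are downsampled, the query count — and hence the output token count — is unchanged, which is what makes this family attractive at high resolutions.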

Code Attribution

This implementation builds upon code from:

  • Zero-shot classification framework: Li et al., "Your Diffusion Model is Secretly a Zero-Shot Classifier", ICCV 2023 [Paper] [Code]
  • ToMe: Bolya & Hoffman, "Token Merging for Fast Stable Diffusion", CVPR Workshops 2023 (MIT License) [Paper] [Code]
  • ToDo: Smith et al., "ToDo: Token Downsampling for Efficient Generation of High-Resolution Images", IJCAI 2024 [Paper] [Code]
  • SiTo: Zhang et al., "Training-Free and Hardware-Friendly Acceleration for Diffusion Models via Similarity-based Token Pruning", AAAI 2025 [Paper] [Code]