SenseFlow: Scaling Distribution Matching for Flow-based Text-to-Image Distillation
March 14, 2026
Xingtong Ge1,2, Xin Zhang3, Tongda Xu4, Yi Zhang3, Xinjie Zhang1, Yan Wang4, Jun Zhang1*
1The Hong Kong University of Science and Technology, 2SenseTime Research, 3Vivix AI, 4Institute for AI Industry Research, Tsinghua University
The Fourteenth International Conference on Learning Representations (ICLR), 2026
Abstract
Distribution Matching Distillation (DMD) has been successfully applied to text-to-image diffusion models such as Stable Diffusion (SD) 1.5. However, vanilla DMD suffers from convergence difficulties on large-scale flow-based text-to-image models such as SD 3.5 and FLUX. In this paper, we first analyze the issues that arise when applying vanilla DMD to large-scale models. Then, to overcome the scalability challenge, we propose implicit distribution alignment (IDA) to regularize the distance between the generator and the fake distribution. Furthermore, we propose intra-segment guidance (ISG) to relocate the timestep importance distribution from the teacher model. With IDA alone, DMD converges for SD 3.5; with both IDA and ISG, DMD converges for SD 3.5 and FLUX.1 dev. Together with other improvements such as scaled-up discriminator models, our final model, dubbed SenseFlow, achieves superior distillation performance for both diffusion-based text-to-image models such as SDXL and flow-matching models such as SD 3.5 Large and FLUX. The source code and model weights are now available.

TODO List
- Single-node training scripts
- Multi-node training scripts
- Inference scripts
- Open-source model weights
Model Weights
We have open-sourced model weights on Hugging Face for the community. All models are available at: domiso/SenseFlow.
SenseFlow-FLUX (4–8 step generation)
- Hugging Face: domiso/SenseFlow (see the SenseFlow-FLUX/ folder)
- Contents: DiT checkpoint (.safetensors), config.json
Quick Start:
- Download the base FLUX.1-dev checkpoint to Path/to/FLUX
- Download SenseFlow-FLUX from Hugging Face and replace the transformer folder:
  # Replace Path/to/FLUX/transformer with SenseFlow-FLUX folder
- Use the model with diffusers (see the Hugging Face model card for detailed usage examples)
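The folder swap in the Quick Start can also be scripted. The following is a minimal sketch, not part of this repository: the helper name `replace_transformer` and the `transformer.orig` backup folder are assumptions made for illustration, and it only relies on the folder contents listed above (a `.safetensors` checkpoint and `config.json`).

```python
import shutil
from pathlib import Path


def replace_transformer(flux_dir: str, senseflow_dir: str) -> Path:
    """Swap FLUX.1-dev's transformer folder for the SenseFlow-FLUX one.

    Hypothetical helper: the original folder is kept as `transformer.orig`
    (an assumed name) so the swap can be undone by hand.
    """
    flux_root = Path(flux_dir)
    source = Path(senseflow_dir)
    target = flux_root / "transformer"
    backup = flux_root / "transformer.orig"

    # The SenseFlow-FLUX folder is expected to ship a config.json.
    if not source.is_dir():
        raise FileNotFoundError(f"SenseFlow-FLUX folder not found: {source}")
    if not (source / "config.json").is_file():
        raise FileNotFoundError(f"config.json missing in {source}")

    # Move the original transformer aside instead of deleting it.
    if target.exists():
        if backup.exists():
            raise FileExistsError(f"Backup already exists: {backup}")
        target.rename(backup)

    shutil.copytree(source, target)
    return target
```

Calling `replace_transformer("Path/to/FLUX", "Path/to/SenseFlow-FLUX")` would leave a FLUX.1-dev directory whose `transformer` folder holds the distilled weights; the second path is wherever the Hugging Face folder was downloaded.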
SenseFlow SD 3.5 Large & Medium
We release SenseFlow SD 3.5 Large and SenseFlow SD 3.5 Medium distilled weights for community use. Both support few-step text-to-image generation.
- Hugging Face: domiso/SenseFlow
- Download the corresponding SD 3.5 Large/Medium folders and follow the model card for usage with diffusers or this repo's inference scripts.
Installation
We provide two methods to set up the environment: using conda with environment.yaml or using pip with requirements.txt.
Option 1: Using Conda (Recommended)
- Create a new conda environment from the provided environment.yaml:
  conda env create -f environment.yaml
- Activate the environment:
  conda activate senseflow
- Install the package in editable mode:
  pip install -e .
Option 2: Using Pip
- Create a new virtual environment (Python 3.10 is required):
  python3.10 -m venv senseflow_env
  source senseflow_env/bin/activate
- Install PyTorch with CUDA support first (compatible with CUDA 12.4):
  pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 --index-url https://download.pytorch.org/whl/cu124
- Install the remaining dependencies:
  pip install -r requirements.txt
- Install the package in editable mode:
  pip install -e .
Setup
Before training, you need to download the pretrained teacher models, prepare the dataset, and configure the paths in the corresponding config YAML file. All paths are managed in the paths section of each config file, so there is no need to edit Python source code.
Pretrained Models
SDXL
huggingface-cli download stabilityai/stable-diffusion-xl-base-1.0 --local-dir /path/to/stable-diffusion-xl-base-1.0
SD3.5 Medium
huggingface-cli download stabilityai/stable-diffusion-3.5-medium --local-dir /path/to/stable-diffusion-3.5-medium
SD3.5 Large
huggingface-cli download stabilityai/stable-diffusion-3.5-large --local-dir /path/to/stable-diffusion-3.5-large
FLUX
huggingface-cli download black-forest-labs/FLUX.1-dev --local-dir /path/to/FLUX.1-dev
After downloading FLUX.1-dev, create symlinks for the transformer without guidance embedding:
mkdir -p exp_flux/flux-wo-guidance-embed/transformer
cd exp_flux/flux-wo-guidance-embed/transformer
for file in /path/to/FLUX.1-dev/transformer/*; do
    filename=$(basename "$file")
    if [ "$filename" != "config.json" ]; then
        ln -s "$file" "$filename"
    fi
done
The config.json with guidance_embeds: false is already provided in exp_flux/flux-wo-guidance-embed/transformer/config.json.
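For setups where a shell loop is inconvenient, the same symlinking step can be sketched in Python. The function name `link_transformer_weights` is hypothetical and not part of this repository. Like the shell glob, it skips hidden files, and it leaves any file that already exists in the destination untouched, so the provided config.json is never overwritten.

```python
import os
from pathlib import Path


def link_transformer_weights(src_dir: str, dst_dir: str) -> list[str]:
    """Symlink every file in src_dir into dst_dir, except config.json.

    Returns the names of the links created by this call.
    """
    src = Path(src_dir).resolve()
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)

    created = []
    for item in sorted(src.iterdir()):
        # Mirror the shell glob (*), which does not match hidden files.
        if item.name.startswith("."):
            continue
        # Keep the repo's own config.json (guidance_embeds: false).
        if item.name == "config.json":
            continue
        link = dst / item.name
        # Do not touch anything that is already present.
        if link.is_symlink() or link.exists():
            continue
        os.symlink(item, link)
        created.append(item.name)
    return created
```

For example, `link_transformer_weights("/path/to/FLUX.1-dev/transformer", "exp_flux/flux-wo-guidance-embed/transformer")` covers the same files as the loop above. Running it a second time creates nothing new.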
Dataset Preparation
SDXL uses LMDB datasets from DMD2. Download the LMDB dataset files and note the local path.
SD3.5 Medium/Large and FLUX use text-image datasets with a JSON file:
{
  "keys": ["00000000", "00000001", "00000002"],
  "image_paths": [
    "/path/to/images/00000000.png",
    "/path/to/images/00000001.png",
    "/path/to/images/00000002.png"
  ],
  "prompts": [
    "A beautiful sunset over the ocean",
    "A cat sitting on a windowsill",
    "A modern city skyline at night"
  ]
}
Important: The three lists (keys, image_paths, prompts) must have the same length. Image paths should be absolute paths.
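Before launching a long training run, it can help to check these constraints up front. The sketch below is an illustrative validator, not a script shipped with this repository. The function name `validate_dataset` and the optional `check_files` flag are assumptions. It verifies only the two rules stated above (equal list lengths, absolute image paths).

```python
import json
import os


def validate_dataset(json_path: str, check_files: bool = False) -> int:
    """Check the dataset JSON layout and return the number of samples."""
    with open(json_path, "r", encoding="utf-8") as f:
        data = json.load(f)

    if not isinstance(data, dict):
        raise ValueError("top-level JSON value must be an object")

    fields = ("keys", "image_paths", "prompts")
    for name in fields:
        if not isinstance(data.get(name), list):
            raise ValueError(f"'{name}' must be a list")

    # The three lists must have the same length.
    lengths = {name: len(data[name]) for name in fields}
    if len(set(lengths.values())) != 1:
        raise ValueError(f"list lengths differ: {lengths}")

    # Image paths should be absolute; optionally confirm they exist on disk.
    for path in data["image_paths"]:
        if not os.path.isabs(path):
            raise ValueError(f"image path is not absolute: {path}")
        if check_files and not os.path.isfile(path):
            raise FileNotFoundError(path)

    return lengths["keys"]
```

Calling `validate_dataset("/path/to/dataset.json", check_files=True)` additionally confirms that every image exists on disk, which is slower on large datasets.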
Path Configuration
All paths are configured in the paths section of each config YAML file. Edit the corresponding config file before training:
SDXL SenseFlow (configs/sdxl/sdxl_senseflow.yaml):
paths:
  pretrained_model: /path/to/stable-diffusion-xl-base-1.0
  dataset: /path/to/lmdb_dataset
SDXL DMD2 (configs/sdxl/sdxl_dmd2.yaml):
paths:
  pretrained_model: /path/to/stable-diffusion-xl-base-1.0
  dataset: /path/to/lmdb_dataset
SD3.5 Medium (configs/SD35/sd35_senseflow.yaml):
paths:
  pretrained_model: /path/to/stable-diffusion-3.5-medium
  dataset: /path/to/dataset.json
SD3.5 Large (configs/SD35/sd35_large_senseflow.yaml):
paths:
  pretrained_model: /path/to/stable-diffusion-3.5-large
  dataset: /path/to/dataset.json
FLUX (configs/FLUX/flux_senseflow.yaml):
paths:
  pretrained_model: /path/to/FLUX.1-dev
  flux_wo_guidance_embed: exp_flux/flux-wo-guidance-embed
  dataset: /path/to/dataset.json
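A common mistake is to start training with a `/path/to/...` placeholder still in the config. The sketch below is a hypothetical pre-flight check, not part of the training scripts. It takes the config as an already-loaded dictionary (for example from `yaml.safe_load`, assuming PyYAML is installed) and reports which entries of the paths section still look like placeholders.

```python
def unresolved_paths(config: dict) -> list[str]:
    """Return keys in config['paths'] that still hold a /path/to/... placeholder."""
    paths = config.get("paths") or {}
    return [
        key
        for key, value in paths.items()
        if isinstance(value, str) and value.startswith("/path/to/")
    ]


# Example with an in-memory config mirroring the FLUX layout above.
example = {
    "paths": {
        "pretrained_model": "/data/models/FLUX.1-dev",
        "flux_wo_guidance_embed": "exp_flux/flux-wo-guidance-embed",
        "dataset": "/path/to/dataset.json",
    }
}
print(unresolved_paths(example))
```

Here only `dataset` is reported, since it is the one entry that still starts with `/path/to/`.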
Training
We provide training scripts in the exp_* directories. Each script takes 4 arguments: number of nodes, number of GPUs per node, config file path, and save directory path.
FLUX SenseFlow
sh exp_flux/train_flux_senseflow.sh \
1 8 \
configs/FLUX/flux_senseflow.yaml \
/path/to/save/directory
SDXL SenseFlow
sh exp_sdxl/train_sdxl_senseflow.sh \
1 8 \
configs/sdxl/sdxl_senseflow.yaml \
/path/to/save/directory
SDXL DMD2
sh exp_sdxl/train_sdxl_dmd2.sh \
1 8 \
configs/sdxl/sdxl_dmd2.yaml \
/path/to/save/directory
SD3.5 Medium SenseFlow
sh exp_sd35/train_SD35_senseflow.sh \
1 8 \
configs/SD35/sd35_senseflow.yaml \
/path/to/save/directory
SD3.5 Large SenseFlow
sh exp_sd35/train_SD35_large_senseflow.sh \
1 8 \
configs/SD35/sd35_large_senseflow.yaml \
/path/to/save/directory
Training Arguments:
- First argument: Number of nodes
- Second argument: Number of GPUs per node
- Third argument: Path to config file
- Fourth argument: Path to save directory
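When launching several runs from Python (for example in a sweep), the four positional arguments can be assembled programmatically. This is a minimal sketch under the argument order listed above. The helper name `build_train_command` is hypothetical and the repository does not provide it.

```python
import shlex


def build_train_command(script: str, num_nodes: int, gpus_per_node: int,
                        config: str, save_dir: str) -> list[str]:
    """Build the argv list: sh <script> <nodes> <gpus per node> <config> <save dir>."""
    if num_nodes < 1 or gpus_per_node < 1:
        raise ValueError("num_nodes and gpus_per_node must be positive")
    return ["sh", script, str(num_nodes), str(gpus_per_node), config, save_dir]


cmd = build_train_command(
    "exp_flux/train_flux_senseflow.sh",
    num_nodes=1,
    gpus_per_node=8,
    config="configs/FLUX/flux_senseflow.yaml",
    save_dir="/path/to/save/directory",
)
# Print the command for inspection; it is not executed here.
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the repository root once the config paths have been filled in.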
Inference
We provide inference scripts for different models:
FLUX SenseFlow
python scripts_flux/test_flux_senseflow.py \
--flux_ckpt /path/to/FLUX.1-dev \
--checkpoint /path/to/senseflow_checkpoint.pth \
--output_dir ./outputs
SDXL SenseFlow
python scripts_sdxl/test_sdxl_senseflow.py \
--sdxl_ckpt /path/to/stable-diffusion-xl-base-1.0 \
--checkpoint /path/to/senseflow_checkpoint.pth \
--output_dir ./outputs
SDXL DMD2
python scripts_sdxl/test_sdxl_dmd2.py \
--sdxl_ckpt /path/to/stable-diffusion-xl-base-1.0 \
--checkpoint /path/to/dmd2_checkpoint.pth \
--output_dir ./outputs
SD3.5 Medium SenseFlow
python scripts_sd35/test_senseflow_sd35.py \
--sd35_ckpt /path/to/stable-diffusion-3.5-medium \
--checkpoint /path/to/senseflow_checkpoint.pth \
--output_dir ./outputs
SD3.5 Large SenseFlow
python scripts_sd35/test_senseflow_sd35_large.py \
--sd35_ckpt /path/to/stable-diffusion-3.5-large \
--checkpoint /path/to/senseflow_checkpoint.pth \
--output_dir ./outputs
Inference Arguments
All inference scripts support the following optional arguments:
- --prompts_file: Path to prompts text file (default: senseflow_test_prompts.txt)
- --start_idx: Starting index in prompts file (default: 0)
- --num_prompts: Number of prompts to process (default: 23)
- --batch_size: Batch size for inference (default: 1)
- --output_dir: Output directory for generated images (default: ./outputs)
For FLUX:
- --dit_config: Path to DiT transformer config file (default: exp_flux/flux-wo-guidance-embed/transformer/config.json)
For SDXL:
- --unet_config: Path to UNet config file (default: <sdxl_ckpt>/unet/config.json)
For SD35:
- --transformer_config: Path to transformer config file (default: <sd35_ckpt>/transformer/config.json)
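To make the shared options concrete, the sketch below mirrors the documented flags and defaults with `argparse`. It is an illustration, not the parser used by the inference scripts. In particular, it assumes that `--start_idx` and `--num_prompts` select a contiguous slice of the prompts file and that prompts are processed in chunks of `--batch_size`. The helper names are hypothetical.

```python
import argparse


def build_common_parser() -> argparse.ArgumentParser:
    """Parser with the optional arguments shared by all inference scripts."""
    parser = argparse.ArgumentParser(description="Common inference options (sketch)")
    parser.add_argument("--prompts_file", type=str, default="senseflow_test_prompts.txt")
    parser.add_argument("--start_idx", type=int, default=0)
    parser.add_argument("--num_prompts", type=int, default=23)
    parser.add_argument("--batch_size", type=int, default=1)
    parser.add_argument("--output_dir", type=str, default="./outputs")
    return parser


def select_prompts(prompts: list[str], start_idx: int, num_prompts: int) -> list[str]:
    """Assumed semantics: take num_prompts prompts starting at start_idx."""
    return prompts[start_idx:start_idx + num_prompts]


def batched(items: list[str], batch_size: int):
    """Yield consecutive chunks of at most batch_size items."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]
```

With the defaults, the first 23 prompts of senseflow_test_prompts.txt would be processed one at a time and written to ./outputs.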
Results
Table 1: Quantitative Results on COCO-5K Dataset
Bold = best, Underline = second best. All distilled models are evaluated with 4-step generation; base models are listed at their own NFE.
Stable Diffusion XL Comparison
| Method | NFE | FID-T | Patch FID-T | CLIP | HPSv2 | Pick | ImageReward |
|---|---|---|---|---|---|---|---|
| SDXL | 80 | -- | -- | 0.3293 | 0.2930 | 22.67 | 0.8719 |
| LCM-SDXL | 4 | 18.47 | 30.63 | 0.3230 | 0.2824 | 22.22 | 0.5693 |
| PCM-SDXL | 4 | 14.38 | 17.77 | 0.3242 | 0.2920 | 22.54 | 0.6926 |
| Flash-SDXL | 4 | 17.97 | 23.24 | 0.3216 | 0.2830 | 22.17 | 0.4295 |
| SDXL-Lightning | 4 | 13.67 | 16.57 | 0.3214 | 0.2931 | 22.80 | 0.7799 |
| Hyper-SDXL | 4 | 13.71 | 17.49 | 0.3254 | 0.3000 | 22.98 | 0.9777 |
| DMD2-SDXL | 4 | 15.04 | 18.72 | 0.3277 | 0.2963 | 22.98 | 0.9324 |
| Ours-SDXL | 4 | 17.76 | 21.01 | 0.3248 | 0.3010 | 23.17 | 0.9951 |
Stable Diffusion 3.5 Comparison
| Method | NFE | FID-T | Patch FID-T | CLIP | HPSv2 | Pick | ImageReward |
|---|---|---|---|---|---|---|---|
| SD 3.5 Large | 100 | -- | -- | 0.3310 | 0.2993 | 22.98 | 1.1629 |
| SD 3.5 Large Turbo | 4 | 13.58 | 22.88 | 0.3262 | 0.2909 | 22.89 | 1.0116 |
| Ours-SD 3.5 | 4 | 13.38 | 17.48 | 0.3286 | 0.3016 | 23.01 | 1.1713 |
| Ours-SD 3.5 (Euler) | 4 | 15.24 | 20.26 | 0.3287 | 0.3008 | 22.90 | 1.2062 |
FLUX Comparison
| Method | NFE | FID-T | Patch FID-T | CLIP | HPSv2 | Pick | ImageReward |
|---|---|---|---|---|---|---|---|
| FLUX.1 dev | 50 | -- | -- | 0.3202 | 0.3000 | 23.18 | 1.1170 |
| FLUX.1 dev | 25 | -- | -- | 0.3207 | 0.2986 | 23.14 | 1.1063 |
| FLUX.1-schnell | 4 | -- | -- | 0.3264 | 0.2962 | 22.77 | 1.0755 |
| Hyper-FLUX | 4 | 11.24 | 23.47 | 0.3238 | 0.2963 | 23.09 | 1.0983 |
| FLUX-Turbo-Alpha | 4 | 11.22 | 24.52 | 0.3218 | 0.2907 | 22.89 | 1.0106 |
| Ours-FLUX | 4 | 15.64 | 19.60 | 0.3167 | 0.2997 | 23.13 | 1.0921 |
| Ours-FLUX (Euler) | 4 | 16.50 | 20.29 | 0.3171 | 0.3008 | 23.26 | 1.1424 |
1024 x 1024 examples of our 4-step generator distilled on SD 3.5 Large
1024 x 1024 examples of our 4-step generator distilled on SDXL
Citation
If you find this work useful, please cite:
@article{ge2025senseflow,
title={SenseFlow: Scaling Distribution Matching for Flow-based Text-to-Image Distillation},
author={Ge, Xingtong and Zhang, Xin and Xu, Tongda and Zhang, Yi and Zhang, Xinjie and Wang, Yan and Zhang, Jun},
journal={arXiv preprint arXiv:2506.00523},
year={2025}
}
License
This project is licensed under the Apache License 2.0. See the LICENSE file for details.
Note: This codebase is based on several open-source models including:
- Stable Diffusion XL (CreativeML Open RAIL++-M License)
- Stable Diffusion 3.5 (Stability AI Community License)
- FLUX.1-dev (FLUX.1 [dev] Non-Commercial License)
Please ensure compliance with their respective licenses when using the teacher models.