EDA-DM: Enhanced Distribution Alignment for Post-Training Quantization of Diffusion Models
July 7, 2025
Introduction
EDA-DM is a novel post-training quantization method for accelerating diffusion models. In low-bit settings, it maintains high-quality image generation without additional computational overhead. It addresses distribution mismatch at two levels: TDAC targets the mismatch at the calibration sample level, and FBR eliminates the mismatch at the reconstruction output level.
Challenges
Overview of Methods
This repository provides the complete official implementation of EDA-DM, covering calibration, training, inference, and evaluation.
Getting Started
Installation
Clone this repository, then create and activate a conda environment named EDA-DM using the following commands:
git clone https://github.com/BienLuky/EDA-DM.git
cd EDA-DM
conda env create -f env.yaml
conda activate EDA-DM
Usage
- For Latent Diffusion and Stable Diffusion experiments, first download the relevant checkpoints following the instructions in the latent-diffusion and stable-diffusion repos from CompVis. We currently use sd-v1-4.ckpt for Stable Diffusion.
- Then use the following commands to run:
# CIFAR-10 (DDIM)
bash scripts/for_cifar.sh
# LSUN Bedroom (LDM-4)
bash scripts/for_bedroom.sh
# LSUN Church (LDM-8)
bash scripts/for_church.sh
# ImageNet (LDM-4)
bash scripts/for_imagenet.sh
# COCO (Stable Diffusion)
bash scripts/for_coco.sh
EDA-DM Weights
Here we provide some EDA-DM quantized weights. Because of Google Drive space limitations, only a subset of the weights is available.
| Model | Dataset | Prec. | Link |
|---|---|---|---|
| DDIM | CIFAR-10 | W4A8 | link |
| LDM-4 | ImageNet | W4A8 | link |
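The Prec. column uses the common WxAy shorthand, where W4A8 denotes 4-bit weights and 8-bit activations. As a rough illustration of what such a tag implies, the sketch below parses it and reports the number of representable levels and the ideal weight-storage reduction relative to FP32. The helper name `describe_precision` is hypothetical and not part of this repository, and the ratio ignores overhead such as scales, zero points, and any layers kept in higher precision.

```python
import re


def describe_precision(tag):
    """Parse a precision tag such as 'W4A8' (hypothetical helper, not from the repo)."""
    match = re.fullmatch(r"W(\d+)A(\d+)", tag)
    if match is None:
        raise ValueError(f"unrecognized precision tag: {tag!r}")
    weight_bits = int(match.group(1))
    activation_bits = int(match.group(2))
    return {
        "weight_bits": weight_bits,
        "activation_bits": activation_bits,
        # An n-bit integer grid has 2**n representable levels.
        "weight_levels": 2 ** weight_bits,
        "activation_levels": 2 ** activation_bits,
        # Ideal storage reduction for weights versus 32-bit floats,
        # ignoring quantization-parameter overhead.
        "weight_storage_vs_fp32": 32 / weight_bits,
    }


info = describe_precision("W4A8")
print(info)
```

Under these assumptions, W4A8 weights occupy at most one eighth of their FP32 size; the measured compression of a real checkpoint will be somewhat lower.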
Evaluation
We provide the evaluation code in scripts/test.py. Before using it, download torch-fidelity, pytorch-fid, and clip-score into the EDA-DM directory.
Details
EDA-DM follows a standard post-training quantization (PTQ) pipeline, which first collects a small calibration set and then optimizes the quantization parameters via reconstruction-based training. Notably, many existing approaches design separate quantization parameters for each time step of the diffusion model, which significantly improves performance but at the cost of deployment efficiency. In contrast, our method adopts a single set of quantization parameters shared across all diffusion steps. Experiments demonstrate that EDA-DM not only achieves superior quantization performance but also maintains high deployment efficiency.
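To make the difference between per-time-step and shared quantization parameters concrete, here is a minimal pure-Python sketch. It is not the EDA-DM implementation: it uses simple abs-max calibration as a stand-in for the reconstruction-based optimization described above, the activation values are made up, and all function and variable names are hypothetical.

```python
def absmax_scale(values, n_bits=8):
    """Scale that maps the largest magnitude onto the signed integer grid."""
    qmax = 2 ** (n_bits - 1) - 1
    return max(abs(v) for v in values) / qmax


def fake_quantize(values, scale, n_bits=8):
    """Symmetric uniform quantization: round, clamp, then rescale to floats."""
    qmin = -(2 ** (n_bits - 1))
    qmax = 2 ** (n_bits - 1) - 1
    out = []
    for v in values:
        q = round(v / scale)
        q = max(qmin, min(qmax, q))
        out.append(q * scale)
    return out


# Made-up activations of one layer at three diffusion time steps.
acts_per_step = {
    0: [0.10, -0.25, 0.40],
    1: [0.80, -1.20, 0.95],
    2: [2.50, -3.10, 1.75],
}

# Per-time-step parameters: one scale per step, which must be stored
# and switched at every step during inference.
per_step_scales = {t: absmax_scale(a) for t, a in acts_per_step.items()}

# Shared parameters: a single scale calibrated on activations pooled
# over all steps, reused unchanged at every step.
pooled = [v for acts in acts_per_step.values() for v in acts]
shared_scale = absmax_scale(pooled)

for t, acts in acts_per_step.items():
    print(t, fake_quantize(acts, per_step_scales[t]), fake_quantize(acts, shared_scale))
```

With naive abs-max calibration, the shared scale is dictated by the time step with the widest activation range, so steps with smaller ranges are quantized more coarsely. Closing that gap without storing per-step parameters is the kind of problem the calibration and reconstruction stages are meant to address.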
Deployment
The quantized models are deployed using CUTLASS and the same deployment toolkit as SmoothQuant. The specific implementation is based on the open-source project torch_quantizer.
Result
Random samples
Stable-Diffusion (1.83× Acceleration)
LDM-4-ImageNet (1.88× Acceleration)
LDM-4-Bedroom (1.78× Acceleration)
LDM-8-Church (1.75× Acceleration)
Compression and Speedup
We deploy the quantized models on an RTX 3090 GPU, a CPU, and an ARM platform.
Acknowledgments
This code was developed based on Q-diffusion and BRECQ. We thank torch_quantizer for providing a reference for deploying our quantized models and measuring acceleration. We also thank torch-fidelity, pytorch-fid, and clip-score for IS, sFID, FID, and CLIP score computation.
Citation
If you find this work useful in your research, please consider citing our paper:
@article{liu2024enhanced,
title={Enhanced distribution alignment for post-training quantization of diffusion models},
author={Liu, Xuewen and Li, Zhikai and Xiao, Junrui and Gu, Qingyi},
journal={arXiv e-prints},
pages={arXiv--2401},
year={2024}
}