SJD: Accelerating Auto-regressive Text-to-Image Generation with Training-free Speculative Jacobi Decoding

April 21, 2025

Yao Teng¹, Han Shi², Xian Liu³, Xuefei Ning⁴, Guohao Dai⁵,⁶, Yu Wang⁴, Zhenguo Li², and Xihui Liu¹.

¹The University of Hong Kong, ²Huawei Noah’s Ark Lab, ³CUHK, ⁴Tsinghua University, ⁵Shanghai Jiao Tong University, ⁶Infinigence AI

🚩 New Features/Updates

✅ Apr, 2025. 💥 SJD has been integrated into Lumina-mGPT2 and SimpleAR.
✅ Jan, 2025. 💥 SJD was accepted to ICLR 2025.
✅ Oct, 2024. Released the SJD code.

🚩 TODO List

□ Integrate SJD into the vLLM framework for further acceleration.

Installing the dependencies

Environment:

Python 3.10
CUDA 12.5
PyTorch 2.5.1+cu124
Transformers 4.47.1

Create the conda environment from `environment.yaml`:

conda env create -f environment.yaml
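
After creating the environment, it can be useful to confirm that the installed packages match the versions listed above. The following is a minimal sketch, not part of the repository: the helper names (`installed_version`, `matches`, `check_environment`) are hypothetical, and it assumes the PyPI distribution names `torch` and `transformers`. Local build tags such as `+cu124` are ignored in the comparison.

```python
from importlib import metadata

# Versions taken from the Environment list in this README.
EXPECTED = {"torch": "2.5.1", "transformers": "4.47.1"}


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def matches(installed, expected):
    """Compare versions while ignoring a local build tag such as '+cu124'."""
    return installed is not None and installed.split("+")[0] == expected


def check_environment(expected=EXPECTED, lookup=installed_version):
    """Map each package name to a (installed_version, ok) pair."""
    report = {}
    for package, wanted in expected.items():
        found = lookup(package)
        report[package] = (found, matches(found, wanted))
    return report


if __name__ == "__main__":
    for package, (found, ok) in check_environment().items():
        status = "ok" if ok else "mismatch"
        print(f"{package}: installed={found} expected={EXPECTED[package]} [{status}]")
```

A mismatch does not necessarily mean the code will fail; it only signals that the environment differs from the one listed here.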

Performance

Results on Lumina-mGPT
Results on Emu3

Text-to-Image with SJD

Lumina-mGPT

CUDA_VISIBLE_DEVICES=0 python test_lumina_mgpt.py

Emu3

CUDA_VISIBLE_DEVICES=0 python test_emu3.py

LlamaGen

CUDA_VISIBLE_DEVICES=0 python test_llamagen.py
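
The three commands above differ only in the script name, so they can also be launched from Python. This is a hedged sketch rather than a script shipped with the repository: the `SCRIPTS` keys and the `build_run` and `run` helpers are hypothetical names, and it assumes the test scripts sit in the current working directory and take no required arguments, as in the commands above.

```python
import os
import subprocess
import sys

# Script names as given in this README; the dictionary keys are illustrative.
SCRIPTS = {
    "lumina_mgpt": "test_lumina_mgpt.py",
    "emu3": "test_emu3.py",
    "llamagen": "test_llamagen.py",
}


def build_run(model, gpu=0):
    """Return (command, env) equivalent to `CUDA_VISIBLE_DEVICES=<gpu> python <script>`."""
    # Copy the current environment and restrict the visible GPU for the child process.
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu))
    return [sys.executable, SCRIPTS[model]], env


def run(model, gpu=0):
    """Run the chosen test script on the selected GPU and raise on a non-zero exit."""
    command, env = build_run(model, gpu)
    return subprocess.run(command, env=env, check=True)
```

For example, `run("emu3", gpu=1)` would start `test_emu3.py` with only GPU 1 visible to the process.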

Acknowledgements

Our code is based on Lumina-mGPT, Emu3, LlamaGen, Anole, and CLLM. We would like to express our gratitude to Tianwei Xiong for his assistance.

Citation

@article{teng2024accelerating,
  title={Accelerating auto-regressive text-to-image generation with training-free speculative jacobi decoding},
  author={Teng, Yao and Shi, Han and Liu, Xian and Ning, Xuefei and Dai, Guohao and Wang, Yu and Li, Zhenguo and Liu, Xihui},
  journal={arXiv preprint arXiv:2410.01699},
  year={2024}
}