MagCache4FLUX
June 11, 2025
MagCache can speed up FLUX by 2.8x with little visual quality degradation, in a training-free manner.

Inference Latency Comparisons on a Single A800
| FLUX.1 [dev] | TeaCache (0.6) | MagCache (E024K5R01) |
|---|---|---|
| ~14.26 s | ~5.65 s (2.5x speedup) | ~5.05 s (2.8x speedup) |


Prompt: A photo of a black bicycle.
Installation
pip install --upgrade diffusers[torch] transformers protobuf tokenizers sentencepiece
Usage
You can modify `magcache_thresh`, `magcache_K`, and `retention_ratio` (lines 455-457 of `magcache_flux.py`) to obtain your desired trade-off between latency and visual quality. For single-GPU inference, use the following command:
python magcache_flux.py
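To make the roles of the three parameters concrete, here is a minimal, self-contained sketch of a magnitude-aware skip decision. It is not the repository's implementation; the parameter names are taken from the README, while the function name `should_skip`, the `state` dict, and the exact accumulation rule are illustrative assumptions.

```python
def should_skip(step, total_steps, mag_ratio, state,
                magcache_thresh=0.24, magcache_K=5, retention_ratio=0.1):
    """Decide whether to reuse the cached residual at this denoising step.

    mag_ratio: estimated magnitude ratio between consecutive model outputs
    (close to 1.0 when outputs change slowly). state: dict tracking the
    accumulated error ("err") and consecutive skipped steps ("skips").
    This is an illustrative sketch, not the actual MagCache code.
    """
    # retention_ratio: always compute the early steps, which change fastest.
    if step < int(retention_ratio * total_steps):
        state["err"], state["skips"] = 0.0, 0
        return False
    # magcache_thresh: budget on the accumulated error introduced by reuse.
    projected = state["err"] + abs(1.0 - mag_ratio)
    # magcache_K: cap on how many consecutive steps may be skipped.
    if projected <= magcache_thresh and state["skips"] < magcache_K:
        state["err"] = projected
        state["skips"] += 1
        return True   # reuse the cached residual, skip the transformer forward
    state["err"], state["skips"] = 0.0, 0
    return False      # run the full transformer forward and refresh the cache
```

Raising `magcache_thresh` or `magcache_K` skips more steps (lower latency, more quality risk); raising `retention_ratio` protects more of the early, fast-changing steps.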
Citation
If you find MagCache useful in your research or applications, please consider giving us a star and citing it with the following BibTeX entry:
@misc{ma2025magcachefastvideogeneration,
  title={MagCache: Fast Video Generation with Magnitude-Aware Cache},
  author={Zehong Ma and Longhui Wei and Feng Wang and Shiliang Zhang and Qi Tian},
  year={2025},
  eprint={2506.09045},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2506.09045},
}
Acknowledgements
We would like to thank the contributors to FLUX, TeaCache, and Diffusers.