MagCache4FLUX

June 11, 2025

MagCache can speed up FLUX by 2.8x without noticeable visual quality degradation, in a training-free manner.


📈 Inference Latency Comparisons on a Single A800

| FLUX.1 [dev] | TeaCache (0.6) | MagCache (E024K5R01) |
| --- | --- | --- |
| ~14.26 s | ~5.65 s (2.5x speedup) | ~5.05 s (2.8x speedup) |


Prompt: A photo of a black bicycle.

Installation

pip install --upgrade diffusers[torch] transformers protobuf tokenizers sentencepiece

Usage

You can modify `magcache_thresh`, `magcache_K`, and `retention_ratio` in lines 455-457 to obtain your desired trade-off between latency and visual quality. For single-GPU inference, you can use the following command:

python magcache_flux.py

Citation

If you find MagCache useful in your research or applications, please consider giving us a star 🌟 and citing it with the following BibTeX entry:

@misc{ma2025magcachefastvideogeneration,
      title={MagCache: Fast Video Generation with Magnitude-Aware Cache}, 
      author={Zehong Ma and Longhui Wei and Feng Wang and Shiliang Zhang and Qi Tian},
      year={2025},
      eprint={2506.09045},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2506.09045}, 
}

Acknowledgements

We would like to thank the contributors to FLUX, TeaCache, and Diffusers.