MagDiff: Multi-Alignment Diffusion for High-Fidelity Video Generation and Editing

Haoyu Zhao · Tianyi Lu · Jiaxi Gu · Xing Zhang · Qingping Zheng · Zuxuan Wu · Hang Xu · Yu-Gang Jiang

Fudan University | Huawei Noah's Ark Lab

📢 News

[2024.12.22] Released the inference code. We are working to improve MagDiff; stay tuned!
[2024.07.04] Our paper has been accepted by the 18th European Conference on Computer Vision (ECCV) 2024.
[2023.11.29] Released the first version of the paper on arXiv.

🏃‍♂️ Getting Started

Download the pretrained base model for Stable Diffusion V2.1.

Download our MagDiff checkpoints.

Please follow the Hugging Face download instructions to obtain the base model and checkpoints listed above.

An example layout of the model files is shown below.

assets/
├── MagDiff.pth
└── stable-diffusion-2-1-base/
    ├── scheduler/...
    ├── text_encoder/...
    ├── tokenizer/...
    ├── unet/...
    ├── vae/...
    ├── ...
    └── README.md
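
Before running inference, it can help to confirm that the downloaded files match this layout. The following is a minimal sketch, not part of the MagDiff codebase: the helper name `missing_assets` and the `EXPECTED` list are hypothetical, and only the entries spelled out in the tree above are checked (entries elided with `...` are ignored).

```python
from pathlib import Path

# Entries copied from the example layout above. Anything elided with "..."
# in that listing is deliberately not checked here.
EXPECTED = [
    "MagDiff.pth",
    "stable-diffusion-2-1-base/scheduler",
    "stable-diffusion-2-1-base/text_encoder",
    "stable-diffusion-2-1-base/tokenizer",
    "stable-diffusion-2-1-base/unet",
    "stable-diffusion-2-1-base/vae",
]


def missing_assets(root="assets"):
    """Return the expected entries that do not exist under `root`."""
    root = Path(root)
    return [rel for rel in EXPECTED if not (root / rel).exists()]


if __name__ == "__main__":
    missing = missing_assets()
    if missing:
        print("Missing model files:")
        for rel in missing:
            print(f"  assets/{rel}")
    else:
        print("All expected model files are present.")
```

Run the script from the repository root; an empty result means every listed file and directory was found.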

⚒️ Installation

Prerequisites: Python >= 3.10, CUDA >= 11.8.

Install with pip:

pip3 install -r requirements.txt
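
After installation, the prerequisites can be checked from Python. This is a rough sketch under one assumption: that PyTorch is among the packages installed from requirements.txt (the file's contents are not shown here). The helpers `meets_minimum` and `check_environment` are hypothetical names introduced for illustration.

```python
import sys


def meets_minimum(version, minimum):
    """Compare a dotted version string such as '11.8' with a tuple such as (11, 8)."""
    parts = tuple(int(p) for p in version.split(".")[: len(minimum)])
    return parts >= minimum


def check_environment():
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info[:2] < (3, 10):
        problems.append("Python >= 3.10 is required")
    try:
        # Assumption: PyTorch is installed via requirements.txt.
        import torch
    except ImportError:
        problems.append("PyTorch is not importable, so the CUDA version was not checked")
        return problems
    # torch.version.cuda is the CUDA version PyTorch was built against
    # (None for CPU-only builds); it is not the driver version.
    cuda = torch.version.cuda
    if cuda is None or not meets_minimum(cuda, (11, 8)):
        problems.append("a PyTorch build with CUDA >= 11.8 is required")
    return problems


if __name__ == "__main__":
    issues = check_environment()
    print("\n".join(issues) if issues else "Environment looks OK.")
```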

💃 Inference

Run inference on a single GPU:

bash inference.sh

🎓 Citation

If you find this codebase useful for your research, please cite it using the following BibTeX entry.

@inproceedings{zhao2024magdiff,
    author    = {Zhao, Haoyu and Lu, Tianyi and Gu, Jiaxi and Zhang, Xing and Zheng, Qingping and Wu, Zuxuan and Xu, Hang and Jiang, Yu-Gang},
    title     = {MagDiff: Multi-Alignment Diffusion for High-Fidelity Video Generation and Editing},
    booktitle = {European Conference on Computer Vision},
    year      = {2024}
}