June 11, 2025
ByteMorph: Benchmarking Instruction-Guided Image Editing
with Non-Rigid Motions
Di Chang1,2* · Mingdeng Cao1,3* · Yichun Shi1 · Bo Liu1,4 · Shengqu Cai1,5 · Shijie Zhou6
Weilin Huang1 · Gordon Wetzstein5 · Mohammad Soleymani2 · Peng Wang1
1ByteDance Seed 2University of Southern California 3University of Tokyo
4University of California Berkeley 5Stanford University 6University of California Los Angeles
* denotes equal contribution
This repo is the official PyTorch implementation of ByteMorph, including training, inference, and evaluation.
📢 News
- June 03, 2025: We released the official website, dataset, benchmark, online demo, and paper for ByteMorph.
📜 Requirements
- An NVIDIA GPU with CUDA support is required for inference.
- We have tested on a single A100 GPU and on a single H100 GPU.
- In our experiments, we used CUDA 12.4.
- Feel free to visit Flux.1-dev for further details on the environment.
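Before installing, it can help to confirm that the machine meets the requirements listed above. The following is a minimal sketch, not part of the ByteMorph codebase: the function name `check_environment` is hypothetical, the Python 3.10 minimum is taken from the conda command in the installation steps, and the PyTorch check is skipped if `torch` is not installed yet.

```python
import shutil
import sys


def check_environment(min_python=(3, 10)):
    """Collect basic environment facts (hypothetical helper, not ByteMorph code)."""
    report = {
        # Python version compared against the one used in the conda setup.
        "python_ok": sys.version_info >= min_python,
        # nvidia-smi on PATH is a rough hint that an NVIDIA driver is installed.
        "nvidia_smi_found": shutil.which("nvidia-smi") is not None,
        # Filled in only when PyTorch can be imported.
        "torch_cuda_available": None,
        "torch_cuda_version": None,
    }
    try:
        import torch

        report["torch_cuda_available"] = torch.cuda.is_available()
        # CUDA version PyTorch was built with (None for CPU-only builds).
        report["torch_cuda_version"] = torch.version.cuda
    except (ImportError, OSError):
        # PyTorch is missing or failed to load; leave the torch fields as None.
        pass
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

A `torch_cuda_version` that differs from 12.4 does not necessarily mean the setup will fail; 12.4 is only the version reported above as tested.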
🛠️ Dependencies and Installation
Clone the repository:
git clone https://github.com/Boese0601/ByteMorph
cd ByteMorph
Installation Guide
We provide a requirements.txt file for setting up the environment.
Run the following commands in your terminal:
# 1. Prepare conda environment
conda create -n bytemorph python=3.10
# 2. Activate the environment
conda activate bytemorph
# 3. Install dependencies
bash env_install.sh
🧱 Download Pretrained Models
We follow the implementation details in our paper and release the pretrained weights of the Diffusion Transformer in this Hugging Face repository. After downloading, please place them under the pretrained_weights folder.
The Flux.1-dev VAE and DiT can be found here. The Google-T5 encoder can be found here. The CLIP encoder can be found here.
Please place them under ./pretrained_weights/.
Your file structure should look like this:
ByteMorph
|----...
|----pretrained_weights
  |----models--black-forest-labs--FLUX.1-dev
    |----flux1-dev.safetensors
    |----ae.safetensors
    |----...
  |----models--xlabs-ai--xflux
    |----...
  |----models--openai--clip-vit-large-patch14
    |----...
  |----ByteMorpher
    |----dit.safetensors
|----...
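The layout above can be checked with a short script before running training or inference. This is a minimal sketch under stated assumptions: it assumes each listed file sits directly inside the directory named just before it in the tree, it does not check entries shown as `...`, and the names `EXPECTED` and `missing_entries` are hypothetical rather than part of the repository.

```python
from pathlib import Path

# Paths relative to pretrained_weights, taken from the tree above.
# Assumption: files nest directly under the directory listed before them.
EXPECTED = [
    "models--black-forest-labs--FLUX.1-dev/flux1-dev.safetensors",
    "models--black-forest-labs--FLUX.1-dev/ae.safetensors",
    "models--xlabs-ai--xflux",
    "models--openai--clip-vit-large-patch14",
    "ByteMorpher/dit.safetensors",
]


def missing_entries(root="pretrained_weights"):
    """Return the expected files or directories that do not exist under root."""
    root = Path(root)
    return [rel for rel in EXPECTED if not (root / rel).exists()]


if __name__ == "__main__":
    missing = missing_entries()
    if missing:
        print("Missing under pretrained_weights:")
        for rel in missing:
            print(f"  {rel}")
    else:
        print("All expected entries were found.")
```

Run it from the ByteMorph root directory, or pass a different `root` if the weights are stored elsewhere.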
Training and Inference
Using Command Line
cd ByteMorph
# Train
bash scripts/train/train.sh
# Inference
bash scripts/test/inference.sh
The config files for training and inference can be found in this file and this file, respectively.
The DeepSpeed config file for training is here.
Evaluation
Please visit this page.
🔗 BibTeX Citation
If you find ByteMorph useful for your research and applications, please cite ByteMorph using this BibTeX:
@article{chang2025bytemorph,
title={ByteMorph: Benchmarking Instruction-Guided Image Editing with Non-Rigid Motions},
author={Chang, Di and Cao, Mingdeng and Shi, Yichun and Liu, Bo and Cai, Shengqu and Zhou, Shijie and Huang, Weilin and Wetzstein, Gordon and Soleymani, Mohammad and Wang, Peng},
journal={arXiv preprint arXiv:2506.03107},
year={2025}
}
License
This code is distributed under the FLUX.1-dev Non-Commercial License. See the LICENSE.txt file for more information.
Acknowledgement
We would like to thank the contributors to Flux.1-dev, x-flux, and OminiControl for their open-source research.
Disclaimer
Your access to and use of this dataset are at your own risk. We do not guarantee the accuracy of this dataset. The dataset is provided “as is” and we make no warranty or representation to you with respect to it and we expressly disclaim, and hereby expressly waive, all warranties, express, implied, statutory or otherwise. This includes, without limitation, warranties of quality, performance, merchantability or fitness for a particular purpose, non-infringement, absence of latent or other defects, accuracy, or the presence or absence of errors, whether or not known or discoverable. In no event will we be liable to you on any legal theory (including, without limitation, negligence) or otherwise for any direct, special, indirect, incidental, consequential, punitive, exemplary, or other losses, costs, expenses, or damages arising out of this public license or use of the licensed material. The disclaimer of warranties and limitation of liability provided above shall be interpreted in a manner that, to the extent possible, most closely approximates an absolute disclaimer and waiver of all liability.