
October 22, 2024

Concept Conductor

Concept Conductor: Orchestrating Multiple Personalized Concepts in Text-to-Image Synthesis

Zebin Yao, Fangxiang Feng, Ruifan Li, Xiaojie Wang

Beijing University of Posts and Telecommunications

🔍 Results

Combination of 2 Concepts:

Combination of More Than 2 Concepts:

🛠️ Installation

git clone https://github.com/Nihukat/Concept-Conductor.git
cd Concept-Conductor
pip install -r requirements.txt

📝 Preparation

1. Download Pretrained Text-to-Image Models.

We implemented our method on both Stable Diffusion 1.5 and SDXL 1.0.

For Stable Diffusion 1.5, we adopt ChilloutMix for real-world concepts and Anything-v4 for anime concepts.

cd experiments/pretrained_models

# Diffusers-version ChilloutMix
git-lfs clone https://huggingface.co/windwhinny/chilloutmix.git

# Diffusers-version Anything-v4
git-lfs clone https://huggingface.co/xyn-ai/anything-v4.0.git

For SDXL 1.0, we adopt RealVisXL V5.0 for real-world concepts and Anything-XL for anime concepts.

cd experiments/pretrained_models

# Diffusers-version RealVisXL V5.0
git-lfs clone https://huggingface.co/SG161222/RealVisXL_V5.0.git

# Diffusers-version Anything-XL
git-lfs clone https://huggingface.co/eienmojiki/Anything-XL.git
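
After cloning, it can help to confirm that each model folder is present under experiments/pretrained_models. The sketch below is not part of the repository. It assumes the folder names match the repository names cloned above and that each folder follows the standard Diffusers layout, with a `model_index.json` at its top level. `EXPECTED_MODELS` and `find_missing_models` are hypothetical names introduced for illustration.

```python
from pathlib import Path

# Assumption: folder names equal the names of the repositories cloned above.
EXPECTED_MODELS = ["chilloutmix", "anything-v4.0", "RealVisXL_V5.0", "Anything-XL"]


def find_missing_models(root, names=EXPECTED_MODELS):
    """Return the model names whose folder lacks a top-level model_index.json."""
    root = Path(root)
    return [
        name for name in names
        if not (root / name / "model_index.json").is_file()
    ]


if __name__ == "__main__":
    missing = find_missing_models("experiments/pretrained_models")
    if missing:
        print("Missing or incomplete models:", ", ".join(missing))
    else:
        print("All expected models were found.")
```

This only checks for the index file. It does not verify that the large weight files were fetched by Git LFS.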

2. (Optional) Train ED-LoRAs.

We adopt ED-LoRAs (proposed in Mix-of-Show) as single-concept customization models. If you want to train ED-LoRAs yourself, you can download the training data used in our paper from Google Drive.

You can also build your own personalized concept datasets from custom images and matching text captions, following the structure of our dataset directory.

We provide training scripts for both Stable Diffusion 1.5 and SDXL 1.0.

For Stable Diffusion 1.5:

# Train ED-LoRAs for real-world concepts
python train_edlora.py -opt configs/edlora/train/chow_dog.yml

# Train ED-LoRAs for anime concepts
python train_edlora.py -opt configs/edlora/train/mitsuha_girl.yml

For SDXL 1.0:

# Train ED-LoRAs for real-world concepts
python train_edlora_sdxl.py -opt configs/edlora/train_sdxl/chow_dog.yml

# Train ED-LoRAs for anime concepts
python train_edlora_sdxl.py -opt configs/edlora/train_sdxl/mitsuha_girl.yml
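
To train several concepts in one go, the commands above can be generated in a loop. This is a minimal sketch, not a script shipped with the repository. It assumes every `.yml` file in the chosen config folder is a training config for one concept, and the helper names `build_train_command` and `train_all` are hypothetical. Only the script names and the `-opt` flag come from the commands above.

```python
import subprocess
import sys
from pathlib import Path


def build_train_command(config_path, sdxl=False):
    """Build the training command for one concept config."""
    script = "train_edlora_sdxl.py" if sdxl else "train_edlora.py"
    return [sys.executable, script, "-opt", str(config_path)]


def train_all(config_dir, sdxl=False, dry_run=True):
    """Print (and optionally run) one training command per .yml config."""
    commands = [
        build_train_command(path, sdxl)
        for path in sorted(Path(config_dir).glob("*.yml"))
    ]
    for command in commands:
        print(" ".join(command))
        if not dry_run:
            # check=True stops the loop as soon as one training run fails.
            subprocess.run(command, check=True)
    return commands


if __name__ == "__main__":
    # Dry run: only lists the commands. Pass dry_run=False to start training.
    train_all("configs/edlora/train", sdxl=False, dry_run=True)
```

Run it from the repository root so that the relative script and config paths resolve.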

3. (Optional) Download our trained ED-LoRAs.

To reproduce our results quickly, you can download our trained ED-LoRAs from Google Drive.

🚀 Usage

Generate multiple personalized concepts in an image

For Stable Diffusion 1.5:

python sample.py \
--ref_prompt "A dog and a cat in the street." \
--base_prompt "A dog and a cat on the beach." \
--custom_prompts "A <chow_dog_1> <chow_dog_2> on the beach." "A <siberian_cat_1> <siberian_cat_2> on the beach." \
--ref_image_path "examples/a dog and a cat in the street.png" \
--ref_mask_paths "examples/a dog and a cat in the street_mask1.png" "examples/a dog and a cat in the street_mask2.png" \
--edlora_paths "experiments/ED-LoRAs/real/chow_dog.pth" "experiments/ED-LoRAs/real/siberian_cat.pth" \
--start_seed 0 \
--batch_size 4 \
--n_batches 1
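
In the command above, `--custom_prompts`, `--ref_mask_paths`, and `--edlora_paths` each take one value per concept, and the values appear to be matched by position (dog first, cat second). The sketch below keeps the three lists aligned by describing each concept once. It is an illustration only: `build_sample_command` is a hypothetical helper, and the positional matching is inferred from the example rather than confirmed by the source code.

```python
import shlex


def build_sample_command(ref_prompt, base_prompt, ref_image_path, concepts,
                         start_seed=0, batch_size=4, n_batches=1,
                         script="sample.py"):
    """Build the argument list for the sampling script.

    `concepts` is a list of (custom_prompt, mask_path, edlora_path) tuples,
    one per personalized concept, so the three per-concept lists stay aligned.
    """
    if not concepts:
        raise ValueError("at least one concept is required")
    prompts, masks, loras = zip(*concepts)
    return [
        "python", script,
        "--ref_prompt", ref_prompt,
        "--base_prompt", base_prompt,
        "--custom_prompts", *prompts,
        "--ref_image_path", ref_image_path,
        "--ref_mask_paths", *masks,
        "--edlora_paths", *loras,
        "--start_seed", str(start_seed),
        "--batch_size", str(batch_size),
        "--n_batches", str(n_batches),
    ]


if __name__ == "__main__":
    command = build_sample_command(
        ref_prompt="A dog and a cat in the street.",
        base_prompt="A dog and a cat on the beach.",
        ref_image_path="examples/a dog and a cat in the street.png",
        concepts=[
            ("A <chow_dog_1> <chow_dog_2> on the beach.",
             "examples/a dog and a cat in the street_mask1.png",
             "experiments/ED-LoRAs/real/chow_dog.pth"),
            ("A <siberian_cat_1> <siberian_cat_2> on the beach.",
             "examples/a dog and a cat in the street_mask2.png",
             "experiments/ED-LoRAs/real/siberian_cat.pth"),
        ],
    )
    # Print a shell-quoted command that can be pasted into a terminal.
    print(shlex.join(command))
```

Passing `script="sample_sdxl.py"` produces the equivalent command for the SDXL script.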

You can also pass parameters through a configuration file (see ./configs/sample_config.yaml for an example):

python sample.py --config_file "path/to/your/config.yaml"

For SDXL 1.0:

python sample_sdxl.py \
--ref_prompt "A cat on a stool and a dog on the floor." \
--base_prompt "A cat on a stool and a dog on the floor." \
--custom_prompts "A <siberian_cat_1> <siberian_cat_2> on a stool and a <siberian_cat_1> <siberian_cat_2> on the floor." "A <chow_dog_1> <chow_dog_2> on a stool and a <chow_dog_1> <chow_dog_2> on the floor." \
--ref_image_path "examples/a cat on a stool and a dog on the floor.png" \
--ref_mask_paths "examples/a cat on a stool and a dog on the floor_mask1.png" "examples/a cat on a stool and a dog on the floor_mask2.png" \
--edlora_paths "experiments/SDXL_ED-LoRAs/real/siberian_cat.pth" "experiments/SDXL_ED-LoRAs/real/chow_dog.pth" \
--start_seed 0 \
--batch_size 1 \
--n_batches 4
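
Because the command references several files (one reference image, one mask per concept, one ED-LoRA per concept), a quick existence check before launching can save a failed run. The following is a small sketch using only the standard library. `missing_inputs` is a hypothetical helper, not something provided by the repository.

```python
from pathlib import Path


def missing_inputs(ref_image_path, ref_mask_paths, edlora_paths):
    """Return the input paths that are not existing files, in the order given."""
    candidates = [ref_image_path, *ref_mask_paths, *edlora_paths]
    return [path for path in candidates if not Path(path).is_file()]


if __name__ == "__main__":
    missing = missing_inputs(
        "examples/a cat on a stool and a dog on the floor.png",
        [
            "examples/a cat on a stool and a dog on the floor_mask1.png",
            "examples/a cat on a stool and a dog on the floor_mask2.png",
        ],
        [
            "experiments/SDXL_ED-LoRAs/real/siberian_cat.pth",
            "experiments/SDXL_ED-LoRAs/real/chow_dog.pth",
        ],
    )
    for path in missing:
        print("Not found:", path)
    if not missing:
        print("All input files are present.")
```

Run it from the repository root, since the paths in the example command are relative.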

You can also pass parameters through a configuration file (see ./configs/sample_config_sdxl.yaml for an example):

python sample_sdxl.py --config_file "path/to/your/config.yaml"

✅ To-Do List

Create a gradio demo.
Add more usage and applications.
Add support for SDXL.
Release the training data and trained models.
Release the source code.

📚 Citation

If you find this code useful for your research, please consider citing:

@article{yao2024concept,
  title={Concept Conductor: Orchestrating Multiple Personalized Concepts in Text-to-Image Synthesis},
  author={Yao, Zebin and Feng, Fangxiang and Li, Ruifan and Wang, Xiaojie},
  journal={arXiv preprint arXiv:2408.03632},
  year={2024}
}