DreamMatcher: Appearance Matching Self-Attention for Semantically-Consistent Text-to-Image Personalization (CVPR'24)
February 27, 2024 · View on GitHub
This is the official implementation of the paper "DreamMatcher: Appearance Matching Self-Attention for Semantically-Consistent Text-to-Image Personalization" by Jisu Nam, Heesu Kim, DongJae Lee, Siyoon Jin, Seungryong Kim†, and Seunggyu Chang†.

For more information, check out the [project page].
Environment Settings
git clone https://github.com/KU-CVLAB/DreamMatcher.git
cd DreamMatcher
conda env create -f environment.yml
conda activate dreammatcher
pip install -r requirements.txt
cd diffusers
pip install -e .
Pre-trained Weights
You can run DreamMatcher with any off-the-shelf personalized models. We provide pre-trained personalized models of Textual Inversion, DreamBooth, and CustomDiffusion on the ViCo dataset. You can find the pre-trained weights on Link.
Dataset
Image Dataset
For a fair and unbiased evaluation, we used the ViCo image and prompt dataset gathered from Textual Inversion, DreamBooth, and CustomDiffusion. The dataset comprises 16 concepts, including 5 live objects and 11 non-live objects. In the ./inputs folder, you can see 4-12 images of each of the 16 concepts.
Prompt Dataset
We provide the ViCo prompt dataset for live objects in ./inputs/prompts_live_objects.txt and for non-live objects in ./inputs/prompts_nonlive_objects.txt.
Additionally, for evaluation in more complex scenarios, we propose the challenging prompt dataset, which is available in ./inputs/prompts_live_objects_challenging.txt and ./inputs/prompts_nonlive_objects_challenging.txt.
Run DreamMatcher
To run DreamMatcher, select a personalized model from "ti", "dreambooth", or "custom_diffusion" as the baseline. Below, we provide example code using "dreambooth" as the baseline, with 8 samples. Output images for both the baseline and DreamMatcher will be saved in the result directory.
Run DreamMatcher on the ViCo prompt dataset:
python run_dreammatcher.py --models "dreambooth" --result_dir "./results/dreambooth/test" --num_samples 8 --num_device 0 --mode "normal"
Run DreamMatcher on the proposed challenging prompt dataset:
python run_dreammatcher.py --models "dreambooth" --result_dir "./results/dreambooth/test" --num_samples 8 --num_device 0 --mode "challenging"
Evaluation
To evaluate DreamMatcher, specify the result directory containing the result images from both the baseline and DreamMatcher. ICLIP, IDINO, and TCLIP metrics will be calculated.
Evaluation on the ViCo prompt dataset :
python evaluate.py --result_dir "./results/dreambooth/test" --mode "normal"
Evaluation on the proposed challenging prompt dataset :
python evaluate.py --result_dir "./results/dreambooth/test" --mode "challenging"
Collect Evaluation Results
Collect evaluation results for every concept in the result directory:
python collect_results.py --result_dir "./results/dreambooth/test"
Results
Qualitative comparision with baselines for live objects:

Qualitative comparision with baselines for non-live objects:

Qualitative comparison with previous works for live objects:

Qualitative comparison with previous works for non-live objects:

Acknowledgement
We have mainly borrowed code from the public project MasaCtrl. A huge thank you to the authors for their valuable contributions.
BibTeX
If you find this research useful, please consider citing:
@misc{nam2024dreammatcher,
title={DreamMatcher: Appearance Matching Self-Attention for Semantically-Consistent Text-to-Image Personalization},
author={Jisu Nam and Heesu Kim and DongJae Lee and Siyoon Jin and Seungryong Kim and Seunggyu Chang},
year={2024},
eprint={2402.09812},
archivePrefix={arXiv},
primaryClass={cs.CV}
}