InstaStyle: Inversion Noise of a Stylized Image is Secretly a Style Adviser (ECCV2024)
July 24, 2024
Xing Cui, Zekun Li, Peipei Li, Huaibo Huang, Xuannan Liu, Zhaofeng He
New Features/Updates
- [TODO] Release InstaStyle with Stable Diffusion v2.1.
- [2024/07/11] Release the code of InstaStyle.
- [2024/07/02] InstaStyle is accepted by ECCV 2024.
Introduction
InstaStyle is a method for stylized image generation that needs only a single reference image. Its core idea rests on the finding that the inversion noise of a stylized reference image inherently carries the style signal. InstaStyle can also generate images in a combined style and supports adjusting the relative degree of the two styles during combination, which demonstrates its flexibility.
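To make the term "inversion noise" concrete, here is a toy sketch that assumes a DDIM-style deterministic inversion, which the text above does not specify. The noise predictor `toy_eps_model`, the schedule `ALPHAS`, and the three-element "image" are hypothetical stand-ins, not part of InstaStyle's code.

```python
import math

def ddim_step(x, eps, alpha_from, alpha_to):
    """Deterministic DDIM update between two noise levels.

    alpha_* are cumulative signal fractions in (0, 1]. Moving to a smaller
    alpha adds noise (inversion); moving to a larger one removes it (sampling).
    """
    out = []
    for xi, ei in zip(x, eps):
        x0_pred = (xi - math.sqrt(1.0 - alpha_from) * ei) / math.sqrt(alpha_from)
        out.append(math.sqrt(alpha_to) * x0_pred + math.sqrt(1.0 - alpha_to) * ei)
    return out

def toy_eps_model(x, alpha):
    # Hypothetical stand-in for a diffusion model's noise prediction.
    # It ignores its inputs, so sampling exactly undoes inversion here.
    return [0.1 * (i + 1) for i in range(len(x))]

def invert(x0, alphas, eps_model):
    """Walk from the clean input toward noise (alphas must be decreasing)."""
    x = list(x0)
    for a_from, a_to in zip(alphas[:-1], alphas[1:]):
        x = ddim_step(x, eps_model(x, a_from), a_from, a_to)
    return x

def sample(x_noisy, alphas, eps_model):
    """Walk from the inversion noise back to the clean input."""
    reversed_alphas = alphas[::-1]
    x = list(x_noisy)
    for a_from, a_to in zip(reversed_alphas[:-1], reversed_alphas[1:]):
        x = ddim_step(x, eps_model(x, a_from), a_from, a_to)
    return x

ALPHAS = [1.0, 0.9, 0.6, 0.3, 0.05]  # assumed toy schedule
reference = [0.8, -0.3, 0.1]         # stand-in for a reference image latent
noise = invert(reference, ALPHAS, toy_eps_model)
reconstruction = sample(noise, ALPHAS, toy_eps_model)
```

The point of the sketch is that the inversion noise is computed from the reference rather than drawn at random, so it retains information about it. With a real network the round trip is only approximate.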
Main Features
Stylized image generation with a single reference image
InstaStyle excels at capturing style details including colors, textures, and brush strokes.
Combination of two styles
InstaStyle can adjust the relative degree of the two styles when combining them, ranging from one style to the other.
Dependencies and Installation
- Python = 3.11.4
- PyTorch = 2.0.1, torchvision = 0.15.2
# create an environment
conda create -n instastyle python==3.11.4
# activate the environment
conda activate instastyle
# install pytorch using pip
# for example: for Linux with CUDA 11.7
pip install torch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2
# install other dependencies
pip install -r requirements.txt
# diffusers
pip install diffusers==0.21.0
# install xformers
pip install -U xformers==0.0.21
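After installation, the pinned versions can be checked with a short script. This is a minimal sketch using only the standard library. The helper names `base_version` and `check_versions` are illustrative and not part of the repository.

```python
from importlib import metadata

# Versions pinned in the installation steps above.
PINNED = {
    "torch": "2.0.1",
    "torchvision": "0.15.2",
    "torchaudio": "2.0.2",
    "diffusers": "0.21.0",
    "xformers": "0.0.21",
}

def base_version(version):
    """Drop a local build tag such as '+cu117' from a version string."""
    return version.split("+", 1)[0]

def check_versions(pinned, lookup=metadata.version):
    """Return {package: (expected, found)} for each mismatch or missing package."""
    problems = {}
    for name, expected in pinned.items():
        try:
            found = base_version(lookup(name))
        except metadata.PackageNotFoundError:
            found = None  # package is not installed
        if found != expected:
            problems[name] = (expected, found)
    return problems
```

Calling `check_versions(PINNED)` inside the activated environment returns an empty dict when every pinned package matches.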
Download Diffusers Models from Hugging Face (Optional)
The Stable Diffusion model is downloaded automatically when the model path is set to "CompVis/stable-diffusion-v1-4", but we recommend downloading it locally.
Running download.py saves the Stable Diffusion model to "./stable-diffusion-v1-4":
python download.py
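The contents of download.py are not shown here. As a rough sketch of what such a script could do, the snippet below uses the `from_pretrained` and `save_pretrained` methods of the Diffusers `StableDiffusionPipeline`. The helper names `resolve_model_path` and `download_model` are hypothetical and may differ from the repository's actual script.

```python
import os

HUB_ID = "CompVis/stable-diffusion-v1-4"
LOCAL_DIR = "./stable-diffusion-v1-4"

def resolve_model_path(local_dir=LOCAL_DIR, hub_id=HUB_ID):
    """Prefer the local copy when it exists; otherwise fall back to the Hub id."""
    return local_dir if os.path.isdir(local_dir) else hub_id

def download_model(hub_id=HUB_ID, local_dir=LOCAL_DIR):
    """Fetch the pipeline from the Hugging Face Hub and save it locally."""
    # Imported lazily so resolve_model_path works without diffusers installed.
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(hub_id)
    pipe.save_pretrained(local_dir)
    return local_dir
```

Passing the result of `resolve_model_path()` wherever a model path is expected uses the local copy if it is present and the automatic download otherwise.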
Quick Run
The experiments can be run on an NVIDIA GeForce RTX 3090 GPU with 24 GB of memory.
We provide a Gradio demo for a quick start:
python app.py
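Before launching the demo, it can help to confirm which GPU PyTorch sees and how much memory it reports. This is a small sketch using `torch.cuda.is_available` and `torch.cuda.get_device_properties`. The helper names are illustrative, and a card sold as 24 GB may report slightly less than that figure.

```python
def bytes_to_gib(num_bytes):
    """Convert a byte count to gibibytes."""
    return num_bytes / (1024 ** 3)

def gpu_summary(device_index=0):
    """Describe the CUDA device that PyTorch would use, if any."""
    # Imported lazily so bytes_to_gib works without PyTorch installed.
    import torch

    if not torch.cuda.is_available():
        return "No CUDA device visible"
    props = torch.cuda.get_device_properties(device_index)
    return f"{props.name}: {bytes_to_gib(props.total_memory):.1f} GiB"
```

Running `print(gpu_summary())` in the activated environment prints the device name and its reported memory.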
Related Works
[1] StyleDrop: Text-to-Image Generation in Any Style
[2] Learning disentangled prompts for compositional image synthesis
Acknowledgements
We appreciate the foundational work of Null-Text Inversion and CustomDiffusion. This README is adapted from Dragon Diffusion, and we thank its authors for their work.
BibTeX
@inproceedings{cui2024instastyle,
title={InstaStyle: Inversion Noise of a Stylized Image is Secretly a Style Adviser},
author={Cui, Xing and Li, Zekun and Li, Peipei and Huang, Huaibo and Liu, Xuannan and He, Zhaofeng},
booktitle={ECCV},
year={2024}
}