April 19, 2026

ContextGen: Contextual Layout Anchoring
for Identity-Consistent Multi-Instance Generation

Ruihang Xu, Dewei Zhou, Fan Ma^†, Yi Yang
ReLER Lab, CCAI, Zhejiang University

🔥 Updates

2026.1.27: Our paper has been accepted by ICLR 2026! 🎉
2025.12.14: Released the IMIG-100K Dataset.
2025.12.8: Released the inference code, training code, pretrained model weights, and GUI support for ContextGen.
2025.10.19: Released the IMIG-Dataset construction pipeline.

ContextGen is a framework that uses user-provided reference images to generate images containing multiple instances, offering layout control over their positions while preserving each instance's identity.

✅ To-Do List

🚀 Quick Start

Environment Setup

conda create -n contextgen python=3.12 -y
conda activate contextgen
pip install -r requirements.txt

Download Pretrained Models

Download the FLUX.1-Kontext model and the ContextGen Adapter, then set their paths in a .env file placed in the repository root. The .env_template file shows the expected format:

KONTEXT_MODEL_PATH="path_to_kontext_model"
ADAPTER_PATH="path_to_contextgen_adapter"
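
Before launching inference, it can help to confirm that the .env file defines both paths. The following is a minimal, standard-library-only sketch of such a check. `read_env_file` and `missing_keys` are hypothetical helper names, not part of the ContextGen code, and the repository may load the file differently.

```python
from pathlib import Path

# Keys this README asks you to define for inference.
REQUIRED_KEYS = ("KONTEXT_MODEL_PATH", "ADAPTER_PATH")


def read_env_file(path):
    """Parse simple KEY="value" lines; blank lines and # comments are skipped."""
    values = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding quotes so "path" and path are treated alike.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values, required=REQUIRED_KEYS):
    """Return the required keys that are absent or set to an empty string."""
    return [key for key in required if not values.get(key)]


if __name__ == "__main__":
    env_path = Path(".env")
    if env_path.exists():
        print("Missing keys:", missing_keys(read_env_file(env_path)))
    else:
        print("No .env file found in the current directory.")
```

An empty list from `missing_keys` means both paths are set. The sketch does not check that the paths exist on disk.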

Inference

⚠️ GPU Memory Note: The inference process requires ~35-40GB GPU memory. We're working on quantization and optimization to reduce the memory footprint in future releases.

To run inference on the provided demos, simply execute:

python inference.py

The generated results will be saved in the images/output folder.

For Custom Input: You can add your own images in the images/input folder and modify the inference.py file accordingly.
More Demos: More interesting demos and results can be found on our Project Page.
Recommended Interaction: For easier interaction, we highly recommend using our GUI Support.

Training

You can build your own dataset by following the IMIG-Dataset construction code. Remember to add your WANDB API key to the .env file for experiment tracking:

WANDB_API_KEY="your_wandb_api_key"

Then configure the training parameters in train/config/config.yaml and run:

python src/model/train.py

Benchmark and Evaluation

We have released LAMICBench++ evaluation support and sample generation scripts for three benchmarks: LAMICBench++, COCO-MIG, and LayoutSAM. For LAMICBench++ evaluation scripts and metric aggregation, see bench/lamicbench_plus/README.md. You can find the sample generation entry points and reproduction notes in bench/sample_scripts/README.md.

GUI Support

We provide a simple GUI built with Vite and React for easier interaction.

1. Model Dependencies & Setup

The GUI requires additional models. Please download them and set their full paths in the .env file:

Image Cutout (Required): Download the BEN2 model.
```
BEN_CKPT_PATH="path_to_ben2_model"
```
Asset Generation from Text (Optional): Download the FLUX.1-dev model.
```
FLUX_MODEL_PATH="path_to_flux_model"
```

⚠️ GPU Memory Note: The optional asset generation feature uses an additional ~30GB of GPU memory. If a single GPU does not have enough memory for both models, consider loading this model on a different GPU. If you do not need this feature, you can comment out the related code in gui/backend/app.py.
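
One way to reason about the placement advice above is to check whether both models fit on one device and, if not, assign them to separate GPUs. The sketch below is illustrative only. `plan_devices` is a hypothetical helper, the default memory figures are the approximate upper bounds quoted in this README (~40GB for ContextGen inference, ~30GB for asset generation), and it does not reflect how gui/backend/app.py actually selects devices.

```python
def plan_devices(free_gb, contextgen_gb=40.0, flux_gb=30.0):
    """Choose GPU indices for ContextGen and the optional FLUX.1-dev model.

    free_gb: free memory per GPU in GB, indexed by device id.
    Returns {"contextgen": index, "flux": index or None}; "flux" is None
    when no placement fits, meaning the optional feature should be disabled.
    """
    # Visit GPUs from most to least free memory.
    order = sorted(range(len(free_gb)), key=lambda i: free_gb[i], reverse=True)
    if not order or free_gb[order[0]] < contextgen_gb:
        raise RuntimeError("No GPU has enough free memory for ContextGen.")

    main = order[0]
    # Prefer a single GPU when it can hold both models.
    if free_gb[main] - contextgen_gb >= flux_gb:
        return {"contextgen": main, "flux": main}

    # Otherwise look for a second GPU that can hold FLUX.1-dev alone.
    for index in order[1:]:
        if free_gb[index] >= flux_gb:
            return {"contextgen": main, "flux": index}

    return {"contextgen": main, "flux": None}


if __name__ == "__main__":
    # Hypothetical machine with two GPUs, 48 GB free on each.
    print(plan_devices([48.0, 48.0]))
```

In PyTorch, the returned indices correspond to device strings such as `cuda:0` and `cuda:1`.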

2. NodeJS Dependencies

If you don't have Node.js and npm installed, you can install them as follows:

# install nvm
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.40.3/install.sh | bash
# restart your terminal to load nvm, then install Node.js
nvm install --lts

3. Launching the GUI

To build and run the demo, follow these steps:

Start Frontend: In the first terminal, run the following commands:

cd gui/frontend
npm install # for the first time only
npm run dev

Start Backend: Open a second terminal and run the backend server:
```
python gui/backend/app.py
```

Accessing the GUI

Once both the frontend and backend servers are running, open the GUI in your local browser at http://localhost:5173. If the servers run on a remote machine, first forward the frontend port (127.0.0.1:5173) and the backend port (127.0.0.1:5000) to the same ports on your local machine.
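
With OpenSSH, both ports can be forwarded in a single `ssh -L` invocation. The sketch below only assembles that command so you can inspect it before running it. `build_tunnel_command` is a hypothetical helper, and `user@remote-host` is a placeholder for your own login.

```python
import shlex


def build_tunnel_command(remote, ports=(5173, 5000)):
    """Build an ssh command that forwards each local port to the same remote port."""
    # -N: open the tunnel without running a remote command.
    command = ["ssh", "-N"]
    for port in ports:
        # -L local_port:host:remote_port, with the host resolved on the remote side.
        command += ["-L", f"{port}:127.0.0.1:{port}"]
    command.append(remote)
    return command


if __name__ == "__main__":
    # Placeholder login; replace with your own user and host.
    print(shlex.join(build_tunnel_command("user@remote-host")))
```

Run the printed command on your local machine and keep it open while you use the GUI.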

💡 Tips

For better identity rendering and visual quality, we recommend a moderate resolution (e.g., 768x768). Higher resolutions may compromise identity consistency, while lower resolutions can introduce artifacts.
To enhance visual quality and contextual consistency, we recommend using a richer prompt that includes detailed, interactive relationships between the instances.
If a generated case fails or exhibits poor quality, please try again with a different random seed.
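
The seed tip above can be automated with a small retry loop. This is a sketch under stated assumptions: `generate` and `is_acceptable` are hypothetical callables that you supply (for example, a wrapper around your own inference call and a manual or automatic quality check), and neither is part of the ContextGen API.

```python
import random


def generate_with_retries(generate, is_acceptable, max_attempts=4, base_seed=0):
    """Call generate(seed) with distinct seeds until is_acceptable(result) is true.

    Returns (result, seed) for the first accepted result, or for the last
    attempt if none was accepted.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    rng = random.Random(base_seed)
    # sample() draws without replacement, so no seed is tried twice.
    seeds = rng.sample(range(2**32), max_attempts)

    result, seed = None, None
    for seed in seeds:
        result = generate(seed)
        if is_acceptable(result):
            break
    return result, seed


if __name__ == "__main__":
    # Stand-in callables for demonstration; replace them with real ones.
    result, seed = generate_with_retries(
        generate=lambda s: {"seed": s},
        is_acceptable=lambda r: True,
    )
    print("accepted seed:", seed)
```

Fixing `base_seed` makes the sequence of tried seeds reproducible, so a successful run can be repeated later.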

🎉 Enjoy Using ContextGen! 🎉

📭 Citation

If you find ContextGen helpful to your research, please consider citing our paper:

@inproceedings{xu2026contextgen,
    title={ContextGen: Contextual Layout Anchoring for Identity-Consistent Multi-Instance Generation},
    author={Ruihang Xu and Dewei Zhou and Fan Ma and Yi Yang},
    booktitle={The Fourteenth International Conference on Learning Representations},
    year={2026}
}