LOTS of Fashion! Multi-Conditioning for Image Generation via Sketch-Text Pairing

November 7, 2025 · View on GitHub

LOTS

This is the official implementation of the LOTS adapter from the paper "LOTS of Fashion! Multi-Conditioning for Image Generation via Sketch-Text Pairing", published as Oral at ICCV25 in Honolulu.

To access the Sketchy dataset, refer to the HuggingFace repository

Road Map

Code release
Weights release
Platform release

Repository Structure

ckpts folder

Contains the pre-trained weights of the LOTS adapter.

scripts folder

Contains all the scripts for training and inference with LOTS on Sketchy.

src folder

Contains all the source code for the classes, models, and dataloaders used in the scripts.

Installation

We advise creating a Conda environment as follows

conda create -n lots python=3.12
conda activate lots
pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 --index-url https://download.pytorch.org/whl/cu121
pip install -r requirements.txt
pip install -e .

Unzip the pre-trained weights and config

cd ckpts
unzip lots.zip
cd ..

Training

We provide the script to train LOTS on our Sketchy dataset in scripts/lots/train_lots.py. For an example of usage, check run_train.sh, which contains the default parameters used in our experiments.

Inference

You can test our pre-trained model with the inference script in scripts/lots/inference_lots.py. For an example, check run_inference.sh. This script generates an image for each item in the test split of Sketchy, and saves them in a structured folder, with each item identified by its unique ID.

Citation

If you find our work useful, please cite our work:

@inproceedings{girella2025lots,
  author    = {Girella, Federico and Talon, Davide and Lie, Ziyue and Ruan, Zanxi and Wang, Yiming and Cristani, Marco},
  title     = {LOTS of Fashion! Multi-Conditioning for Image Generation via Sketch-Text Pairing},
  journal   = {Proceedings of the International Conference on Computer Vision},
  year      = {2025},
}