(ACM MM 2025) OFFSET: Segmentation-based Focus Shift Revision for Composed Image Retrieval
1 School of Software, Shandong University
2 Department of Data Science, City University of Hong Kong
3 School of Computer Science and Technology, Harbin Institute of Technology (Shenzhen)
✉ Corresponding author
Accepted by ACM MM 2025: A novel network designed to address visual inhomogeneity and text-priority biases in Composed Image Retrieval (CIR) through dominant portion segmentation and textually guided focus revision.
📌 Introduction
Welcome to the official repository for OFFSET (Segmentation-based Focus Shift Revision for Composed Image Retrieval).
Existing CIR approaches often overlook the inhomogeneity between dominant and noisy portions in visual data, leading to query feature degradation. Furthermore, they ignore the priority of textual data in the image modification process, resulting in a visual focus bias. OFFSET tackles these limitations using a focus mapping-based feature extractor and a textually guided focus revision module, achieving State-of-the-Art (SOTA) performance across multiple datasets.
📢 News
- [2026-03-20] 🚀 We migrated all training and evaluation code of OFFSET from Google Drive to a GitHub repository.
- [2025-07-05] 🔥 OFFSET has been accepted by ACM MM 2025.
- [2024-12-26] 📍 We released the main code and data of OFFSET!
✨ Key Features
Our framework introduces key innovative modules to achieve precise multimodal semantic alignment:
- 🔍 Dominant Portion Segmentation: Utilizes visual language models to generate image captions as a role-supervised signal, dividing dominant and noisy regions to effectively mask noise information.
- 🔗 Dual Focus Mapping: Features Visual Focus Mapping (VFM) and Textual Focus Mapping (TFM) branches. Guided by the dominant segmentation, it accomplishes adaptive focus mapping on both visual and textual data.
- 🧩 Textually Guided Focus Revision: Utilizes the modification requirements embedded in the textual feature to perform adaptive focus revision on the reference image, enhancing the perception of the modification focus.
- 🏆 SOTA Performance: Demonstrates superior generalization and achieves remarkable improvements across both fashion-domain (FashionIQ, Shoes) and open-domain (CIRR) datasets.
🏗️ Architecture
📊 Experiment Results
OFFSET consistently outperforms existing baselines on widely used datasets, surpassing strong competitors such as DQU-CIR and ENCODER.
1. FashionIQ & Shoes Datasets
(Evaluated using Recall@K)
2. CIRR Dataset
(Evaluated using R@K and R_subset@K)
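For readers unfamiliar with these metrics, the following is a minimal sketch of how Recall@K and a subset-restricted variant are commonly computed. It is not taken from `utils.py`; the function names and the dictionary-based inputs (`ranked`, `targets`, `subsets`) are illustrative assumptions.

```python
from typing import Dict, Sequence, Set


def recall_at_k(ranked: Dict[str, Sequence[str]],
                targets: Dict[str, str],
                k: int) -> float:
    """Percentage of queries whose target is among the top-k ranked candidates."""
    hits = sum(1 for query, target in targets.items() if target in ranked[query][:k])
    return 100.0 * hits / len(targets)


def recall_subset_at_k(ranked: Dict[str, Sequence[str]],
                       targets: Dict[str, str],
                       subsets: Dict[str, Set[str]],
                       k: int) -> float:
    """Recall@K after restricting each ranking to that query's candidate subset."""
    filtered = {
        query: [cand for cand in ranked[query] if cand in subsets[query]]
        for query in targets
    }
    return recall_at_k(filtered, targets, k)
```

Here `ranked` maps each query ID to candidate image names sorted from most to least similar, `targets` maps each query ID to its ground-truth image, and `subsets` maps each query ID to the small group of images it is compared against for the subset metric.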

📑 Table of Contents
- 📌 Introduction
- 📢 News
- ✨ Key Features
- 🏗️ Architecture
- 📊 Experiment Results
- 📂 Repository Structure
- 🚀 Installation
- 📂 Data Preparation
- 🏃‍♂️ Quick Start
- 📝 Citation
- 🤝 Acknowledgements
- ✉️ Contact
📂 Repository Structure
Our codebase is highly modular. Here is a brief overview of the core files:
OFFSET/
├── cirr_test_submission.py # 📄 CIRR submission file generator
├── datasets.py # 📚 Dataset loader and preprocessing
├── model_OFFSET.py # 🧠 OFFSET model architecture and forward pass
├── test.py # 🧪 Evaluation/Test entry point
├── train.py # 🚀 Training entry point
├── utils.py # 🛠️ Utility functions (metrics, helper methods)
└── README.md # 📝 Documentation and result visualization
This section helps users quickly locate the core components and get started with development.
🚀 Installation
1. Clone the repository
git clone https://github.com/ZivChen-Ty/OFFSET.git
cd OFFSET
2. Set up the environment
We recommend using Conda to manage your environment:
conda create -n offset_env python=3.8.10
conda activate offset_env
# Install PyTorch (Ensure it matches your CUDA version. Tested on PyTorch 2.0.0, NVIDIA A40 48G)
pip install torch==2.0.0 torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
# Install required packages
pip install -r requirements.txt
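After installation, it can help to confirm that the interpreter sees the intended PyTorch build and a CUDA device. The snippet below is an optional sanity check and not part of the repository; `describe_environment` is a hypothetical helper name.

```python
import importlib
import platform


def describe_environment() -> dict:
    """Collect the Python version and, if importable, PyTorch and CUDA details."""
    info = {"python": platform.python_version(), "torch": None, "cuda_available": None}
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        return info  # PyTorch is not installed in this interpreter
    info["torch"] = torch.__version__
    info["cuda_available"] = torch.cuda.is_available()
    return info


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

If `torch` is reported as `None`, the active Conda environment is probably not `offset_env`; if `cuda_available` is `False`, the installed wheel may not match the local CUDA driver.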
📂 Data Preparation
🛟【OURS】Pre-computed Dominant Portion Segmentation Data (Official Release)
The dominant portion segmentation data of OFFSET is available on Google Drive.
🔥 This is our official released data for result reproduction.
OFFSET is evaluated on FashionIQ, Shoes, and CIRR. Please download the datasets from their official sources and arrange them as follows.
Shoes
Download the Shoes dataset following the instructions in the official repository.
After downloading the dataset, ensure that the folder structure matches the following:
├── Shoes
│ ├── captions_shoes.json
│ ├── eval_im_names.txt
│ ├── relative_captions_shoes.json
│ ├── train_im_names.txt
│ ├── [womens_athletic_shoes | womens_boots | ...]
│ │ ├── [0 | 1]
│ │ ├── [img_womens_athletic_shoes_375.jpg | descr_womens_athletic_shoes_734.txt | ...]
FashionIQ
Download the FashionIQ dataset following the instructions in the official repository.
After downloading the dataset, ensure that the folder structure matches the following:
├── FashionIQ
│ ├── captions
│ │ ├── cap.dress.[train | val | test].json
│ │ ├── cap.toptee.[train | val | test].json
│ │ ├── cap.shirt.[train | val | test].json
│ ├── image_splits
│ │ ├── split.dress.[train | val | test].json
│ │ ├── split.toptee.[train | val | test].json
│ │ ├── split.shirt.[train | val | test].json
│ ├── dress
│ │ ├── [B000ALGQSY.jpg | B000AY2892.jpg | B000AYI3L4.jpg | ...]
│ ├── shirt
│ │ ├── [B00006M009.jpg | B00006M00B.jpg | B00006M6IH.jpg | ...]
│ ├── toptee
│ │ ├── [B0000DZQD6.jpg | B000A33FTU.jpg | B000AS2OVA.jpg | ...]
CIRR
Download the CIRR dataset following the instructions in the official repository.
After downloading the dataset, ensure that the folder structure matches the following:
├── CIRR
│ ├── train
│ │ ├── [0 | 1 | 2 | ...]
│ │ │ ├── [train-10108-0-img0.png | train-10108-0-img1.png | ...]
│ ├── dev
│ │ ├── [dev-0-0-img0.png | dev-0-0-img1.png | ...]
│ ├── test1
│ │ ├── [test1-0-0-img0.png | test1-0-0-img1.png | ...]
│ ├── cirr
│ │ ├── captions
│ │ │ ├── cap.rc2.[train | val | test1].json
│ │ ├── image_splits
│ │ │ ├── split.rc2.[train | val | test1].json
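Before training, a quick check of the folder layouts above can catch misplaced files early. The sketch below is not part of the repository; `EXPECTED` and `missing_entries` are hypothetical names, and the entries only cover the top-level files and folders shown in the trees above.

```python
from pathlib import Path
from typing import Dict, List

# Entries copied from the folder trees in this README, relative to each dataset root.
EXPECTED: Dict[str, List[str]] = {
    "shoes": [
        "captions_shoes.json",
        "relative_captions_shoes.json",
        "train_im_names.txt",
        "eval_im_names.txt",
    ],
    "fashioniq": ["captions", "image_splits", "dress", "shirt", "toptee"],
    "cirr": ["train", "dev", "test1", "cirr/captions", "cirr/image_splits"],
}


def missing_entries(dataset: str, root: str) -> List[str]:
    """Return the expected files or folders that do not exist under root."""
    base = Path(root)
    return [rel for rel in EXPECTED[dataset] if not (base / rel).exists()]
```

For example, `missing_entries("cirr", "path/to/CIRR")` returns an empty list when every listed folder is present, and otherwise the relative paths that still need to be arranged.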
🏃‍♂️ Quick Start
1. Training the Model
Train OFFSET on Shoes, FashionIQ, or CIRR using the train.py script.
python3 train.py \
--model_dir ./checkpoints/ \
--dataset {shoes, fashioniq, cirr} \
--cirr_path "path/to/CIRR" \
--fashioniq_path "path/to/FashionIQ" \
--shoes_path "path/to/Shoes"
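In the command above, `{shoes, fashioniq, cirr}` means that exactly one of the three names is passed to `--dataset`. The following sketch shows one way to assemble the command programmatically using only the flags listed above. `build_train_command` is a hypothetical helper, not part of the repository. Whether `train.py` requires path flags for datasets other than the selected one is not stated here, so the helper simply passes whichever paths are supplied.

```python
import shlex
from typing import Dict, List

DATASETS = ("shoes", "fashioniq", "cirr")  # the three choices listed for --dataset


def build_train_command(dataset: str,
                        paths: Dict[str, str],
                        model_dir: str = "./checkpoints/") -> List[str]:
    """Build the train.py argument list for one dataset."""
    if dataset not in DATASETS:
        raise ValueError(f"dataset must be one of {DATASETS}, got {dataset!r}")
    cmd = ["python3", "train.py", "--model_dir", model_dir, "--dataset", dataset]
    for name in DATASETS:
        if name in paths:
            # Produces --shoes_path, --fashioniq_path, or --cirr_path.
            cmd += [f"--{name}_path", paths[name]]
    return cmd


if __name__ == "__main__":
    command = build_train_command("cirr", {"cirr": "path/to/CIRR"})
    print(shlex.join(command))  # pass `command` to subprocess.run to launch training
```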
2. Testing on CIRR
To generate the predictions file for uploading to the CIRR Evaluation Server using our model, please execute the following command:
python3 cirr_test_submission.py model_path
(Where model_path is the path to the OFFSET model checkpoint on CIRR)
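For orientation, the sketch below shows the JSON layout that CIRR submission scripts commonly produce: a `version` and `metric` header followed by one ranked list of image names per query pair ID. This is an assumption based on the publicly documented server format, not a description of what `cirr_test_submission.py` writes; `build_submission` is a hypothetical helper, so please check the evaluation server's instructions before uploading.

```python
import json
from typing import Dict, List


def build_submission(predictions: Dict[int, List[str]],
                     metric: str,
                     top_k: int) -> Dict[str, object]:
    """Assemble a submission dict: header fields plus top-k names per pair ID."""
    payload: Dict[str, object] = {"version": "rc2", "metric": metric}
    for pair_id, names in predictions.items():
        payload[str(pair_id)] = list(names[:top_k])  # JSON keys must be strings
    return payload


if __name__ == "__main__":
    # Toy predictions for a single query pair; real rankings come from the model.
    toy = {12: ["test1-0-0-img0", "test1-0-0-img1", "test1-1-0-img0"]}
    print(json.dumps(build_submission(toy, metric="recall", top_k=2), indent=2))
```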
🤝 Acknowledgements
This project builds upon recent advancements in Composed Image Retrieval and Vision-Language pre-training. We express our sincere gratitude to the open-source community for their contributions. Supported in part by the National Natural Science Foundation of China.
✉️ Contact
If you have any questions, feel free to open an issue or reach out to me at zivczw@gmail.com ☺️
🔗 Related Projects
Ecosystem & Other Works from our Team
- TEMA (ACL'26): Paper | Project | Code
- ConeSep (CVPR'26): Paper | Project | Code | Blog Post (Chinese)
- Air-Know (CVPR'26): Paper | Project | Code | Blog Post (Chinese)
- HABIT (AAAI'26): Project | Code | Paper
- ReTrack (AAAI'26): Project | Code | Paper
- INTENT (AAAI'26): Project | Code | Paper
- HUD (ACM MM'25): Project | Code | Paper
- ENCODER (AAAI'25): Project | Code | Paper
📝⭐️ Citation
If you find our work or this code useful in your research, please consider leaving a Star⭐️ or Citing📝 our paper 🥰. Your support is our greatest motivation!
@inproceedings{OFFSET,
title = {OFFSET: Segmentation-based Focus Shift Revision for Composed Image Retrieval},
author = {Chen, Zhiwei and Hu, Yupeng and Li, Zixu and Fu, Zhiheng and Song, Xuemeng and Nie, Liqiang},
booktitle = {Proceedings of the ACM International Conference on Multimedia},
pages = {6113--6122},
year = {2025}
}
🫡 Support & Contributing
We welcome all forms of contributions! If you have any questions, ideas, or find a bug, please feel free to:
- Open an Issue for discussions or bug reports.
- Submit a Pull Request to improve the codebase.
📄 License
This project is released under the terms of the LICENSE file included in this repository.







