README.md

May 25, 2026 · View on GitHub

[CVPR 2026] Air-Know: Arbiter-Calibrated Knowledge-Internalizing Robust Network for Composed Image Retrieval

Zhiheng Fu¹ Yupeng Hu^1✉, Qianyun Yang¹ Shiqi Zhang¹ Zhiwei Chen¹ Zixu Li¹

¹School of Software, Shandong University
^✉Corresponding author

📌 Introduction

Welcome to the official repository for Air-Know. This is about Noisy Correspondence Learning (NCL) and Composed Image Retrieval (CIR).

Disclaimer: This codebase is intended for research purposes.

📢 News and Updates

[2026-04-22] 🚀 Arxiv version is released.
[2026-04-02] 🚀 All codes are released.
[2026-02-21] 🔥 Air-Know is accepted by CVPR 2026. Codes are coming soon.

Air-Know Pipeline (based on LAVIS)

airknow architecture

Figure 1. The proposed Air-Know consists of three primary modules: (a) External Prior Arbitration leverages an offline multimodal expert to generate reliable arbitration priors for CIR triplets, bypassing the unreliable small-loss hypothesis. (b) Expert-Knowledge Internalization transfers these priors into a lightweight proxy network, structurally preventing the memorization of ambiguous partial matches. Finally, (c) Dual-Stream Reconciliation dynamically integrates the internalized knowledge to provide robust online feedback, guiding the final representation learning. Figure best viewed in color.

Experiment Results
Install
Project Structure
Data Preparation
Quick Start
Acknowledgement
Contact
Related Projects
Citation

🏃‍♂️ Experiment-Results

CIR Task Performance

💡 Note for Fully-Supervised CIR Benchmarking:
🎯 The 0% noise setting in the tables below is equivalent to the traditional fully-supervised CIR paradigm. We highlight this 0% block to facilitate direct and fair comparisons for researchers working on conventional supervised methods.

FashionIQ:

Table 1. Performance comparison on FashionIQ validation set in terms of R@K (%). The best result under each noise ratio is highlighted in bold, while the second-best result is underlined.

CIRR:

Table 2. Performance comparison on the CIRR test set in terms of R@K (%) and Rsub@K (%). The best and second-best results are highlighted in bold and underlined, respectively.

⬆ Back to top

📦 Install

1. Clone the repository

git clone https://github.com/ZhihFu/Air-Know
cd Air-Know

2. Setup Python Environment

The code is evaluated on Python 3.8.10 and CUDA 12.6. We recommend using Anaconda to create an isolated virtual environment:

conda create -n conesep python=3.8
conda activate conesep

# Install PyTorch (The evaluated environment uses Torch 2.1.0 with CUDA 12.1 compatibility)
pip install torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 --index-url [https://download.pytorch.org/whl/cu121](https://download.pytorch.org/whl/cu121)

# Install core dependencies
pip install scikit-learn==1.3.2 transformers==4.25.0 salesforce-lavis==1.0.2 timm==0.9.16

⬆ Back to top

📂 Project Structure

To help you navigate our codebase quickly, here is an overview of the main components:

├── lavis/                 # Core model directory (built upon LAVIS)
│   └── models/
│       └── blip2_models/
│           └── blip2_cir.py   # 🧠 The core model implementation.
├── train_BLIP2.py        # 🚂 Main training script
├── test_BLIP2.py                # 🧪 General evaluation script
├── cirr_sub_BLIP2.py      # 📤 Script to generate submission files for the CIRR dataset
├── datasets.py            # 📊 Data loading and processing utilities
└── utils.py               # 🛠️ Helper functions (logging, metrics, etc.)

💾 Data Preparation

Before training or testing, you need to download and structure the datasets.

Download the CIRR / FashionIQ dataset from CIRR official repo and FashionIQ official repo.

Organize the data as follows:

1) FashionIQ:

├── FashionIQ
│   ├── captions
|   |   ├── cap.dress.[train | val].json
|   |   ├── cap.toptee.[train | val].json
|   |   ├── cap.shirt.[train | val].json

│   ├── image_splits
|   |   ├── split.dress.[train | val | test].json
|   |   ├── split.toptee.[train | val | test].json
|   |   ├── split.shirt.[train | val | test].json

│   ├── dress
|   |   ├── [B000ALGQSY.jpg | B000AY2892.jpg | B000AYI3L4.jpg |...]

│   ├── shirt
|   |   ├── [B00006M009.jpg | B00006M00B.jpg | B00006M6IH.jpg | ...]

│   ├── toptee
|   |   ├── [B0000DZQD6.jpg | B000A33FTU.jpg | B000AS2OVA.jpg | ...]

2) CIRR:

├── CIRR
│   ├── train
|   |   ├── [0 | 1 | 2 | ...]
|   |   |   ├── [train-10108-0-img0.png | train-10108-0-img1.png | ...]

│   ├── dev
|   |   ├── [dev-0-0-img0.png | dev-0-0-img1.png | ...]

│   ├── test1
|   |   ├── [test1-0-0-img0.png | test1-0-0-img1.png | ...]

│   ├── cirr
|   |   ├── captions
|   |   |   ├── cap.rc2.[train | val | test1].json
|   |   ├── image_splits
|   |   |   ├── split.rc2.[train | val | test1].json

(Note: Please modify datasets.py if your local data paths differ from the default setup.)

⬆ Back to top

🚀 Quick Start

1. Training under Noisy Settings

In our implementation, we introduce the noise_ratio parameter to simulate varying degrees of NTC (Noisy Triplet Correspondence) interference. You can reproduce the experimental results from the paper by modifying the --noise_ratio parameter (default options evaluated are 0.0, 0.2, 0.5, 0.8).

Training on FashionIQ:

python train_BLIP2.py \
    --dataset fashioniq \
    --fashioniq_path "/path/to/FashionIQ/" \
    --model_dir "./checkpoints/fashioniq_noise0.8" \
    --noise_ratio 0.8 \
    --batch_size 256 \
    --num_epochs 20 \
    --lr 1e-5

Training on CIRR:

python train_BLIP2.py \
    --dataset cirr \
    --cirr_path "/path/to/CIRR/" \
    --model_dir "./checkpoints/cirr_noise0.8" \
    --noise_ratio 0.8 \
    --batch_size 256 \
    --num_epochs 20 \
    --lr 2e-5

2. Testing

To generate the prediction files on the CIRR dataset for submission to the CIRR Evaluation Server, run the following command:

python src/cirr_test_submission.py checkpoints/cirr_noise0.8/

(The corresponding script will automatically output .json based on the generated best checkpoints in the folder for online evaluation.)

⬆ Back to top

🙏 Acknowledgements

This codebase is heavily inspired by and built upon the excellent Salesforce LAVIS, SPRC and TME library. We thank the authors for their open-source contributions.

⬆ Back to top

✉️ Contact

For any questions, issues, or feedback, please open an issue on GitHub or reach out to us at fuzhiheng8@gmail.com

⬆ Back to top

Ecosystem & Other Works from our Team

TEMA (ACL'26) Paper \| Project \| Code	ConeSep (CVPR'26) Paper \| Project \| Code \| Blog Post (Chinese)	HABIT (AAAI'26) Paper \| Project \| Code
ReTrack (AAAI'26) Paper \| Project \| Code	INTENT (AAAI'26) Paper \| Project \| Code	HUD (ACM MM'25) Paper \| Project \| Code
OFFSET (ACM MM'25) Paper \| Project \| Code	ENCODER (AAAI'25) Paper \| Project \| Code

📝⭐️ Citation

If you find our work or this code useful in your research, please consider leaving a Star⭐️ or Citing📝 our paper 🥰. Your support is our greatest motivation!

@InProceedings{Air-Know,
    title={Air-Know: Arbiter-Calibrated Knowledge-Internalizing Robust Network for Composed Image Retrieval},
    author={Fu, Zhiheng and Hu, Yupeng and Qianyun Yang and Shiqi Zhang and Chen, Zhiwei and Li, Zixu},
    booktitle={Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR)},
    year = {2026}
}

⬆ Back to top

📄 License

This project is released under the terms of the LICENSE file included in this repository.

If this project helps you, please leave a Star!

TEMA (ACL'26) Paper \| Project \| Code	ConeSep (CVPR'26) Paper \| Project \| Code \| Blog Post (Chinese)	HABIT (AAAI'26) Paper \| Project \| Code
ReTrack (AAAI'26) Paper \| Project \| Code	INTENT (AAAI'26) Paper \| Project \| Code	HUD (ACM MM'25) Paper \| Project \| Code
OFFSET (ACM MM'25) Paper \| Project \| Code	ENCODER (AAAI'25) Paper \| Project \| Code