February 3, 2026

Debiased Orthogonal Boundary-Driven Efficient Noise Mitigation

Hao Li¹*, Jiayang Gu¹*, Jingkuan Song¹†, An Zhang², Lianli Gao¹

¹University of Electronic Science and Technology of China

²National University of Singapore

This is the official implementation of the paper "Debiased Orthogonal Boundary-Driven Efficient Noise Mitigation".

Requirements

We recommend the following dependencies.

Python 3.8
PyTorch 1.13.1

Then install the remaining dependencies with:

pip install -r requirements.txt

The recommended GPU memory is at least 24 GB.
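Before training, it can help to confirm that the environment matches these recommendations. The following is a minimal sketch, not part of the repository: the helper names (`python_ok`, `gpu_memory_ok`, `report`) are hypothetical, and the threshold treats 1 GB as 10^9 bytes, which is an assumption about how the 24 GB figure should be read.

```python
import sys

MIN_PYTHON = (3, 8)   # Python version recommended above
MIN_GPU_GB = 24       # recommended GPU memory, in GB (assumed to mean 10**9 bytes)


def python_ok(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the interpreter is at least the recommended version."""
    return tuple(version_info[:2]) >= tuple(minimum)


def gpu_memory_ok(total_bytes, min_gb=MIN_GPU_GB):
    """Return True if a device with `total_bytes` of memory meets the recommendation."""
    return total_bytes >= min_gb * 10 ** 9


def report():
    """Print a short environment summary; PyTorch is imported only if present."""
    status = "ok" if python_ok() else "older than recommended"
    print(f"Python {sys.version_info.major}.{sys.version_info.minor}: {status}")
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed")
        return
    print(f"PyTorch {torch.__version__}")
    if not torch.cuda.is_available():
        print("No CUDA device visible")
        return
    total = torch.cuda.get_device_properties(0).total_memory
    status = "ok" if gpu_memory_ok(total) else "below recommendation"
    print(f"GPU 0 memory: {total / 10 ** 9:.1f} GB ({status})")


if __name__ == "__main__":
    report()
```

Smaller GPUs may still work with a reduced `--batch_size`, but that configuration is not covered by this README.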

⚙️ Dataset Preparation

Annotation Preparation

We follow the same split provided by NPC.

First, download and extract the prepared datasets directory, which contains all the annotation files. You can access it here.

The datasets directory tree looks like this:

${DATASETS}/
├── MSCOCO/
│    ├── annotations/
│    │       ├── 0.0_clean_index.npy
│    │       ├── 0.0_noise_train_caps.txt
│    │       ├── 0.2_clean_index.npy
│    │       ├── ...
│    │       ├── train_caps.txt
│    │       ├── train_ids.txt
│    │       ├── dev_caps.txt
│    │       └── ...
│    └── images/ # empty
│
├── Flickr30K/
│    ├── annotations/
│    │       ├── 0.0_clean_index.npy
│    │       ├── 0.0_noise_train_caps.txt
│    │       ├── 0.2_clean_index.npy
│    │       ├── ...
│    │       ├── train_caps.txt
│    │       ├── train_ids.txt
│    │       ├── dev_caps.txt
│    │       └── ...
│    └── images/
│
└── CC120K/
     ├── annotations/
     │       ├── train_caps.txt
     │       ├── train_ids.txt
     │       ├── dev_caps.txt
     │       └── ...
     └── images/ # empty

The images directories are still empty. Download the images separately and place them directly in the images folder of the corresponding dataset. We adopt the same image download and processing method as NPC.
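Once the annotations and images are in place, the layout can be checked with a few lines of Python. This is a sketch under assumptions: the function name `missing_entries` is hypothetical, and it only checks the files named explicitly in the tree above, not the entries elided with `...`.

```python
from pathlib import Path

# Annotation files listed in the directory tree for every dataset.
COMMON_FILES = ("train_caps.txt", "train_ids.txt", "dev_caps.txt")


def missing_entries(dataset_dir, noise_ratio=None):
    """Return a list of human-readable problems found under one dataset folder."""
    root = Path(dataset_dir)
    ann = root / "annotations"
    expected = list(COMMON_FILES)
    if noise_ratio is not None:
        # Synthetic-noise files follow the "<ratio>_..." pattern shown in the tree.
        expected += [f"{noise_ratio}_clean_index.npy",
                     f"{noise_ratio}_noise_train_caps.txt"]
    problems = [f"missing {ann / name}" for name in expected
                if not (ann / name).is_file()]
    images = root / "images"
    if not images.is_dir() or not any(images.iterdir()):
        problems.append(f"{images} is missing or empty")
    return problems
```

For example, `missing_entries("/data/MSCOCO", noise_ratio=0.2)` (the path is a placeholder) returns an empty list when the folder is ready for training with a noise ratio of 0.2. Pass `noise_ratio=None` for CC120K, which has no synthetic-noise files.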

Download Link

MSCOCO. We unified the image file name format of the MSCOCO dataset for easier use. You can use util.py to rename the MSCOCO images accordingly.
Flickr30K.
CC120K. You can download the dataset from this link with the extraction code "3ble".

🔥 Training

The following are the training instructions for each dataset. Set ${DATASETS} to the datasets folder configured above, and set ${SAVE_PATH} to the path where the model should be saved.

Training on MS-COCO:

You can adjust the noise ratio in the training set by changing ${NOISE_RATIO}, which can be selected from the following values: [0.0, 0.2, 0.4, 0.5, 0.6].

python main_clip.py --batch_size 256 --epochs 5 --lr 1e-5 --warmup 500 --vision_model ViT-B/32 --dataset coco --dataset_root ${DATASETS}/MSCOCO --checkpoint_path ${SAVE_PATH} --noise_ratio ${NOISE_RATIO}

Training on Flickr30K:

You can adjust the noise ratio in the training set by changing ${NOISE_RATIO}, which can be selected from the following values: [0.0, 0.2, 0.4, 0.6].

python main_clip.py --batch_size 256 --epochs 5 --lr 1e-5 --warmup 500 --vision_model ViT-B/32 --dataset f30k --dataset_root ${DATASETS}/Flickr30K --checkpoint_path ${SAVE_PATH} --noise_ratio ${NOISE_RATIO}

Training on CC120K:

CC120K is a real-world noisy dataset, so the noise ratio does not need to be specified.

python main_clip.py --batch_size 256 --epochs 10 --lr 1e-5 --warmup 500 --vision_model ViT-B/32 --dataset cc --dataset_root ${DATASETS}/CC120K --checkpoint_path ${SAVE_PATH}
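The three training commands differ only in the dataset flag, the dataset folder, the number of epochs, and whether a noise ratio is passed. As a rough sketch, a small launcher could assemble them and reject unsupported noise ratios before a long run starts. The helper `train_command` and the `DATASETS` table are hypothetical and not part of the repository; every flag and value is copied from the commands above.

```python
import shlex

# Dataset flag -> (folder under the datasets root, allowed noise ratios, epochs).
# None means the dataset takes no --noise_ratio flag.
DATASETS = {
    "coco": ("MSCOCO", (0.0, 0.2, 0.4, 0.5, 0.6), 5),
    "f30k": ("Flickr30K", (0.0, 0.2, 0.4, 0.6), 5),
    "cc": ("CC120K", None, 10),
}


def train_command(dataset, datasets_root, save_path, noise_ratio=None):
    """Build the training command for one dataset as an argument list."""
    folder, allowed, epochs = DATASETS[dataset]
    if allowed is None:
        if noise_ratio is not None:
            raise ValueError(f"{dataset} does not take a noise ratio")
    elif noise_ratio not in allowed:
        raise ValueError(f"noise ratio for {dataset} must be one of {allowed}")
    cmd = ["python", "main_clip.py",
           "--batch_size", "256", "--epochs", str(epochs),
           "--lr", "1e-5", "--warmup", "500",
           "--vision_model", "ViT-B/32",
           "--dataset", dataset,
           "--dataset_root", f"{datasets_root.rstrip('/')}/{folder}",
           "--checkpoint_path", save_path]
    if allowed is not None:
        cmd += ["--noise_ratio", str(noise_ratio)]
    return cmd


if __name__ == "__main__":
    # Placeholder paths; print the command instead of running it.
    print(shlex.join(train_command("coco", "/data", "/checkpoints/coco", 0.2)))
```

Returning an argument list rather than a string keeps the command safe to pass to `subprocess.run` without shell quoting issues.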

📋 Evaluation

Set ${MODEL_PATH} to the checkpoint you want to evaluate.

Evaluation on MS-COCO:

python main_clip.py --eval --vision_model ViT-B/32 --dataset coco --dataset_root ${DATASETS}/MSCOCO --resume ${MODEL_PATH}

Evaluation on Flickr30K:

python main_clip.py --eval --vision_model ViT-B/32 --dataset f30k --dataset_root ${DATASETS}/Flickr30K --resume ${MODEL_PATH}

Evaluation on CC120K:

python main_clip.py --eval --vision_model ViT-B/32 --dataset cc --dataset_root ${DATASETS}/CC120K --resume ${MODEL_PATH}
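When several checkpoints need evaluating, the commands above can be driven from Python. The sketch below assumes one checkpoint per dataset; `eval_command`, `evaluate`, and `EVAL_FOLDERS` are hypothetical names, and only the flags shown in the evaluation commands are used.

```python
import subprocess

# Dataset flag -> folder name, as used in the evaluation commands above.
EVAL_FOLDERS = {"coco": "MSCOCO", "f30k": "Flickr30K", "cc": "CC120K"}


def eval_command(dataset, datasets_root, model_path):
    """Build the evaluation command for one dataset as an argument list."""
    dataset_root = f"{datasets_root.rstrip('/')}/{EVAL_FOLDERS[dataset]}"
    return ["python", "main_clip.py", "--eval",
            "--vision_model", "ViT-B/32",
            "--dataset", dataset,
            "--dataset_root", dataset_root,
            "--resume", model_path]


def evaluate(checkpoints, datasets_root, run=subprocess.run):
    """Evaluate each {dataset: checkpoint} pair, stopping at the first failure."""
    for dataset, model_path in checkpoints.items():
        run(eval_command(dataset, datasets_root, model_path), check=True)
```

The `run` parameter exists only so the loop can be exercised without launching real jobs; by default it calls `subprocess.run`, which raises `CalledProcessError` if an evaluation exits with a non-zero status.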

Reference

If you find this work useful, please cite the following paper:

@article{OSA,
  author  = {Hao Li and
             Jiayang Gu and
             Jingkuan Song and
             An Zhang and
             Lianli Gao},
  title   = {Debiased Orthogonal Boundary-Driven Efficient Noise Mitigation},
  journal = {arXiv preprint arXiv:2410.01944},
  year    = {2024}
}
