NegRefine: Refining Negative Label-Based Zero-Shot OOD Detection

December 9, 2025

Official implementation of NegRefine, accepted to ICCV 2025.

📄 Paper on arXiv

NegRefine improves negative label-based zero-shot OOD detection by:

  • Filtering subcategories and proper nouns from the negative label set using an LLM
  • Multi-matching-aware scoring that accounts for images matching multiple labels

With these improvements, NegRefine achieves state-of-the-art results on the large-scale ImageNet-1K benchmark.
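
For orientation, the kind of score that negative-label methods start from can be sketched as follows. This is a minimal illustration in the style of NegLabel (which this code builds on), not NegRefine's multi-matching-aware score. The function name `neg_label_score`, the temperature of 0.01, and the toy similarity values are assumptions made for the example. The score is the share of softmax mass, taken over in-distribution (ID) and negative labels together, that falls on the ID labels.

```python
import math

def neg_label_score(id_sims, neg_sims, temperature=0.01):
    """Share of softmax mass on ID labels (higher means more in-distribution).

    id_sims / neg_sims: cosine similarities between one image embedding and
    the ID / negative label text embeddings. The temperature is an assumption.
    """
    logits = [s / temperature for s in list(id_sims) + list(neg_sims)]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(l - peak) for l in logits]
    id_mass = sum(exps[:len(id_sims)])
    return id_mass / sum(exps)

# Toy similarities (hypothetical values, not taken from any experiment).
id_like = neg_label_score([0.31, 0.22], [0.20, 0.18, 0.19])
ood_like = neg_label_score([0.20, 0.19], [0.30, 0.21, 0.18])
print(id_like > ood_like)
```

An image whose best match is a negative label receives a low score. This is why the quality of the negative label set matters: a negative label that is really a subcategory of an ID class pulls ID images toward the OOD side.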

📂 Code Overview

The repository is structured as follows:

neg_refine/
├─ data/                     # Dataset root (add datasets here)
├─ output/                   # Save folder for outputs and results per dataset/seed
│  └─ imagenet/seed_0/       # Example folder for ImageNet with seed 0
├─ scripts/                  # Bash scripts for running experiments
│  └─ ...
├─ src/                      # Python source code
│  ├─ class_names.py         # Dataset class names and prompt templates
│  ├─ clip_ood.py            # Main method for CLIP-based zero-shot OOD detection
│  ├─ create_negs.py         # Generates initial negative labels (CSP-based)
│  ├─ eval.py                # Entry point for experiments and evaluation
│  ├─ neg_filter.py          # LLM-based refinement of negative labels
│  └─ ood_evaluate.py        # OOD evaluation metrics (AUROC, FPR@95, etc.)
├─ txtfiles/                 # WordNet lexicon text files (adjectives/nouns)
│  └─ ...

⚙️ Environment Setup

This project was developed with Python 3.10.12 and PyTorch 2.6.0 on Ubuntu 22.04.
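
A quick way to compare a local environment against these versions is sketched below. The helpers `version_tuple` and `check_environment` are hypothetical and not part of the repository, and matching only on major.minor is an assumption about how strict the check needs to be.

```python
import sys

def version_tuple(version):
    """Parse a version such as '2.6.0+cu124' into (2, 6, 0), dropping build suffixes."""
    core = version.split("+")[0]
    return tuple(int(p) for p in core.split(".")[:3] if p.isdigit())

def check_environment(expected_python=(3, 10), expected_torch=(2, 6)):
    """Report whether Python and PyTorch match the major.minor versions above."""
    report = {"python_ok": sys.version_info[:2] == expected_python}
    try:
        import torch
        report["torch_ok"] = version_tuple(torch.__version__)[:2] == expected_torch
    except ImportError:
        report["torch_ok"] = None  # PyTorch is not installed
    return report

print(check_environment())
```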

📦 Dataset Downloads

Below are the sources for downloading the datasets used in our experiments:

  • ImageNet-1K: Download from the ImageNet Challenge 2012 website. Only the validation data is required.

  • NINCO & Clean: Available from the NINCO GitHub. The provided .tar.gz file includes both the NINCO dataset (NINCO_OOD_classes) and the Clean collection (NINCO_popular_datasets_subsamples, obtained through manual analysis of random samples from 11 common OOD datasets).

  • OpenImage-O: Can be downloaded from OpenOOD using the provided download script.

  • ImageNet-10, ImageNet-20, ImageNet-100: Refer to the MCM GitHub for instructions to create these subsets of ImageNet-1K classes.
    Note: In our experiments, we modified ImageNet-100 to create ImageNet-99 by removing the “race car” class (class n04037443).

  • iNaturalist, SUN, Places, Textures: Download links available on the MOS GitHub.

  • CUB-200, Stanford Cars, Food-101, Oxford Pets: Download links available on the MCM GitHub.

  • Waterbirds (Spurious OOD): Refer to this MCM GitHub issue.
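
The ImageNet-99 note above amounts to dropping one WordNet ID from the ImageNet-100 class list. A minimal sketch, assuming the class list is held as a Python list of WordNet IDs; `drop_class` and the three-entry example list are hypothetical and not taken from the repository (the actual ImageNet-100 list comes from the MCM GitHub):

```python
RACE_CAR_WNID = "n04037443"  # the class removed to form ImageNet-99

def drop_class(wnids, removed=RACE_CAR_WNID):
    """Return the class list without `removed`, preserving the original order."""
    if removed not in wnids:
        raise ValueError(f"{removed} is not in the class list")
    return [w for w in wnids if w != removed]

# Placeholder list for illustration only; this is not the real ImageNet-100 subset.
example = ["n01440764", "n04037443", "n01443537"]
print(drop_class(example))
```

Raising an error when the ID is absent guards against silently running the 100-class setup under the ImageNet-99 name.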

After downloading, place all datasets in the data/ folder.
Refer to (or modify) the load_dataset() function in src/eval.py for the exact folder structure and naming conventions used for data loading.

🚀 Running Experiments

A script for each experiment in the main paper is provided in the scripts/ folder.
Scripts are named after the in-distribution datasets used in the experiments.

For example, to reproduce the ImageNet-1K benchmark, run:

sh scripts/imagenet.sh

The results of each experiment (evaluation metrics, logs, and negative label files) are saved in the output/ folder.
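
To run every experiment in sequence, the scripts can be looped over. A minimal sketch, assuming each file in scripts/ is a standalone .sh script launched from the repository root in the same way as imagenet.sh; `run_all` and its `dry_run` flag are hypothetical helpers, not part of the repository:

```python
import subprocess
from pathlib import Path

def run_all(scripts_dir="scripts", dry_run=True):
    """Run each *.sh script in name order; with dry_run=True only list the commands."""
    commands = [["sh", str(p)] for p in sorted(Path(scripts_dir).glob("*.sh"))]
    for cmd in commands:
        if not dry_run:
            # check=True raises CalledProcessError, stopping at the first failing script
            subprocess.run(cmd, check=True)
    return commands
```

Calling `run_all()` first shows what would be executed; `run_all(dry_run=False)` then launches the scripts one after another.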

📊 Example Results

As an illustration, we provide the saved results for ImageNet-1K with seed 0, available in output/imagenet/seed_0/. These include the saved negative labels, LLM refinement logs, and final evaluation results.

Results (In-Distribution: ImageNet-1K, Seed 0):

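| OOD Dataset    | AUROC (%) | FPR@95 (%) |
|----------------|-----------|------------|
| ⭐ iNaturalist | 99.57     | 1.51       |
| ⭐ OpenImage-O | 95.02     | 24.03      |
| ⭐ Clean       | 90.70     | 33.04      |
| ⭐ NINCO       | 81.90     | 62.11      |
| SUN            | 94.64     | 22.93      |
| Places         | 90.42     | 39.10      |
| Textures       | 94.69     | 21.15      |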

Note: Only the first four datasets are considered valid OOD data and are included in the main paper results, as they contain minimal or no in-distribution contamination. In contrast, SUN, Places, and Textures contain notable overlap with ImageNet-1K classes, leading to in-distribution contamination. For further discussion, refer to our paper and the NINCO paper.

The table above shows results for ImageNet-1K with seed 0.
For the complete set of experiments and results, averaged over 10 seeds, please refer to our main paper.
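
For readers unfamiliar with the two reported metrics, the sketch below shows one common way to compute them from per-image scores. It assumes ID is the positive class and that a higher score means "more in-distribution"; the function names and toy scores are hypothetical, and src/ood_evaluate.py may differ in details such as tie handling or threshold selection.

```python
import math

def auroc(id_scores, ood_scores):
    """Probability that a random ID sample outscores a random OOD sample (ties count 0.5)."""
    wins = 0.0
    for s_id in id_scores:
        for s_ood in ood_scores:
            if s_id > s_ood:
                wins += 1.0
            elif s_id == s_ood:
                wins += 0.5
    return wins / (len(id_scores) * len(ood_scores))

def fpr_at_95_tpr(id_scores, ood_scores):
    """Share of OOD samples accepted as ID at the threshold that keeps at least 95% of ID samples."""
    ranked = sorted(id_scores, reverse=True)
    keep = math.ceil(0.95 * len(ranked))  # number of ID samples that must pass
    threshold = ranked[keep - 1]
    return sum(s >= threshold for s in ood_scores) / len(ood_scores)

# Toy scores (hypothetical values).
id_scores = [0.9, 0.8, 0.7, 0.6]
ood_scores = [0.65, 0.3, 0.2, 0.1]
print(auroc(id_scores, ood_scores), fpr_at_95_tpr(id_scores, ood_scores))
```

On this toy input the functions return 0.9375 and 0.25. The pairwise AUROC loop is quadratic and is written for clarity; a rank-based computation is preferable for full-size datasets.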

🙏 Acknowledgements

Our code is built on the excellent work of CSP and NegLabel. We sincerely thank the authors.

📖 Citation

If you find this work useful in your research, please consider citing our paper:

@inproceedings{ansari2025negrefine,
  title={NegRefine: Refining Negative Label-Based Zero-Shot OOD Detection},
  author={Ansari, Amirhossein and Wang, Ke and Xiong, Pulei},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={573--582},
  year={2025}
}