[ICCV2025] RareCLIP: Rarity-aware Online Zero-shot Industrial Anomaly Detection (Paper Link)
October 20, 2025
Poster
Abstract
Large vision-language models such as CLIP have made significant strides in zero-shot anomaly detection through prompt engineering. However, most existing methods typically process each test image individually, ignoring the practical rarity of abnormal patches in real-world scenarios. Although some batch-based approaches exploit the rarity by processing multiple samples concurrently, they generally introduce unacceptable latency for real-time applications. To mitigate these limitations, we propose RareCLIP, a novel online zero-shot anomaly detection framework that enables sequential image processing in real-time without requiring prior knowledge of the target domain. RareCLIP capitalizes on the zero-shot capabilities of CLIP and integrates a dynamic test-time rarity estimation mechanism. A key innovation of our framework is the introduction of a prototype patch feature memory bank, which aggregates representative features from historical observations and continuously updates their corresponding rarity measures. For each incoming image patch, RareCLIP computes a rarity score by aggregating the rarity measures of its nearest neighbors within the memory bank. Moreover, we introduce a prototype sampling strategy based on dissimilarity to enhance computational efficiency, as well as a similarity calibration strategy to enhance the robustness of rarity estimation. Extensive experiments demonstrate that RareCLIP attains state-of-the-art performance with 98.2% image-level AUROC on MVTec AD and 94.4% on VisA, while achieving a latency of 59.4 ms.
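The rarity idea in the abstract can be illustrated with a toy sketch. This is not the RareCLIP implementation: the class name `PrototypeBank`, the count-based rarity measure, the `add_threshold` cutoff, and the neighbour count `k` are all assumptions made for illustration, and the similarity calibration step is left out. It only shows the general pattern of scoring a patch from its nearest prototypes and then updating the bank, assuming non-zero feature vectors.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


class PrototypeBank:
    """Toy prototype memory bank; every rule here is an illustrative assumption."""

    def __init__(self, add_threshold=0.9, k=2):
        self.prototypes = []                # representative patch features
        self.counts = []                    # patches absorbed by each prototype
        self.total = 0                      # patches seen so far
        self.add_threshold = add_threshold  # hypothetical similarity cutoff
        self.k = k                          # hypothetical number of neighbours

    def rarity(self, index):
        # Assumed rarity measure: 1 minus the prototype's share of all patches.
        return 1.0 - self.counts[index] / self.total

    def score(self, feature):
        """Average rarity of the k most similar prototypes (1.0 for an empty bank)."""
        if not self.prototypes:
            return 1.0
        sims = [cosine(feature, p) for p in self.prototypes]
        order = sorted(range(len(sims)), key=lambda i: sims[i], reverse=True)
        nearest = order[: self.k]
        return sum(self.rarity(i) for i in nearest) / len(nearest)

    def update(self, feature):
        """Absorb the patch into its nearest prototype, or store it as a new one."""
        self.total += 1
        if self.prototypes:
            sims = [cosine(feature, p) for p in self.prototypes]
            best = max(range(len(sims)), key=lambda i: sims[i])
            if sims[best] >= self.add_threshold:
                self.counts[best] += 1
                return
        # Dissimilar to everything stored so far: keep it as a new prototype.
        self.prototypes.append(list(feature))
        self.counts.append(1)


# Usage: nine identical "normal" patches followed by one unusual patch.
bank = PrototypeBank(add_threshold=0.9, k=1)
for _ in range(9):
    bank.update([1.0, 0.0])
bank.update([0.0, 1.0])
common = bank.score([1.0, 0.0])  # low: its prototype holds most of the patches
rare = bank.score([0.0, 1.0])    # high: its prototype was seen only once
```

Only patches that are dissimilar to every stored prototype enlarge the bank, so memory grows with the variety of the data rather than with the number of images processed.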
Installation
Create a new conda environment and install required packages.
conda create -n RareCLIP python=3.10
conda activate RareCLIP
pip install -r requirements.txt
Experiments were conducted on an NVIDIA GeForce RTX 3090 (24 GB). The same GPU and package versions are recommended.
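To confirm that the active environment matches the recommended package versions, a small check along the following lines may help. It is a sketch that assumes requirements.txt pins packages as `name==version`; the helper names `parse_pins` and `find_mismatches` are hypothetical and not part of this repository.

```python
from importlib import metadata


def parse_pins(requirements_text):
    """Return {package: version} for lines of the form 'name==version'."""
    pins = {}
    for raw in requirements_text.splitlines():
        # Drop trailing comments and environment markers, then whitespace.
        line = raw.split("#", 1)[0].split(";", 1)[0].strip()
        if "==" not in line:
            continue  # ignore unpinned requirements and option lines
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins


def find_mismatches(pins):
    """Return {package: (pinned, installed)} for every package that differs."""
    mismatches = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # the package is not installed at all
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches
```

Reading requirements.txt and passing its text through `find_mismatches(parse_pins(text))` gives an empty dictionary when every pinned package is installed at the pinned version.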
Dataset Preparation
MVTec AD
Download and extract MVTec AD into ../dataset/mvtec.
data
├── mvtec
    ├── bottle
        ├── train
            ├── good
                ├── 000.png
        ├── test
            ├── good
                ├── 000.png
            ├── anomaly1
                ├── 000.png
        ├── ground_truth
            ├── anomaly1
                ├── 000.png
    ├── cable
    ├── capsule
    ├── ....
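Before running anything, it can be worth checking that the extracted folders follow the layout above. The sketch below does that with `pathlib` only; `check_mvtec` and `check_mvtec_category` are hypothetical helper names, not functions from this repository, and the check looks at directory names only, not at the images themselves.

```python
from pathlib import Path


def check_mvtec_category(category_dir):
    """Return a list of layout problems for one category (empty if none)."""
    category_dir = Path(category_dir)
    problems = []
    for required in ("train/good", "test/good", "ground_truth"):
        if not (category_dir / required).is_dir():
            problems.append(f"missing directory: {required}")
    test_dir = category_dir / "test"
    if test_dir.is_dir():
        for defect_dir in sorted(p for p in test_dir.iterdir() if p.is_dir()):
            if defect_dir.name == "good":
                continue  # normal test images have no masks
            if not (category_dir / "ground_truth" / defect_dir.name).is_dir():
                problems.append(f"no ground_truth folder for: {defect_dir.name}")
    return problems


def check_mvtec(root):
    """Map each category folder under root to its list of problems."""
    root = Path(root)
    return {
        p.name: check_mvtec_category(p)
        for p in sorted(root.iterdir())
        if p.is_dir()
    }
```

Calling `check_mvtec("../dataset/mvtec")` returns one entry per category; an empty list means that category matches the expected structure.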
VisA
Download and extract VisA into ../dataset/visa.
data
├── visa
    ├── candle
        ├── Data
            ├── Images
                ├── Anomaly
                    ├── 000.JPG
                ├── Normal
                    ├── 0000.JPG
            ├── Masks
                ├── Anomaly
                    ├── 000.png
    ├── capsules
    ├── cashew
    ├── ...
    ├── split_csv
        ├── 1cls.csv
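In the layout above, an anomalous image and its mask share a file stem (000.JPG and 000.png). A minimal sketch for pairing them and spotting images without a mask, assuming that naming holds for every file, could look like this; `pair_visa_anomalies` is a hypothetical helper, not part of this repository.

```python
from pathlib import Path


def pair_visa_anomalies(category_dir):
    """Pair each anomalous image with the mask that shares its file stem.

    Returns (pairs, unmatched): a list of (image, mask) paths and a list of
    anomalous images for which no mask was found.
    """
    data_dir = Path(category_dir) / "Data"
    masks = {p.stem: p for p in (data_dir / "Masks" / "Anomaly").glob("*.png")}
    pairs, unmatched = [], []
    for image in sorted((data_dir / "Images" / "Anomaly").glob("*.JPG")):
        if image.stem in masks:
            pairs.append((image, masks[image.stem]))
        else:
            unmatched.append(image)
    return pairs, unmatched
```

For example, `pair_visa_anomalies("../dataset/visa/candle")` returns the matched pairs for the candle category along with any anomalous images that lack a mask.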
Test
Set parameters in test.sh.
test: name of the test dataset (options: mvtec, visa)
test_set_path: path to the test dataset
save_path: path where logs are saved
seed: random seed
load_path: path to the TPB weights to load (use the provided default weights or retrain them yourself)
Then run the following command:
sh test.sh
Train
Train on MVTec AD:
python train.py --train mvtec --train_set_path ../dataset/mvtec --save_path ./mvtec_pretrained
Train on VisA:
python train.py --train visa --train_set_path ../dataset/visa --save_path ./visa_pretrained
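To train on both datasets in one go, the two commands above can be assembled and launched from Python. This is a convenience sketch: `build_train_command` and `train_all` are hypothetical helpers, and only the flags and paths already shown above are used.

```python
import subprocess
import sys


def build_train_command(dataset, dataset_root="../dataset"):
    """Assemble the train.py command line for 'mvtec' or 'visa'."""
    if dataset not in ("mvtec", "visa"):
        raise ValueError(f"unsupported dataset: {dataset}")
    return [
        sys.executable, "train.py",
        "--train", dataset,
        "--train_set_path", f"{dataset_root}/{dataset}",
        "--save_path", f"./{dataset}_pretrained",
    ]


def train_all(datasets=("mvtec", "visa")):
    """Run training for each dataset in turn, stopping at the first failure."""
    for dataset in datasets:
        subprocess.run(build_train_command(dataset), check=True)
```

Calling `train_all()` from the repository root runs the MVTec AD training first and the VisA training second, each with its own save path.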
Results
Comparison of image-level and pixel-level zero-shot anomaly detection on the MVTec AD and VisA datasets.
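For readers unfamiliar with the metric, image-level AUROC is the probability that a randomly chosen anomalous image receives a higher anomaly score than a randomly chosen normal one. The sketch below computes it by direct pairwise comparison; it is an illustration of the metric, not the evaluation code used in this repository, and the function name `image_level_auroc` is an assumption.

```python
def image_level_auroc(labels, scores):
    """AUROC from binary labels (1 = anomalous, 0 = normal) and anomaly scores."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one normal and one anomalous image")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # anomalous image ranked above the normal one
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Usage: two normal and two anomalous images.
auroc = image_level_auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

The pairwise form is quadratic in the number of images, which is fine for a quick check but slower than a rank-based implementation on large test sets.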

Acknowledgements
We appreciate the inspiration provided by APRIL-GAN and MuSc for our work.
Citation
If our work is helpful for your research, please consider citing:
@inproceedings{he2025RareCLIP,
  title={RareCLIP: Rarity-aware Online Zero-shot Industrial Anomaly Detection},
  author={He, Jianfang and Cao, Min and Peng, Silong and Xie, Qiong},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
  pages={24478-24487},
  year={2025}
}