Bayesian Non-parametrics for Out-of-distribution Detection
February 14, 2025
Bayesian non-parametrics are a natural solution for out-of-distribution (OOD) detection problems because they model the probability that a sample was generated by an unknown cluster (i.e., an outlier). Here we provide implementations of hierarchical Dirichlet process mixture models (DPMMs) with Gaussian likelihoods for OOD detection. We provide expectation maximization methods for efficient inference and demonstrate the effectiveness of our approach on the OpenOOD benchmark. We analyze the covariance structure of the ViT-B/16 (DeiT) features for the Imagenet dataset, which motivates the application of hierarchical DPMMs and the coupled hierarchical DPMM with diagonal covariance (see Imagenet Dataset Analysis). We also generate synthetic datasets to demonstrate the performance of our approach in different data regimes (see Synthetic Data Experiments). For full details on our approach and experiments, please see our paper "A Bayesian Nonparametric Perspective on Mahalanobis Distance for Out of Distribution Detection".
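To make the "unknown cluster" idea concrete, the following is a toy sketch and not the repository's implementation. It assumes a plain (non-hierarchical) Dirichlet process mixture with fixed diagonal-Gaussian clusters, a concentration parameter `alpha`, and a broad Gaussian standing in for the prior predictive density of a new cluster. All function and variable names are hypothetical.

```python
import numpy as np

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at x."""
    return -0.5 * np.sum(np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var)

def new_cluster_probability(x, means, variances, counts, alpha, prior_mean, prior_var):
    """Posterior probability that x belongs to a previously unseen cluster.

    Existing cluster k is weighted by counts[k] and the new cluster by alpha
    (Chinese-restaurant-process weights). The broad Gaussian (prior_mean,
    prior_var) is an assumed stand-in for the prior predictive density.
    """
    log_w = [np.log(n) + log_gauss_diag(x, m, v)
             for n, m, v in zip(counts, means, variances)]
    log_w.append(np.log(alpha) + log_gauss_diag(x, prior_mean, prior_var))
    log_w = np.array(log_w)
    log_w -= log_w.max()          # stabilize before exponentiating
    w = np.exp(log_w)
    return w[-1] / w.sum()

# Two well-populated clusters and one broad "new cluster" density.
means = [np.array([0.0, 0.0]), np.array([5.0, 5.0])]
variances = [np.ones(2), np.ones(2)]
counts = [100, 100]
prior_mean, prior_var = np.array([2.5, 2.5]), 25.0 * np.ones(2)

p_in = new_cluster_probability(np.array([0.1, -0.2]), means, variances,
                               counts, 1.0, prior_mean, prior_var)
p_out = new_cluster_probability(np.array([-6.0, 9.0]), means, variances,
                                counts, 1.0, prior_mean, prior_var)
print(f"near a known cluster: {p_in:.4f}, far from all clusters: {p_out:.4f}")
```

A point close to a known cluster receives a small new-cluster probability, while a point far from every cluster receives a probability close to one, which is what makes this quantity usable as an OOD score.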
Imagenet Dataset Analysis
The covariance analysis figures discussed in the paper can be generated using the notebook ImagenetDataAnalysis.ipynb. We highlight that the diagonal elements of the empirical covariance matrices are scaled-up or scaled-down versions of their average. This analysis motivates the coupled hierarchical DPMM with diagonal covariance.
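The observation above can be checked with a few lines of NumPy. The sketch below is an illustration on synthetic data, not the notebook's code: `diagonal_scale_analysis` is a hypothetical helper that computes per-class diagonal variances, their average across classes, and the least-squares scalar that best maps the average onto each class.

```python
import numpy as np

def diagonal_scale_analysis(feats, labels):
    """Per-class diagonal variances, their class average, and per-class scales.

    scales[k] is the least-squares scalar s minimizing
    ||diag_vars[k] - s * avg_diag||^2.
    """
    classes = np.unique(labels)
    diag_vars = np.stack([feats[labels == c].var(axis=0, ddof=1) for c in classes])
    avg_diag = diag_vars.mean(axis=0)
    scales = diag_vars @ avg_diag / (avg_diag @ avg_diag)
    return diag_vars, avg_diag, scales

# Synthetic stand-in for real features: each class shares the same
# per-dimension variance profile up to a class-specific multiplier.
rng = np.random.default_rng(0)
base_std = rng.uniform(0.5, 2.0, size=8)
true_scales = np.array([0.5, 1.0, 2.0])
feats = np.concatenate(
    [rng.normal(0.0, np.sqrt(s) * base_std, size=(5000, 8)) for s in true_scales]
)
labels = np.repeat(np.arange(3), 5000)

diag_vars, avg_diag, scales = diagonal_scale_analysis(feats, labels)
# Relative error of the "scaled average" approximation for every entry.
rel_err = np.abs(diag_vars - scales[:, None] * avg_diag) / diag_vars
print("estimated scales:", scales)
print("max relative error:", rel_err.max())
```

If the relative error stays small for real features, a single scale per class applied to a shared diagonal is a reasonable description of the class covariances.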
Synthetic Data Experiments
We sweep over the parameter of the normal-inverse-Wishart (NIW) prior used to generate the synthetic data to demonstrate how sensitive the different models are to how strongly the class covariances are tied, as shown in the figure below.
We also sweep over the number of samples per class. Compared to the independent RMDS model, the hierarchical DPMMs are more robust when the number of samples per class is small.
The experiments can be recreated in the notebook SyntheticExperiments.ipynb.
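As a rough illustration of how such synthetic data could be drawn, here is a minimal NumPy sketch. It is not the notebook's code. It assumes, for illustration only, that the swept NIW parameter is the inverse-Wishart degrees of freedom `nu` (this text does not name it), and it rescales the scale matrix so that the expected class covariance stays at `sigma0`. All names are hypothetical.

```python
import numpy as np

def sample_inverse_wishart(rng, nu, psi):
    """Draw Sigma ~ InvWishart(nu, psi) for integer nu >= dim via a Wishart draw."""
    dim = psi.shape[0]
    chol = np.linalg.cholesky(np.linalg.inv(psi))
    x = rng.standard_normal((nu, dim)) @ chol.T   # rows ~ N(0, psi^{-1})
    sigma = np.linalg.inv(x.T @ x)                # inverse of a Wishart draw
    return 0.5 * (sigma + sigma.T)                # remove numerical asymmetry

def sample_niw_dataset(rng, n_classes, n_per_class, nu, kappa, sigma0):
    """Draw class means and covariances from an NIW prior, then features."""
    dim = sigma0.shape[0]
    psi = (nu - dim - 1) * sigma0                 # assumed scaling: E[Sigma_k] = sigma0
    feats, labels, covs = [], [], []
    for k in range(n_classes):
        sigma_k = sample_inverse_wishart(rng, nu, psi)
        mu_k = rng.multivariate_normal(np.zeros(dim), sigma_k / kappa)
        feats.append(rng.multivariate_normal(mu_k, sigma_k, size=n_per_class))
        labels.append(np.full(n_per_class, k))
        covs.append(sigma_k)
    return np.concatenate(feats), np.concatenate(labels), np.stack(covs)

def covariance_spread(covs, sigma0):
    """Mean Frobenius distance between class covariances and sigma0."""
    return float(np.mean([np.linalg.norm(c - sigma0) for c in covs]))

rng = np.random.default_rng(0)
sigma0 = np.eye(4)
feats, labels, loose_covs = sample_niw_dataset(rng, 20, 50, nu=8, kappa=0.1, sigma0=sigma0)
_, _, tight_covs = sample_niw_dataset(rng, 20, 50, nu=500, kappa=0.1, sigma0=sigma0)

loose_spread = covariance_spread(loose_covs, sigma0)
tight_spread = covariance_spread(tight_covs, sigma0)
print("spread at nu=8:", loose_spread, "spread at nu=500:", tight_spread)
```

Under this assumption, a large `nu` concentrates the class covariances around `sigma0` (strongly tied), while a small `nu` lets them vary widely between classes.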
OpenOOD Experiments
| Model | Accuracy | Near: SSB Hard | Near: NINCO | Near: Avg. | Far: iNaturalist | Far: OpenImageO | Far: Textures | Far: Avg. |
|---|---|---|---|---|---|---|---|---|
| MSP | 80.89 | 71.75 | 79.87 | 75.81 | 88.66 | 85.62 | 84.62 | 86.30 |
| Temp. MSP | 80.89 | 73.29 | 81.27 | 77.28 | 91.23 | 87.81 | 86.78 | 88.61 |
| MDS | 80.41 | 71.45 | 86.48 | 78.97 | 96.00 | 92.34 | 89.38 | 92.57 |
| RMDS | 80.41 | 72.79 | 87.28 | 80.03 | 96.09 | 92.29 | 89.38 | 92.59 |
| Hierarchical DPMMs | | | | | | | | |
| Tied | 80.41 | 71.80 | 86.76 | 79.28 | 96.00 | 92.40 | 89.72 | 92.70 |
| Full | 76.79 | 62.84 | 78.48 | 70.66 | 85.88 | 85.03 | 88.02 | 86.31 |
| Diag. | 76.54 | 73.89 | 87.32 | 80.60 | 95.36 | 90.78 | 86.41 | 90.85 |
| Coupled | 76.51 | 74.47 | 87.48 | 80.98 | 95.51 | 90.63 | 86.02 | 90.72 |
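For readers unfamiliar with the MDS and RMDS baselines in the table, the sketch below shows the commonly described form of both scores: the Mahalanobis distance score uses class means with a single tied covariance, and the relative variant subtracts the distance to a class-agnostic background Gaussian. This is an illustrative NumPy sketch on toy 2-D data with hypothetical function names, not the code used to produce the numbers above.

```python
import numpy as np

def fit_mds(feats, labels):
    """Fit class means, a tied precision matrix, and a background Gaussian."""
    classes = np.unique(labels)
    means = np.stack([feats[labels == c].mean(axis=0) for c in classes])
    centered = feats - means[np.searchsorted(classes, labels)]
    tied_prec = np.linalg.inv(centered.T @ centered / len(feats))
    bg_mean = feats.mean(axis=0)
    bg_prec = np.linalg.inv(np.cov(feats, rowvar=False))
    return means, tied_prec, bg_mean, bg_prec

def sq_mahalanobis(x, mean, prec):
    """Squared Mahalanobis distance, broadcasting over leading axes."""
    diff = x - mean
    return np.einsum("...i,ij,...j->...", diff, prec, diff)

def ood_scores(x, means, tied_prec, bg_mean, bg_prec):
    """Return (mds, rmds) scores; higher means more in-distribution."""
    md = sq_mahalanobis(x[:, None, :], means[None, :, :], tied_prec)  # (n, K)
    md0 = sq_mahalanobis(x, bg_mean, bg_prec)                         # (n,)
    return -md.min(axis=1), -(md - md0[:, None]).min(axis=1)

# Toy data: three in-distribution classes and one far-away OOD blob.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [6.0, 0.0], [0.0, 6.0]])
labels = np.repeat(np.arange(3), 200)
train = centers[labels] + rng.standard_normal((600, 2))
params = fit_mds(train, labels)

id_x = centers[rng.integers(0, 3, size=100)] + rng.standard_normal((100, 2))
ood_x = np.array([12.0, 12.0]) + rng.standard_normal((100, 2))
mds_id, rmds_id = ood_scores(id_x, *params)
mds_ood, rmds_ood = ood_scores(ood_x, *params)
print("MDS  mean score, ID vs OOD:", mds_id.mean(), mds_ood.mean())
print("RMDS mean score, ID vs OOD:", rmds_id.mean(), rmds_ood.mean())
```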
Scripts are provided in scripts/ to reproduce all of the OpenOOD experiments in the paper. The scripts save results with the following directory structure:
    bnp4ood/
        openood_exps/
            {MODEL_NAME}/    # Model names: mds-rmds, full, tied, diag, coupled_diag
                logs/
                results/
The OpenOOD_Results notebook generates the tables presented in the paper from the saved results.
Installation
Ensure that the required Python packages listed in requirements.txt are installed.
Vision Transformer Features
Downloading Features
We provide the features we generated for the OpenOOD experiments in our release here. Before running the experiments, you need to combine the vit-b-16-img1k-feats-part*.pkl files by running the following commands:
python combine_partial_feats.py --feats-file-prefix vit-b-16-img1k-feats
rm vit-b-16-img1k-feats-part*.pkl
Regenerating Features
To regenerate the features for the OpenOOD experiments, you will first need to install the OpenOOD benchmark, available here. We provide a script, available here, that extracts the features from the OpenOOD benchmark and saves them in the following format:
# ID Datasets
{MODEL_NAME}-img1k-feats.pkl # Train
{MODEL_NAME}-img1k-{ID_SPLIT}-feats.pkl # ID Splits: val, test
# OOD Datasets
# OOD Granularities: near, far
# OOD Datasets: ssb_hard, ninco, inaturalist, textures, openimages-o
{MODEL_NAME}-img1k-{OOD_GRANULARITY}_{DATASETNAME}-feats.pkl
where dataset names are the lowercase names of each dataset. To run this script,
copy it to the OpenOOD/scripts/ directory and run the following command:
python openood/extract_imagenet_features.py --model_name {MODEL_NAME}
Once the features have been saved, link or copy them to this repository with the same names.
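Before running the experiments, it can help to confirm that every expected feature file is present. The helper below is a hypothetical convenience sketch that expands the naming scheme above. The near/far grouping follows the results table, the model name `vit-b-16` is inferred from the file prefix used in the download step, and the directory argument is an assumption.

```python
from pathlib import Path

ID_SPLITS = ("val", "test")
# Grouping taken from the Near/Far columns of the results table.
OOD_DATASETS = {
    "near": ("ssb_hard", "ninco"),
    "far": ("inaturalist", "textures", "openimages-o"),
}

def expected_feature_files(model_name):
    """List the feature file names implied by the naming scheme."""
    names = [f"{model_name}-img1k-feats.pkl"]  # training features
    names += [f"{model_name}-img1k-{split}-feats.pkl" for split in ID_SPLITS]
    for granularity, datasets in OOD_DATASETS.items():
        names += [
            f"{model_name}-img1k-{granularity}_{dataset}-feats.pkl"
            for dataset in datasets
        ]
    return names

def missing_feature_files(model_name, feats_dir="."):
    """Return the expected feature files that do not exist in feats_dir."""
    root = Path(feats_dir)
    return [n for n in expected_feature_files(model_name) if not (root / n).exists()]

if __name__ == "__main__":
    for name in missing_feature_files("vit-b-16"):
        print("missing:", name)
```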