[ICCV 2025] HOLa: Zero-Shot HOI Detection with Low-Rank Decomposed VLM Feature Adaptation

January 8, 2026

arXiv

Project Page

Dataset

Follow the process of UPT.

Place the downloaded files as follows. If you store them elsewhere, replace the default paths with your custom locations.

|- HOLa
|   |- hicodet
|   |   |- hico_20160224_det
|   |       |- annotations
|   |       |- images
|   |- vcoco
|   |   |- mscoco2014
|   |       |- train2014
|   |       |- val2014
:   :      
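A quick way to confirm the layout above before training is to check that each expected path exists. The following is a minimal sketch, not part of the repository: the function name `missing_dataset_paths` and the constant `EXPECTED_PATHS` are hypothetical, and the script assumes it is run from the HOLa repository root.

```python
from pathlib import Path

# Relative paths taken from the directory tree above.
EXPECTED_PATHS = [
    "hicodet/hico_20160224_det/annotations",
    "hicodet/hico_20160224_det/images",
    "vcoco/mscoco2014/train2014",
    "vcoco/mscoco2014/val2014",
]


def missing_dataset_paths(repo_root):
    """Return the expected dataset paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [rel for rel in EXPECTED_PATHS if not (root / rel).exists()]


if __name__ == "__main__":
    # Assumption: the current working directory is the HOLa repository root.
    missing = missing_dataset_paths(".")
    if missing:
        print("Missing dataset paths:")
        for rel in missing:
            print("  " + rel)
    else:
        print("Dataset layout looks complete.")
```

If you keep the datasets in custom locations, pass that root directory to `missing_dataset_paths` instead of `"."`.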

Dependencies

  1. Follow the environment setup in UPT.

  2. Follow the environment setup in ADA-CM.

Reminder: If you have already installed the clip package in your Python environment (e.g., via pip install clip), make sure you use the local CLIP directory provided in this repository instead. To do this, add the local CLIP path to PYTHONPATH so that it takes precedence over the installed package.

export PYTHONPATH=$PYTHONPATH:"your_path/HOLa/CLIP"

This lets you use the local clip without uninstalling the clip package from your Python environment.

  3. Modify the installed pocket library as mentioned here.
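To confirm that the PYTHONPATH reminder above took effect, you can ask Python where it would load `clip` from without actually importing it. This is a minimal sketch using the standard-library `importlib.util.find_spec`; the helper name `module_resolves_under` is hypothetical, and `your_path/HOLa/CLIP` is the same placeholder used in the export command.

```python
import importlib.util
from pathlib import Path


def module_resolves_under(module_name, expected_root):
    """Return True if module_name would be imported from inside expected_root."""
    spec = importlib.util.find_spec(module_name)
    if spec is None or spec.origin is None:
        return False  # module not found, or it has no file location
    origin = Path(spec.origin).resolve()
    return Path(expected_root).resolve() in origin.parents


if __name__ == "__main__":
    # Placeholder path from the export command; replace with your own location.
    local_clip = "your_path/HOLa/CLIP"
    if module_resolves_under("clip", local_clip):
        print("clip resolves to the local CLIP directory")
    else:
        print("clip does NOT resolve to the local CLIP directory")
```

Run it in the same shell session where PYTHONPATH was exported, since the variable is only visible to processes started from that shell.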

Scripts

Train / Test on HICO-DET:

Using the ViT-B image backbone:

bash scripts/hico_vitB.sh

Using the ViT-L image backbone:

bash scripts/hico_vitL.sh

Train / Test on V-COCO:

Using the ViT-L image backbone:

bash scripts/vcoco.sh

Model Zoo

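| Dataset  | Setting | Backbone        | mAP   | Unseen | Seen  |
|----------|---------|-----------------|-------|--------|-------|
| HICO-DET | UV      | ResNet-50+ViT-B | 34.09 | 27.91  | 35.09 |
| HICO-DET | RF      | ResNet-50+ViT-B | 34.19 | 30.61  | 35.08 |
| HICO-DET | NF      | ResNet-50+ViT-B | 32.36 | 35.25  | 31.64 |
| HICO-DET | UO      | ResNet-50+ViT-B | 33.59 | 36.45  | 33.02 |

| Dataset  | Setting | Backbone        | mAP   | Rare  | Non-rare |
|----------|---------|-----------------|-------|-------|----------|
| HICO-DET | default | ResNet-50+ViT-B | 35.41 | 34.35 | 35.73    |
| HICO-DET | default | ResNet-50+ViT-L | 39.05 | 38.66 | 39.17    |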
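Because mAP is a mean over HOI classes, the full mAP in each row should equal the class-count-weighted mean of the two split columns. The sketch below checks this for some of the rows. The mAP values are copied from the results above; the class counts (138 rare / 462 non-rare for the default setting, 120 unseen / 480 seen for RF and NF) are not stated in this README and are assumed from the commonly used HICO-DET splits. The names `weighted_map` and `ROWS` are hypothetical.

```python
def weighted_map(map_a, n_a, map_b, n_b):
    """Class-count-weighted mean of two per-split mAP values."""
    return (map_a * n_a + map_b * n_b) / (n_a + n_b)


# (setting, reported full mAP, (split mAP, class count), (split mAP, class count))
# mAP values come from the Model Zoo results; class counts are assumptions.
ROWS = [
    ("RF", 34.19, (30.61, 120), (35.08, 480)),
    ("NF", 32.36, (35.25, 120), (31.64, 480)),
    ("default, ViT-B", 35.41, (34.35, 138), (35.73, 462)),
    ("default, ViT-L", 39.05, (38.66, 138), (39.17, 462)),
]

for name, reported, (map_a, n_a), (map_b, n_b) in ROWS:
    estimate = weighted_map(map_a, n_a, map_b, n_b)
    print(f"{name}: reported {reported:.2f}, weighted estimate {estimate:.2f}")
```

The same check can be applied to your own evaluation output as a sanity test after running the scripts, provided you use the class counts of the split you evaluated on.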

You can download our pretrained model checkpoints using the following link from Google Drive:

https://drive.google.com/drive/folders/1kH-yOi-YqdB35rSgKoRkmg_pGbyFEkUX?usp=sharing

Or use the following Quark link:

Link: https://pan.quark.cn/s/c3f30b122ed2 
Extraction code: yawa

Or download them from the Hugging Face repository:

https://huggingface.co/ChelsieNUS/hola

Citation

If you find our paper and/or code helpful, please consider citing:

@inproceedings{lei2025hola,
  title={HOLa: Zero-Shot HOI Detection with Low-Rank Decomposed VLM Feature Adaptation},
  author={Lei, Qinqian and Wang, Bo and Tan, Robby T.},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  year={2025}
}

Acknowledgement

We thank the authors of UPT and ADA-CM for open-sourcing their code.