[NeurIPS 2024] EZ-HOI: VLM Adaptation via Guided Prompt Learning for Zero-Shot HOI Detection

March 21, 2025

arXiv Project Page

Dataset

Follow the process of UPT.

The downloaded files should be placed as follows. Otherwise, update the default paths to point to your custom locations.

|- EZ-HOI
|   |- hicodet
|   |   |- hico_20160224_det
|   |       |- annotations
|   |       |- images
|   |- vcoco
|   |   |- mscoco2014
|   |       |- train2014
|   |       |- val2014
:   :      
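
Before training, it can help to confirm that the layout above is in place. The following is a minimal sketch and not part of the repository: the helper name `missing_dataset_dirs` and its `root` argument are hypothetical, and the directory list is simply copied from the tree above.

```python
from pathlib import Path

# Directories taken from the layout above, relative to the EZ-HOI root.
EXPECTED_DIRS = [
    "hicodet/hico_20160224_det/annotations",
    "hicodet/hico_20160224_det/images",
    "vcoco/mscoco2014/train2014",
    "vcoco/mscoco2014/val2014",
]


def missing_dataset_dirs(root):
    """Return the expected directories that do not exist under `root`."""
    root = Path(root)
    return [rel for rel in EXPECTED_DIRS if not (root / rel).is_dir()]


if __name__ == "__main__":
    # "." assumes the script is run from the EZ-HOI root; adjust as needed.
    for rel in missing_dataset_dirs("."):
        print(f"missing: {rel}")
```

If you keep the datasets elsewhere, pass that location as `root` or edit `EXPECTED_DIRS` to match your custom paths.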

Dependencies

  1. Follow the environment setup in UPT.

  2. Follow the environment setup in ADA-CM.

Reminder: If you have already installed the clip package in your Python environment (e.g., via pip install clip), please ensure that you use the local CLIP directory provided in our EZ-HOI repository instead. To do this, set the PYTHONPATH to include the local CLIP path so that it takes precedence over the installed package.

export PYTHONPATH=$PYTHONPATH:"your_path/EZ-HOI/CLIP"

This lets you use the local CLIP without uninstalling the clip package from your Python environment.
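
To confirm which copy of `clip` Python will actually load, you can inspect the module's location without importing it. This is a minimal sketch using the standard-library `importlib.util.find_spec`; the helper name `module_origin` is hypothetical and not part of the repository.

```python
import importlib.util


def module_origin(name):
    """Return the file Python would load `name` from, or None.

    None is returned when the module cannot be found (and also for
    namespace packages, which have no single origin file).
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


if __name__ == "__main__":
    # With PYTHONPATH set as described above, the printed path should lie
    # under your_path/EZ-HOI/CLIP rather than under site-packages.
    print(module_origin("clip"))
```

If the printed path points into `site-packages`, the `PYTHONPATH` export has probably not taken effect in the current shell.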

  3. Run the Python script to obtain the pre-extracted CLIP image features:
python CLIP_hicodet_extract.py

Make sure the paths to the annotation files and datasets are correct.

|- EZ-HOI
|   |- hicodet_pkl_files
|   |   |- clip336_img_hicodet_test
|   |   |- clip336_img_hicodet_train
|   |   |- clipbase_img_hicodet_test
|   |   |- clipbase_img_hicodet_train
|   |- vcoco_pkl_files
|   |   |- clip336_img_vcoco_train
|   |   |- clip336_img_vcoco_val
:   :
  4. Modify the installed pocket library as mentioned here.

HICO-DET

Train on HICO-DET:

bash scripts/hico_train_vitB_zs.sh

Test on HICO-DET:

bash scripts/hico_test_vitB_zs.sh

Model Zoo

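| Dataset | Setting | Backbone | mAP | Unseen | Seen |
|---|---|---|---|---|---|
| HICO-DET | UV | ResNet-50+ViT-B | 32.32 | 25.10 | 33.49 |
| HICO-DET | UV | ResNet-50+ViT-L | 36.84 | 28.82 | 38.15 |
| HICO-DET | RF | ResNet-50+ViT-B | 33.13 | 29.02 | 34.15 |
| HICO-DET | RF | ResNet-50+ViT-L | 36.73 | 34.24 | 37.35 |
| HICO-DET | NF | ResNet-50+ViT-B | 31.17 | 33.66 | 30.55 |
| HICO-DET | NF | ResNet-50+ViT-L | 34.84 | 36.33 | 34.47 |
| HICO-DET | UO | ResNet-50+ViT-B | 32.27 | 33.28 | 32.06 |
| HICO-DET | UO | ResNet-50+ViT-L | 36.38 | 38.17 | 36.02 |

| Dataset | Setting | Backbone | mAP | Rare | Non-rare |
|---|---|---|---|---|---|
| HICO-DET | default | ResNet-50+ViT-L | 38.61 | 37.70 | 38.89 |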

Citation

If you find our paper and/or code helpful, please consider citing:

@inproceedings{lei2024efficient,
  title={EZ-HOI: VLM Adaptation via Guided Prompt Learning for Zero-Shot HOI Detection},
  author={Lei, Qinqian and Wang, Bo and Tan, Robby T.},
  booktitle={The Thirty-eighth Annual Conference on Neural Information Processing Systems},
  year={2024}
}

Acknowledgement

We gratefully thank the authors of UPT and ADA-CM for open-sourcing their code.