[NeurIPS 2024] EZ-HOI: VLM Adaptation via Guided Prompt Learning for Zero-Shot HOI Detection
March 21, 2025
Paper Links
Dataset
Follow the process of UPT.
Place the downloaded files as shown below. If you store them elsewhere, change the default paths to your custom locations.
|- EZ-HOI
|   |- hicodet
|   |   |- hico_20160224_det
|   |       |- annotations
|   |       |- images
|   |- vcoco
|   |   |- mscoco2014
|   |       |- train2014
|   |       |- val2014
:   :
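A quick way to confirm the layout before training is to check that each expected folder exists. The following is a minimal sketch, not part of the repository: the helper name `missing_dirs` is hypothetical, and the nesting (with `annotations` and `images` inside `hico_20160224_det`, and `train2014` and `val2014` inside `mscoco2014`) is an assumption based on the tree above.

```python
from pathlib import Path

# Assumed nesting, inferred from the directory tree above; adjust if yours differs.
EXPECTED_DIRS = [
    "hicodet/hico_20160224_det/annotations",
    "hicodet/hico_20160224_det/images",
    "vcoco/mscoco2014/train2014",
    "vcoco/mscoco2014/val2014",
]


def missing_dirs(root, expected=EXPECTED_DIRS):
    """Return the expected sub-directories that do not exist under root."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).is_dir()]
```

Calling `missing_dirs("your_path/EZ-HOI")` returns an empty list when every folder is in place, and otherwise lists the relative paths that still need to be created or re-pointed.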
Dependencies
Reminder:
If you have already installed the clip package in your Python environment (e.g., via pip install clip), please ensure that you use the local CLIP directory provided in our EZ-HOI repository instead. To do this, set the PYTHONPATH to include the local CLIP path so that it takes precedence over the installed package.
export PYTHONPATH=$PYTHONPATH:"your_path/EZ-HOI/CLIP"
This lets you use the local CLIP without uninstalling the clip package from your Python environment.
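To confirm which copy of `clip` Python will actually pick up, you can ask the import system where it would load the module from. This is a minimal sketch using only the standard library; the helper name `resolve_module_path` is hypothetical and not part of EZ-HOI.

```python
import importlib.util


def resolve_module_path(name="clip"):
    """Return the file a module would be imported from, or None if it is not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


# Run this in the same shell where PYTHONPATH was exported.
print(resolve_module_path("clip"))
```

If the printed path lies under `your_path/EZ-HOI/CLIP`, the local copy is the one being used. A path inside `site-packages` means the installed package is still being resolved first.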
- Run the Python script below to pre-extract the CLIP image features:
python CLIP_hicodet_extract.py
Make sure the paths to the annotation files and datasets are correct. The extracted feature files are expected in the following layout:
|- EZ-HOI
|   |- hicodet_pkl_files
|   |   |- clip336_img_hicodet_test
|   |   |- clip336_img_hicodet_train
|   |   |- clipbase_img_hicodet_test
|   |   |- clipbase_img_hicodet_train
|   |- vcoco_pkl_files
|   |   |- clip336_img_vcoco_train
|   |   |- clip336_img_vcoco_val
:   :
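After extraction, it can help to see how many files landed in each feature folder, since an empty or absent folder usually points to a wrong path. The sketch below is illustrative only: the folder names are copied from the tree above, while the helper `count_feature_files` is hypothetical and makes no assumption about the contents of the files.

```python
from pathlib import Path

# Folder names taken from the tree above.
FEATURE_DIRS = {
    "hicodet_pkl_files": [
        "clip336_img_hicodet_test",
        "clip336_img_hicodet_train",
        "clipbase_img_hicodet_test",
        "clipbase_img_hicodet_train",
    ],
    "vcoco_pkl_files": [
        "clip336_img_vcoco_train",
        "clip336_img_vcoco_val",
    ],
}


def count_feature_files(root):
    """Map each expected feature folder to its file count (None if the folder is absent)."""
    root = Path(root)
    counts = {}
    for parent, children in FEATURE_DIRS.items():
        for child in children:
            folder = root / parent / child
            if folder.is_dir():
                counts[f"{parent}/{child}"] = sum(1 for p in folder.iterdir() if p.is_file())
            else:
                counts[f"{parent}/{child}"] = None
    return counts
```

For example, `count_feature_files("your_path/EZ-HOI")` returns a dictionary keyed by relative folder path, where `None` marks a folder that has not been created yet.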
HICO-DET
Train on HICO-DET:
bash scripts/hico_train_vitB_zs.sh
Test on HICO-DET:
bash scripts/hico_test_vitB_zs.sh
Model Zoo
| Dataset | Setting | Backbone | mAP | Unseen | Seen |
|---|---|---|---|---|---|
| HICO-DET | UV | ResNet-50+ViT-B | 32.32 | 25.10 | 33.49 |
| HICO-DET | UV | ResNet-50+ViT-L | 36.84 | 28.82 | 38.15 |
| HICO-DET | RF | ResNet-50+ViT-B | 33.13 | 29.02 | 34.15 |
| HICO-DET | RF | ResNet-50+ViT-L | 36.73 | 34.24 | 37.35 |
| HICO-DET | NF | ResNet-50+ViT-B | 31.17 | 33.66 | 30.55 |
| HICO-DET | NF | ResNet-50+ViT-L | 34.84 | 36.33 | 34.47 |
| HICO-DET | UO | ResNet-50+ViT-B | 32.27 | 33.28 | 32.06 |
| HICO-DET | UO | ResNet-50+ViT-L | 36.38 | 38.17 | 36.02 |
| Dataset | Setting | Backbone | mAP | Rare | Non-rare |
|---|---|---|---|---|---|
| HICO-DET | default | ResNet-50+ViT-L | 38.61 | 37.70 | 38.89 |
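As a reading aid for the default-setting row, the overall mAP can be related to the Rare and Non-rare columns as a class-count-weighted mean. The sketch below assumes the standard HICO-DET split of 600 HOI categories into 138 rare and 462 non-rare, and that the overall mAP is the plain mean of per-category AP. The function name `combined_map` is hypothetical.

```python
# Standard HICO-DET split: 600 HOI categories, 138 rare and 462 non-rare.
NUM_RARE, NUM_NON_RARE = 138, 462


def combined_map(rare_map, non_rare_map, n_rare=NUM_RARE, n_non_rare=NUM_NON_RARE):
    """Class-count-weighted mean of the per-split mAP values."""
    return (n_rare * rare_map + n_non_rare * non_rare_map) / (n_rare + n_non_rare)


# Rare and Non-rare values from the default-setting row above.
print(combined_map(37.70, 38.89))
```

The result agrees with the reported 38.61 to within rounding of the two inputs.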
Citation
If you find our paper and/or code helpful, please consider citing:
@inproceedings{
lei2024efficient,
title={EZ-HOI: VLM Adaptation via Guided Prompt Learning for Zero-Shot HOI Detection},
author={Lei, Qinqian and Wang, Bo and Tan, Robby T.},
booktitle={The Thirty-eighth Annual Conference on Neural Information Processing Systems},
year={2024}
}
Acknowledgement
We gratefully thank the authors of UPT and ADA-CM for open-sourcing their code.