Diagnosing Human-object Interaction Detectors
August 17, 2023
Code for "Diagnosing Human-object Interaction Detectors".
Contributed by Fangrui Zhu, Yiming Xie, Weidi Xie, Huaizu Jiang.
Installation
Install the dependencies.
pip install -r requirements.txt
Data preparation
HICO-DET
Due to inconsistencies in the current evaluation protocol, we remove all no_interaction annotations from both the ground-truth annotations and the predictions.
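The filtering step can be pictured with a minimal sketch. It assumes a hypothetical record layout in which each HOI is a dict with a "verb" field holding the verb name; the actual pickle files may store verbs as integer ids instead, so treat `drop_no_interaction` as an illustration rather than the repo's own code.

```python
NO_INTERACTION = "no_interaction"  # verb name; the record layout below is hypothetical


def drop_no_interaction(hois):
    """Return only the HOI records whose verb is an actual interaction."""
    return [h for h in hois if h["verb"] != NO_INTERACTION]


# The same filter would be applied to ground truth and to predictions.
toy_hois = [
    {"verb": "ride", "object": "bicycle"},
    {"verb": "no_interaction", "object": "bicycle"},
]
filtered = drop_no_interaction(toy_hois)
```

Applying one filter to both sides keeps the ground truth and the predictions consistent with each other.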
HICO-DET uses two precomputed data files:
data
└─ hicodet
   ├─ sum_gts_filtered.pkl   # dict: {HOI triplet: number of occurrences in the ground truth} (test set)
   └─ gt_wo_nointer.pkl      # ground-truth HOIs for all test images
They can be obtained from current HOI benchmarks, or you can download them here and put them under the data/ folder.
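As a quick sanity check after downloading, the triplet-count file can be inspected with the standard pickle module. The sketch below assumes only what is stated above, namely that sum_gts_filtered.pkl holds a dict mapping HOI triplets to ground-truth counts; the key type is not documented here, and the helper names `load_triplet_counts` and `summarize_triplet_counts` are hypothetical.

```python
import pickle
from pathlib import Path


def load_triplet_counts(path):
    """Load the pickled {HOI triplet: count} dict from `path`."""
    with Path(path).open("rb") as f:
        counts = pickle.load(f)
    if not isinstance(counts, dict):
        raise TypeError(f"expected a dict, got {type(counts).__name__}")
    return counts


def summarize_triplet_counts(counts):
    """Return (number of distinct triplets, total ground-truth instances)."""
    return len(counts), sum(counts.values())
```

For example, `summarize_triplet_counts(load_triplet_counts("data/hicodet/sum_gts_filtered.pkl"))` would report how many triplets and instances the test set contains. Only unpickle files from sources you trust.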
V-COCO
The data files can be obtained from the original repo here. Alternatively, you can download the files needed for evaluation here and put them under the data/ folder.
data
└─ vcoco
   ├─ vcoco_test.json
   ├─ instances_vcoco_all_2014.json
   └─ vcoco_test.ids
Usage
Standard Evaluation
Run the commands below to get standard results without the no_interaction class. You need to specify --preds_file in the config. This file is the output of the HOI detector; for its format, see the examples data/CDN/preds_wo_nointer.pkl and data/CDN/vcoco_r50.pickle. Alternatively, you can follow CDN to save output in this format.
sh eval_hicodet.sh # evaluate on HICO-DET
sh eval_vcoco.sh # evaluate on V-COCO
mAP Improvement
Run the following commands to obtain the new mAP after fixing one type of error. Use --fix_type and --model_name to select the error type and the HOI model.
sh map_hicodet.sh
sh map_vcoco.sh
Pair Localization
To diagnose a model's performance on pair localization, we need to extract intermediate (first-stage) results from the HOI detector, i.e., the detected pairs before they are passed on for interaction classification. Example results can be found here (data/CDN/hicodet_pair_preds.pkl).
Then run the commands below to obtain the average number of detected pairs, along with their recall and precision.
sh pair_loc_hicodet.sh
sh pair_loc_vcoco.sh
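To make the reported quantities concrete, here is a minimal per-image sketch of pair-level precision and recall. It assumes that a predicted pair matches a ground-truth pair when both the human box and the object box reach an IoU of at least 0.5, which is a common HOI convention but an assumption about what the scripts above do. Boxes are taken to be [x1, y1, x2, y2], object categories are ignored, and the function names are hypothetical.

```python
def box_iou(a, b):
    """IoU of two boxes in [x1, y1, x2, y2] format."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def pair_precision_recall(pred_pairs, gt_pairs, iou_thr=0.5):
    """Precision and recall of detected (human_box, object_box) pairs for one image.

    A prediction counts as correct if it overlaps some ground-truth pair on
    both boxes; a ground-truth pair counts as recalled if some prediction
    overlaps it on both boxes.
    """
    def match(p, g):
        return box_iou(p[0], g[0]) >= iou_thr and box_iou(p[1], g[1]) >= iou_thr

    correct_preds = sum(any(match(p, g) for g in gt_pairs) for p in pred_pairs)
    covered_gts = sum(any(match(p, g) for p in pred_pairs) for g in gt_pairs)
    precision = correct_preds / len(pred_pairs) if pred_pairs else 0.0
    recall = covered_gts / len(gt_pairs) if gt_pairs else 0.0
    return precision, recall
```

The average number of pairs would then simply be the mean of `len(pred_pairs)` over all test images.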
Action Analysis

Binary Classification for "no_interaction" class
We compute the average precision (AP) of classifying negative human-object pairs to see whether the model is able to give low confidence scores to incorrectly localized human-object pairs.
We compute the no_interaction AP on HICO-DET, as it contains no_interaction annotations. To get the output from HOI detectors, we need to save the classification score belonging to the negative class as $1 - \max_i (p_i)$, where $p_i$ is the classification score for the $i$-th actual action class. Examples can be found here (data/CDN/pair_pred_wscore.pkl).
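The negative-class score described above can be computed in one line with NumPy. This is a minimal sketch that assumes the detector exposes a (num_pairs, num_actions) array of per-class scores in [0, 1] for the actual action classes; the function name `no_interaction_score` is hypothetical.

```python
import numpy as np


def no_interaction_score(action_scores):
    """Return 1 - max_i p_i for each detected pair.

    action_scores: array-like of shape (num_pairs, num_actions) holding the
    scores p_i of the actual (non-negative) action classes.
    """
    p = np.asarray(action_scores, dtype=float)
    if p.ndim != 2:
        raise ValueError("expected a 2-D array of shape (num_pairs, num_actions)")
    return 1.0 - p.max(axis=1)


# A confident pair gets a low negative score, an unconfident pair a high one.
toy_scores = no_interaction_score([[0.9, 0.2], [0.1, 0.3]])
```

A pair whose best action score is high therefore receives a low negative-class score, and vice versa.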
With the desired model output, we can obtain the AP of this binary classification problem by running the command below.
sh binary_cls_hicodet.sh
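For intuition about the number the script reports, the sketch below computes a non-interpolated average precision for a binary problem, assuming AP is the metric in question. Here a label of 1 marks a negative (no_interaction) pair and the score is the negative-class score. The script above may use a different AP variant (for example, an interpolated one), so `average_precision` is an illustrative stand-in, not the repo's implementation.

```python
import numpy as np


def average_precision(scores, labels):
    """Non-interpolated AP: mean of the precision at the rank of each positive."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    if labels.sum() == 0:
        return 0.0  # no positives: AP is undefined, report 0 by convention
    order = np.argsort(-scores, kind="stable")  # highest score first
    hits = labels[order]
    true_positives = np.cumsum(hits)
    precision = true_positives / np.arange(1, len(hits) + 1)
    return float(precision[hits].mean())


toy_ap = average_precision([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0])
```

In the toy call, the two positives sit at ranks 1 and 3, so the result is the mean of 1 and 2/3.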
Action mAP on actual classes
For correctly localized pairs, there may be multiple action labels associated with them. We therefore compute the mAP score for the classification over all actual action categories.
To get the correct result, you need to save the action score for each predicted triplet, rather than the overall score (human_score * object_score * action_score). Then run the commands below.
sh action_map_hicodet.sh
sh action_map_vcoco.sh
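The distinction between the two scores matters because they can rank predictions differently. The sketch below uses a hypothetical record layout (the field names `action_score` and `overall_score` are assumptions, not the format the scripts expect) to show a case where the ordering flips.

```python
def make_prediction(human_score, object_score, action_score):
    """Keep the action score alongside the overall triplet score."""
    return {
        "action_score": action_score,  # what the action mAP analysis needs
        "overall_score": human_score * object_score * action_score,
    }


preds = [
    make_prediction(0.9, 0.9, 0.5),  # strong detections, weaker action score
    make_prediction(0.5, 0.5, 0.8),  # weak detections, stronger action score
]
rank_by_overall = sorted(range(len(preds)), key=lambda i: -preds[i]["overall_score"])
rank_by_action = sorted(range(len(preds)), key=lambda i: -preds[i]["action_score"])
```

Ranking by the overall score puts the first prediction on top, while ranking by the action score puts the second on top, so saving only the product would distort the action analysis.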