Global Inference Network - PyTorch

January 2, 2020

This is the implementation of the paper: A Real-time Global Inference Network for One-stage Referring Expression Comprehension.

Our code is based on ZSGNet. We further add two modules, namely Adaptive Feature Selection and the Global Attentive ReAsoNing unit with an attention loss. We also release all pretrained models and datasets used in our paper.

Note that the setup for this code follows that of ZSGNet. If you have any problems, please contact us.

Training

Basic usage is python code/main_dist.py "experiment_name" --arg1=val1 --arg2=val2, where the available arguments (arg1, arg2, and so on) are listed in configs/cfg.yaml. This trains using the DataParallel mode.
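One way to picture how --arg=val flags relate to configs/cfg.yaml is as a dictionary of defaults that command-line values update. The sketch below is an illustration only and not the repository's actual parser: the helper name apply_overrides and the default values are hypothetical, while the keys ds_to_use, bs, and nw are taken from the example commands in this README.

```python
import ast


def apply_overrides(defaults, argv):
    """Return a copy of `defaults` updated with --key=value items from argv.

    Hypothetical helper for illustration; the real script may parse differently.
    """
    cfg = dict(defaults)
    for item in argv:
        if not item.startswith("--") or "=" not in item:
            continue  # skip positional arguments such as the experiment name
        key, raw = item[2:].split("=", 1)
        if key not in cfg:
            raise KeyError(f"unknown config key: {key}")
        try:
            # Interpret numbers, booleans, and quoted strings as Python literals.
            value = ast.literal_eval(raw)
        except (ValueError, SyntaxError):
            value = raw  # fall back to the raw string
        cfg[key] = value
    return cfg


# Placeholder defaults; the real values live in configs/cfg.yaml.
defaults = {"ds_to_use": "refclef", "bs": 8, "nw": 2}
cfg = apply_overrides(
    defaults, ["referit_try", "--ds_to_use=refclef", "--bs=16", "--nw=4"]
)
print(cfg)
```

Rejecting unknown keys, as this sketch does, catches misspelled flags early; whether main_dist.py behaves the same way is not stated here.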

For distributed learning, use python -m torch.distributed.launch --nproc_per_node=$ngpus code/main_dist.py instead. This trains using the DistributedDataParallel mode. (Also see the caveat on using distributed training below.)

An example of training on the ReferIt dataset (note that you must have prepared the ReferIt dataset first) would be:

python code/main_dist.py "referit_try" --ds_to_use='refclef' --bs=16 --nw=4

Similarly, for distributed learning (set ngpus to the number of GPUs):

python -m torch.distributed.launch --nproc_per_node=$ngpus code/main_dist.py "referit_try" --ds_to_use='refclef' --bs=16 --nw=4

Evaluation

There are two ways to evaluate.

  1. Validation results are already computed in the training loop. If you just want to evaluate a previously trained model ($exp_name) on the validation or test set, you can run:
python code/main_dist.py $exp_name --ds_to_use='refclef' --resume=True --only_val=True --only_test=True

Alternatively, you can use a different experiment name and pass the --resume_path argument, like:

python code/main_dist.py $exp_name --ds_to_use='refclef' --resume=True --resume_path='./tmp/models/referit_try.pth' 

After this, the logs will be available in tmp/txt_logs/$exp_name.txt

  2. If you have some other model, you can write its predictions to a pickle file, say predictions.pkl, in the following structure:
[
    {'id': annotation_id,
     'pred_boxes': [x1, y1, x2, y2]},
    ...
]
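As a minimal sketch of producing such a file, the snippet below builds the list of dictionaries and saves it with Python's pickle module. The annotation ids and box coordinates are made up for illustration, and storing the boxes as plain lists of floats is an assumption; check code/eval_script.py for the exact types and coordinate convention it expects.

```python
import pickle

# Hypothetical model outputs: annotation id -> [x1, y1, x2, y2] box.
# The ids and coordinates are made up for illustration.
model_outputs = {
    101: [12.0, 30.5, 140.0, 220.0],
    102: [5.0, 8.0, 64.0, 96.0],
}

# Build the list-of-dicts structure described above.
predictions = [
    {"id": ann_id, "pred_boxes": [float(v) for v in box]}
    for ann_id, box in model_outputs.items()
]

with open("predictions.pkl", "wb") as f:
    pickle.dump(predictions, f)

# Read the file back to confirm the structure round-trips.
with open("predictions.pkl", "rb") as f:
    loaded = pickle.load(f)
print(len(loaded), loaded[0])
```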

Then you can evaluate with code/eval_script.py:

python code/eval_script.py predictions_file gt_file

For ReferIt, it would be:

python code/eval_script.py ./tmp/predictions/$exp_name/val_preds_$exp_name.pkl ./data/referit/csv_dir/val.csv
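For intuition about what an evaluation of this kind typically measures, here is a rough sketch of box accuracy under an IoU threshold. It assumes a prediction counts as correct when its IoU with the ground-truth box is at least 0.5, which is a common convention for referring expression comprehension; this README does not state the threshold, and code/eval_script.py may compute things differently. The function names box_iou and accuracy_at_iou, the gt_boxes mapping, and the sample boxes are all hypothetical.

```python
def box_iou(a, b):
    """IoU of two boxes given as [x1, y1, x2, y2]."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Width and height of the overlap, clamped at zero for disjoint boxes.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def accuracy_at_iou(predictions, gt_boxes, threshold=0.5):
    """Fraction of predictions whose IoU with the ground truth meets the threshold."""
    hits = sum(
        box_iou(p["pred_boxes"], gt_boxes[p["id"]]) >= threshold
        for p in predictions
    )
    return hits / len(predictions)


# Made-up example: the first prediction matches exactly, the second misses.
predictions = [
    {"id": 1, "pred_boxes": [0, 0, 10, 10]},
    {"id": 2, "pred_boxes": [0, 0, 10, 10]},
]
gt_boxes = {1: [0, 0, 10, 10], 2: [20, 20, 30, 30]}
acc = accuracy_at_iou(predictions, gt_boxes)
print(acc)
```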

Datasets

| Dataset | Link |
| --- | --- |
| Flickr30k | One Drive |
| Referit | One Drive |
| Flickr-Split-0 | One Drive |
| Flickr-Split-1 | One Drive |
| VG-2B, 2UB, 3B, 3UB | One Drive |
| RefCOCO, RefCOCO+, RefCOCOg | coming soon! |

Pre-trained Models

We tried to reproduce the results of ZSGNet. Unfortunately, our results differ slightly from those reported in the paper, especially on referit, where they are slightly better in our experiments.

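| Model | Dataset | val | test | link |
| --- | --- | --- | --- | --- |
| ZSGNet | flickr30K | 63.15 | 63.43 | One Drive |
| GIN (10 epochs) | flickr30K | 64.06 | 64.77 | One Drive |
| GIN (10 epochs + resized_10_epochs) | flickr30K | 66.54 | 68.14 | |
| ZSGNet | referit | 65.99 | 62.73 | One Drive |
| GIN (10 epochs) | referit | 68.40 | 65.15 | One Drive |
| GIN (10 epochs + resized_10_epochs) | referit | coming soon | coming soon | |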

Citation

If you find the code useful, please cite us:

@article{zhou2019a,
  title={A Real-time Global Inference Network for One-stage Referring Expression Comprehension.},
  author={Zhou, Yiyi and Ji, Rongrong and Luo, Gen and Sun, Xiaoshuai and Su, Jinsong and Ding, Xinghao and Lin, Chiawen and Tian, Qi},
  journal={arXiv: Computer Vision and Pattern Recognition},
  year={2019}
}