CRIPP-VQA: Counterfactual Reasoning about Implicit Physical Properties via Video Question Answering
November 8, 2022
This repository contains the codebase for the EMNLP'22 main conference long paper "CRIPP-VQA: Counterfactual Reasoning about Implicit Physical Properties via Video Question Answering".
Dataset preparation
- Download the annotations and question-answer pair files from the dataset link.
- Follow the instructions in datasets.md to generate the video instances from the annotations.
- Alternatively, download the pre-processed Mask R-CNN-based features from the feature link.
Aloe*+BERT Model
Aloe*+BERT is a PyTorch version of the modified Aloe baseline from Ding et al.
- Please refer to modeling.md for instructions on training Aloe*+BERT.
Evaluation
- Evaluations for the descriptive and counterfactual questions are straightforward.
- For planning-based task evaluation, please refer to evaluations.md for step-by-step instructions.
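As a rough illustration of what evaluating the descriptive and counterfactual questions might look like, here is a minimal sketch. It assumes exact-match accuracy for descriptive answers and, in the style of CLEVRER, per-option and per-question accuracy for multiple-choice counterfactual questions. The function names and input layout are hypothetical and are not taken from this repository; adapt them to the actual prediction and annotation files.

```python
from typing import Sequence, Tuple


def exact_match_accuracy(predictions: Sequence[str], answers: Sequence[str]) -> float:
    """Fraction of predictions that equal the reference answer (case-insensitive)."""
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    if not answers:
        return 0.0
    correct = sum(
        pred.strip().lower() == ans.strip().lower()
        for pred, ans in zip(predictions, answers)
    )
    return correct / len(answers)


def multiple_choice_accuracy(
    predictions: Sequence[Sequence[bool]],
    answers: Sequence[Sequence[bool]],
) -> Tuple[float, float]:
    """Return (per_option, per_question) accuracy.

    Each inner sequence holds one boolean per answer option of a question
    (hypothetical layout). A question counts as correct only if every one
    of its options is judged correctly.
    """
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    option_hits = option_total = question_hits = 0
    for pred, ans in zip(predictions, answers):
        if len(pred) != len(ans):
            raise ValueError("option counts differ for a question")
        hits = sum(p == a for p, a in zip(pred, ans))
        option_hits += hits
        option_total += len(ans)
        question_hits += int(hits == len(ans))
    if option_total == 0:
        return 0.0, 0.0
    return option_hits / option_total, question_hits / len(answers)
```

For example, `exact_match_accuracy(["red", "2"], ["Red", "3"])` scores one of two descriptive answers as correct, and `multiple_choice_accuracy` reports both how many individual options and how many whole questions were answered correctly.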
Acknowledgement
This work is supported by NSF and DARPA projects. We also thank David Ding for timely feedback that helped us reproduce the results of the PyTorch version of Aloe on the CLEVRER dataset.
Citation
Please consider citing the paper if you find it relevant or useful.
@inproceedings{patel2022cripp,
    title = "{CRIPP-VQA}: Counterfactual Reasoning about Implicit Physical Properties via Video Question Answering",
    author = "Patel, Maitreya and
      Gokhale, Tejas and
      Baral, Chitta and
      Yang, Yezhou",
    booktitle = "Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing",
    year = "2022",
}
Issues
For technical concerns, please create a GitHub issue. For a quicker resolution, reach out to the author at maitreya.patel@asu.edu.