CRIPP-VQA: Counterfactual Reasoning about Implicit Physical Properties via Video Question Answering

November 8, 2022

This repository contains the codebase for the EMNLP'22 main conference long paper, CRIPP-VQA: Counterfactual Reasoning about Implicit Physical Properties via Video Question Answering.

Demo | arXiv | Dataset

Dataset preparation

Download the annotations and question-answer pair files from the dataset link.
Follow the instructions in datasets.md to generate the video instances from the annotations.
Alternatively, download the pre-processed Mask R-CNN-based features from the feature link.

Aloe*+BERT Model

Aloe*+BERT is a PyTorch version of a modified Aloe baseline from Ding et al.

Please refer to modeling.md for instructions on training Aloe*+BERT.

Evaluation

Evaluations for the descriptive and counterfactual questions are straightforward.
For planning-based task evaluation, please refer to evaluations.md for step-by-step instructions.
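
As a rough illustration of the descriptive and counterfactual evaluation mentioned above, the sketch below computes exact-match accuracy per question type. It is not taken from this repository: the function name `accuracy_by_type` and the record keys `question_id`, `question_type`, and `answer` are hypothetical, and the metric reported in the paper may be defined differently (for example, per option for multiple-choice questions).

```python
from collections import defaultdict


def accuracy_by_type(predictions, references):
    """Return exact-match accuracy per question type.

    predictions: dict mapping question id -> predicted answer string.
    references: list of dicts with the (hypothetical) keys
        "question_id", "question_type", and "answer".
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for ref in references:
        qtype = ref["question_type"]
        total[qtype] += 1
        # A missing prediction counts as a wrong answer.
        pred = predictions.get(ref["question_id"], "")
        # Compare case-insensitively, ignoring surrounding whitespace.
        if pred.strip().lower() == ref["answer"].strip().lower():
            correct[qtype] += 1
    return {qtype: correct[qtype] / total[qtype] for qtype in total}


# Toy data for illustration only; these are not real dataset entries.
references = [
    {"question_id": 0, "question_type": "descriptive", "answer": "cube"},
    {"question_id": 1, "question_type": "descriptive", "answer": "2"},
    {"question_id": 2, "question_type": "counterfactual", "answer": "yes"},
]
predictions = {0: "Cube", 1: "3", 2: "yes"}
print(accuracy_by_type(predictions, references))
```

Reporting accuracy separately for each question type keeps descriptive and counterfactual results from being averaged together.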

Acknowledgement

This work is supported by NSF and DARPA projects. We also thank David Ding for timely feedback that helped us reproduce the results of the PyTorch version of Aloe on the CLEVRER dataset.

Citation

Please consider citing the paper if you find it relevant or useful.

@inproceedings{patel2022cripp,
    title = "{CRIPP-VQA}: Counterfactual Reasoning about Implicit Physical Properties via Video Question Answering",
    author = "Patel, Maitreya and
        Gokhale, Tejas and
        Baral, Chitta and
        Yang, Yezhou",
    booktitle = "Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing",
    year = "2022",
}

Issues

For technical concerns, please create a GitHub issue. For a quicker resolution, reach out to the author at maitreya.patel@asu.edu.