LifeQA data and code
March 16, 2025
This repo contains the data and PyTorch code that accompanies our LREC 2020 paper:
LifeQA: A Real-life Dataset for Video Question Answering
Santiago Castro, Mahmoud Azab, Jonathan C. Stroud, Cristina Noujaim, Ruoyao Wang, Jia Deng, Rada Mihalcea
More information is available at the LifeQA website.
Setup
To run the code, set up a new environment with Conda and activate it:
conda env create -f environment.yml
conda activate lifeqa
Data
The dataset is under data/, in lqa_train.json, lqa_dev.json,
and lqa_test.json. Even though it's divided into train/dev/test, for most experiments we merge
them and use five-fold cross-validation, with the folds indicated in data/folds.
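As a minimal sketch of the merge described above, the three split files could be combined as follows. It assumes that each file holds a JSON object keyed by clip ID (as the key example 213 in the Visual features section suggests). The helper name merge_splits is hypothetical and not part of this repo. The format of the files in data/folds is not described here, so the sketch stops at the merge.

```python
import json
from pathlib import Path

SPLIT_FILES = ("lqa_train.json", "lqa_dev.json", "lqa_test.json")


def merge_splits(data_dir="data"):
    """Merge the train/dev/test JSON files into one dict keyed by clip ID.

    Assumption: each file holds a JSON object whose keys are clip IDs.
    """
    merged = {}
    for name in SPLIT_FILES:
        with open(Path(data_dir) / name, encoding="utf-8") as f:
            split = json.load(f)
        # Fail loudly if a clip ID appears in more than one split.
        overlap = merged.keys() & split.keys()
        if overlap:
            raise ValueError(f"duplicate clip IDs across splits: {sorted(overlap)}")
        merged.update(split)
    return merged
```

Calling merge_splits("data") from the repo root would then return a single dict covering all three partitions.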
Visual features
You can download the already extracted features or do the following to extract them yourself.
- Download the videos. Due to YouTube's Terms of Service, we can't provide the video files. However, we provide the IDs and timestamps to obtain the same data. Download the YouTube videos indicated in the field parent_video_id from the JSON files, cut them based on the fields start_time and end_time, and save them based on the JSON key (e.g., 213) to data/videos, placing the files there without subdirectories.
- Run save_frames.sh to extract the frames in the video files:
  bash feature_extraction/save_frames.sh
- Download pretrained weights from Sports1M for C3D and save them in data/features/c3d.pickle.
- To extract the features (e.g. from an ImageNet-pretrained ResNet-152) and save them in big H5 files:
  mkdir data/features
  python feature_extraction/extract_features.py resnet
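The cutting part of the first step could be sketched as below. This is not a script from this repo: the helper name build_cut_command, the data/raw_videos download location, and the .mp4 extension are all assumptions made for illustration. It also assumes that start_time and end_time are stored in a form ffmpeg accepts (seconds or HH:MM:SS), and that ffmpeg is installed.

```python
from pathlib import Path


def build_cut_command(clip_id, clip, source_dir="data/raw_videos", out_dir="data/videos"):
    """Return an ffmpeg argument list that cuts one clip out of its parent video.

    Assumptions (not confirmed by the repo): the parent video was already
    downloaded to source_dir/<parent_video_id>.mp4, and start_time/end_time
    are in a format that ffmpeg accepts.
    """
    source = Path(source_dir) / f"{clip['parent_video_id']}.mp4"
    # The output is named after the JSON key and placed directly in out_dir.
    target = Path(out_dir) / f"{clip_id}.mp4"
    return [
        "ffmpeg",
        "-i", str(source),
        "-ss", str(clip["start_time"]),  # where the clip starts
        "-to", str(clip["end_time"]),    # where the clip ends
        str(target),
    ]
```

Each returned list can be passed to subprocess.run(cmd, check=True) while iterating over the entries of a loaded JSON file.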
Baselines
Check the scripts under run_scripts to run the available baselines.
TVQA
Running the TVQA baseline is different from running the rest of the baselines.
We copied TVQA's repo content from commit 2c98044
into the TVQA/ folder.
Changes from upstream
The copied code has been changed to support 4 answer choices instead of 5. Some other minor modifications have been made as well.
Setup
- Convert the LifeQA dataset to TVQA format:
  python scripts/to_tvqa_format.py
- Enter the TVQA/ directory:
  cd TVQA/
- Set up the interpreter:
  conda env create -f environment.yml
  conda activate tvqa
- Do some pre-processing:
  python preprocessing.py --data_dir ../data/tvqa_format
  for i in 0 1 2 3 4; do
  python preprocessing.py --data_dir ../data/tvqa_format/fold${i}
  done
  mkdir cache_lifeqa
  python tvqa_dataset.py \
  --input_streams sub \
  --no_ts \
  --vcpt_path ../data/tvqa_format/det_visual_concepts_hq.pickle \
  --train_path ../data/tvqa_format/lqa_train_processed.json \
  --valid_path ../data/tvqa_format/lqa_dev_processed.json \
  --test_path ../data/tvqa_format/lqa_test_processed.json \
  --word2idx_path cache_lifeqa/word2idx.pickle \
  --idx2word_path cache_lifeqa/idx2word.pickle \
  --vocab_embedding_path cache_lifeqa/vocab_embedding.pickle
Train and test on LifeQA dataset from scratch
For 5-fold cross-validation:
for i in 0 1 2 3 4; do
python main.py \
--input_streams sub vcpt \
--no_ts \
--vcpt_path ../data/tvqa_format/det_visual_concepts_hq.pickle \
--train_path ../data/tvqa_format/fold${i}/train_processed.json \
--valid_path ../data/tvqa_format/fold${i}/validation_processed.json \
--test_path ../data/tvqa_format/fold${i}/test_processed.json \
--word2idx_path cache_lifeqa/word2idx.pickle \
--idx2word_path cache_lifeqa/idx2word.pickle \
--vocab_embedding_path cache_lifeqa/vocab_embedding.pickle
python test.py --model_dir $(ls -t results/ | head -1) --mode test
done
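The $(ls -t results/ | head -1) idiom used above picks the most recently modified entry in results/, which is assumed to be the run that main.py just wrote. A rough Python equivalent is sketched below. The helper name latest_results_dir is hypothetical, and unlike ls it does not skip hidden entries.

```python
from pathlib import Path


def latest_results_dir(results_root="results"):
    """Return the name of the most recently modified entry under results_root.

    Roughly mirrors `ls -t results/ | head -1`: order by modification time,
    newest first, and keep the first name.
    """
    entries = list(Path(results_root).iterdir())
    if not entries:
        raise FileNotFoundError(f"no entries found in {results_root}")
    newest = max(entries, key=lambda p: p.stat().st_mtime)
    return newest.name
```

Either form can select the wrong run if something else writes to results/ between training and testing, so passing the directory name explicitly is safer when several runs share the folder.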
For train, dev, and test partitions:
python main.py \
--input_streams sub vcpt \
--no_ts \
--vcpt_path ../data/tvqa_format/det_visual_concepts_hq.pickle \
--train_path ../data/tvqa_format/lqa_train_processed.json \
--valid_path ../data/tvqa_format/lqa_dev_processed.json \
--test_path ../data/tvqa_format/lqa_test_processed.json \
--word2idx_path cache_lifeqa/word2idx.pickle \
--idx2word_path cache_lifeqa/idx2word.pickle \
--vocab_embedding_path cache_lifeqa/vocab_embedding.pickle
python test.py --model_dir $(ls -t results/ | head -1) --mode test
Train on TVQA dataset
python preprocessing.py
mkdir cache_original
python tvqa_dataset.py \
--input_streams sub \
--no_ts \
--word2idx_path cache_original/word2idx.pickle \
--idx2word_path cache_original/idx2word.pickle \
--vocab_embedding_path cache_original/vocab_embedding.pickle
python main.py \
--input_streams sub vcpt \
--no_ts
RESULTS_FOLDER_NAME=$(ls -t results/ | head -1)
The result from this part was saved in results_2019_05_16_23_02_15 in Google Drive. Note that it corresponds to the S+V+Q setting, with visual concepts (cpt) as the video feature and without timestamps (w/o ts).
Test on LifeQA dataset
For 5-fold cross-validation:
for i in 0 1 2 3 4; do
python test.py \
--vcpt_path ../data/tvqa_format/det_visual_concepts_hq.pickle \
--test_path ../data/tvqa_format/fold${i}/test_processed.json \
--model_dir "${RESULTS_FOLDER_NAME}" \
--mode test
done
For the test partition:
python test.py \
--vcpt_path ../data/tvqa_format/det_visual_concepts_hq.pickle \
--test_path ../data/tvqa_format/lqa_test_processed.json \
--model_dir "${RESULTS_FOLDER_NAME}" \
--mode test
Fine-tune on LifeQA dataset
For 5-fold cross-validation:
for i in 0 1 2 3 4; do
python main.py \
--input_streams sub vcpt \
--no_ts \
--vcpt_path ../data/tvqa_format/det_visual_concepts_hq.pickle \
--train_path ../data/tvqa_format/fold${i}/train_processed.json \
--valid_path ../data/tvqa_format/fold${i}/validation_processed.json \
--test_path ../data/tvqa_format/fold${i}/test_processed.json \
--word2idx_path cache_original/word2idx.pickle \
--idx2word_path cache_original/idx2word.pickle \
--vocab_embedding_path cache_original/vocab_embedding.pickle \
--pretrained_model_dir "${RESULTS_FOLDER_NAME}" \
--new_word2idx_path cache_lifeqa/word2idx.pickle
done
For train, dev, and test partitions:
python main.py \
--input_streams sub vcpt \
--no_ts \
--vcpt_path ../data/tvqa_format/det_visual_concepts_hq.pickle \
--train_path ../data/tvqa_format/lqa_train_processed.json \
--valid_path ../data/tvqa_format/lqa_dev_processed.json \
--test_path ../data/tvqa_format/lqa_test_processed.json \
--word2idx_path cache_original/word2idx.pickle \
--idx2word_path cache_original/idx2word.pickle \
--vocab_embedding_path cache_original/vocab_embedding.pickle \
--pretrained_model_dir "${RESULTS_FOLDER_NAME}" \
--new_word2idx_path cache_lifeqa/word2idx.pickle
Issues
If you encounter issues while using our data or code, please open an issue in this repo.