Cross-Modal Contrastive Learning for Robust Reasoning in VQA
November 22, 2022
This repo implements the method on top of the METER backbone using PyTorch Lightning. An implementation in plain PyTorch is also available.
Data preparation and pretrained models
Please follow the METER and ViLT instructions to prepare the datasets, and download the pretrained checkpoints released by METER. Then set data_root and log_dir in config.py to match your local setup.
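Before launching a run, it can help to confirm that the data directory and the checkpoint are where the config expects them. The following is a minimal sketch and not part of this repo: `check_setup` is a hypothetical helper, and it assumes the prepared datasets are stored as `.arrow` files directly under `data_root`, as produced by the ViLT data preparation scripts. The dataset path in the usage block is a placeholder.

```python
from pathlib import Path


def check_setup(data_root, load_path):
    """Return a list of human-readable problems; an empty list means OK.

    Hypothetical helper, not part of this repo. Assumes the prepared
    datasets are stored as .arrow files directly under data_root.
    """
    problems = []
    root = Path(data_root)
    if not root.is_dir():
        problems.append(f"data_root is not a directory: {root}")
    elif not any(root.glob("*.arrow")):
        problems.append(f"no .arrow files found in: {root}")
    if not Path(load_path).is_file():
        problems.append(f"checkpoint not found: {load_path}")
    return problems


if __name__ == "__main__":
    # Placeholder data path; replace it with the value set in config.py.
    for problem in check_setup(
        "path/to/arrow_datasets",
        "result/official_released/meter_clip16_288_roberta_pretrain.ckpt",
    ):
        print(problem)
```

If nothing is printed, both paths look usable under the assumptions above.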
Finetune on VQA data
train
python run.py with num_gpus=1 \
num_nodes=1 \
task_finetune_vqa_clip_bert \
per_gpu_batchsize=8 \
load_path=result/official_released/meter_clip16_288_roberta_pretrain.ckpt \
clip16 text_roberta \
image_size=224 \
nce=True \
test_only=False \
seed=0 \
exp_name=finetune_vqa_cmcl
test
python run.py with num_gpus=1 \
num_nodes=1 \
task_finetune_vqa_clip_bert \
per_gpu_batchsize=8 \
load_path=path/to/finetuned/ckpt \
clip16 text_roberta \
image_size=224 \
nce=True \
test_only=True \
seed=0 \
exp_name=finetune_vqa_cmcl
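The train and test commands above differ only in load_path and test_only. A minimal sketch of a wrapper that builds either command from one function is shown below. `build_command` is a hypothetical helper, not part of this repo; every option it emits is copied from the commands above, in the same order.

```python
import shlex


def build_command(load_path, test_only):
    """Return the argv list for run.py (hypothetical helper).

    Options and their order mirror the train/test commands in this README;
    only load_path and test_only vary between the two.
    """
    return [
        "python", "run.py", "with",
        "num_gpus=1",
        "num_nodes=1",
        "task_finetune_vqa_clip_bert",
        "per_gpu_batchsize=8",
        f"load_path={load_path}",
        "clip16",
        "text_roberta",
        "image_size=224",
        "nce=True",
        f"test_only={test_only}",
        "seed=0",
        "exp_name=finetune_vqa_cmcl",
    ]


if __name__ == "__main__":
    train = build_command(
        "result/official_released/meter_clip16_288_roberta_pretrain.ckpt",
        test_only=False,
    )
    # Print the command as a single shell-safe line for inspection.
    print(shlex.join(train))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the repo root. For testing, call `build_command("path/to/finetuned/ckpt", test_only=True)` with the path replaced by your fine-tuned checkpoint.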