Preparing Datasets for InstructSeg
December 20, 2024
Expected dataset structure for COCO:
coco/
    train2014/
        # image files
Expected dataset structure for RES:
RES/
    refcoco/
        refcoco_train.json
        refcoco_val.json
        refcoco_testA.json
        refcoco_testB.json
    refcoco+/
        refcoco+_train.json
        refcoco+_val.json
        refcoco+_testA.json
        refcoco+_testB.json
    refcocog/
        refcocog_train.json
        refcocog_val.json
        refcocog_test.json
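A quick way to confirm the RES layout before training is to check for each expected annotation file. The following is a minimal sketch, not part of InstructSeg: the function name `missing_res_files` and the `RES_SPLITS` table are hypothetical helpers, and the root path `RES` is an assumption about where the folder lives.

```python
from pathlib import Path

# Expected annotation files per RES subset, taken from the layout above.
RES_SPLITS = {
    "refcoco": ["train", "val", "testA", "testB"],
    "refcoco+": ["train", "val", "testA", "testB"],
    "refcocog": ["train", "val", "test"],
}


def missing_res_files(root):
    """Return the expected RES annotation files that are absent under `root`."""
    root = Path(root)
    missing = []
    for subset, splits in RES_SPLITS.items():
        for split in splits:
            path = root / subset / f"{subset}_{split}.json"
            if not path.is_file():
                missing.append(path.relative_to(root).as_posix())
    return missing


if __name__ == "__main__":
    # "RES" is an assumed location; point this at your own dataset root.
    for name in missing_res_files("RES"):
        print("missing:", name)
```

An empty result means all eleven annotation files listed above were found.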
Expected dataset structure for ReasonSeg:
ReasonSeg/
    train/
        image_1.jpg, image_1.json
        image_2.jpg, image_2.json
    val/
        image_1.jpg, image_1.json
        image_2.jpg, image_2.json
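Since every ReasonSeg image is expected to sit next to a JSON file with the same stem, a small check can flag files that lost their partner during download or extraction. This is an illustrative sketch only: `unpaired_reasonseg_files` is a hypothetical helper, and it assumes images use the `.jpg` extension shown in the layout above.

```python
from pathlib import Path


def unpaired_reasonseg_files(split_dir):
    """Return (images_without_json, jsons_without_image) as sorted stem lists."""
    split_dir = Path(split_dir)
    images = {p.stem for p in split_dir.glob("*.jpg")}
    annots = {p.stem for p in split_dir.glob("*.json")}
    return sorted(images - annots), sorted(annots - images)


if __name__ == "__main__":
    # "ReasonSeg" is an assumed location; adjust to your dataset root.
    for split in ("train", "val"):
        no_json, no_image = unpaired_reasonseg_files(Path("ReasonSeg") / split)
        print(split, "images without json:", no_json)
        print(split, "json without image:", no_image)
```

Both lists should be empty for a complete split.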
Expected dataset structure for R-VOS, and corresponding json:
rvos/
DAVIS/
train/
JPEGImages
valid/
JPEGImages
refdavis_valid.json
YouTube/
train/
JPEGImages
refyoutube_train.json
valid/
JPEGImages
refyoutube_valid.json
Expected dataset structure for ReVOS:
ReVOS/
    JPEGImages
        <video1>
        <video2>
        <video...>
    mask_dict.json
    mask_dict_foreground.json
    meta_expressions_train_.json
    meta_expressions_valid_.json
Dataset preparation for LLaVA-1.5 training data:
llava_dataset/
    gqa/
        images/
    ocr_vqa/
        images/
    textvqa/
        train_images/
    vg/
        VG_100K/
        VG_100K_2/
    llava_v1_5_mix665k.json
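To see whether the image folders match what the annotation file references, one option is to walk the JSON and count image paths that do not resolve. The sketch below rests on two assumptions that should be verified against your copy of the data: that `llava_v1_5_mix665k.json` is a list of entries with an optional `image` field, and that those paths are relative to `llava_dataset/`. The helper name `count_missing_images` is hypothetical.

```python
import json
from collections import Counter
from pathlib import Path


def count_missing_images(root, annotation_name="llava_v1_5_mix665k.json"):
    """Count referenced images that are absent, grouped by top-level folder."""
    root = Path(root)
    with open(root / annotation_name, encoding="utf-8") as f:
        entries = json.load(f)
    missing = Counter()
    for entry in entries:
        # Assumption: text-only entries simply omit the "image" field.
        image = entry.get("image")
        if image and not (root / image).is_file():
            missing[image.split("/")[0]] += 1
    return dict(missing)


if __name__ == "__main__":
    # "llava_dataset" is an assumed location; adjust to your dataset root.
    print(count_missing_images("llava_dataset"))
```

Grouping by top-level folder makes it easy to spot a whole source (for example `vg`) that was not downloaded or was extracted to the wrong place.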