Stanford Filtered data (VG150)
July 16, 2020
Adapted from Danfei Xu.
Follow the steps below to set up the dataset.
- Download the VG images part1 and part2. Extract these images to a folder and point to it in `config.py` (e.g. I currently have `VG_IMAGES=data/stanford_filtered/Images`, and the extracted `VG_100K` and `VG_100K_2` folders are inside it).
- Download the VG metadata. I recommend extracting it to this directory (e.g. `data/stanford_filtered/image_data.json`); otherwise, edit the path in `config.py`.
- Download the scene graphs and extract them to `data/stanford_filtered/VG-SGG.h5`.
- Download the scene graph dataset metadata and extract it to `data/stanford_filtered/VG-SGG-dicts.json`.
- (Optional) The saliency map: we use DSS to generate the saliency maps. Please refer to the DSS project and follow its setup instructions. We provide a script that uses it; see `data/stanford_filtered/saliencymap.py`. The script takes `imdb_512.h5` as the input image dataset, but you can also load the images directly with OpenCV, PIL, etc.
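Before moving on, it can help to confirm that the files from the steps above ended up where expected. The following is a minimal sketch, not part of the repository: the helper name `missing_entries` is hypothetical, and the paths assume the example layout mentioned above (`VG_IMAGES=data/stanford_filtered/Images`). Adjust them if your `config.py` points elsewhere.

```python
from pathlib import Path

# Assumed data root, relative to the project root (matches the example paths above).
ROOT = Path("data/stanford_filtered")

# Relative paths named in the setup steps. The Images/ entries follow the
# author's example configuration and are an assumption about your setup.
REQUIRED = [
    "Images/VG_100K",
    "Images/VG_100K_2",
    "image_data.json",
    "VG-SGG.h5",
    "VG-SGG-dicts.json",
]


def missing_entries(root, required=REQUIRED):
    """Return the required relative paths that do not exist under root."""
    root = Path(root)
    return [rel for rel in required if not (root / rel).exists()]


if __name__ == "__main__":
    missing = missing_entries(ROOT)
    if missing:
        print("Missing under", ROOT, ":")
        for rel in missing:
            print("  -", rel)
    else:
        print("All expected VG150 files are in place.")
```

Run it from the project root. The optional saliency outputs are deliberately left out of the list, since that step can be skipped.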
VG200 and VG-KR
- Download the VG200 and VG-KR annotations. They consist of two files: `VG200-SGG-dicts.json` and `VG200-SGG.h5`. `VG200-SGG.h5` also contains the indicative key relation annotations. You can obtain them from Google Drive or Baidu (code: kapn).
- Create a folder `data/vg200` and set up the paths in `vg200/utils/config.py`. You may use soft links to place `Images` and `saliency_512.h5` in this folder.
- (Optional) You can also create VG200/VG-KR yourself. We provide the scripts and raw data, and briefly list the necessary data here. Refer to `data/vg200/utils/config.py` and set the paths accordingly. Before running the scripts, remember to set your PYTHONPATH: `export PYTHONPATH=/home/YourName/ThePathOfYourProject`. All scripts should be run from the project root.
  - Prepare the additional raw VG data (all of it can be found on the Visual Genome site), including:
    - `imdb_1024.h5`, `imdb_512.h5` (you can also use the raw images).
    - `object_alias.txt`, `relationship_alias.txt`.
    - `objects.json`, `relationships.json`.
  - Prepare the word embedding vectors from GloVe. Put the data files under the folder `data/GloVe`.
  - Download our provided raw data directly (Baidu, code: 8wz4), or run the scripts to generate them yourself:
    - `captions_to_sg.json`. We use the Stanford Scene Graph Parser to generate it. Please refer to their project site.
    - `cleanse_objects.json`, `cleanse_relationships.json`. Alternatively, run the `cleanse_raw_vg.py` script.
    - `cleanse_triplet_match.json`. Alternatively, run the `triplet_match.py` script.
    - `object_list.txt`, `predicate_list.txt`, `predicate_stem.txt`.
  - Run the `vg_to_roidb.py` script. It creates the final `VG200-SGG-dicts.json` and `VG200-SGG.h5`.
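The soft-link step for `data/vg200` can be done by hand with `ln -s`, or sketched in Python as below. This is only an illustration: the helper `link_into` is hypothetical, and the source locations in the usage section are assumptions (in particular, where `saliency_512.h5` lives depends on how you ran the saliency script).

```python
import os
from pathlib import Path


def link_into(target_dir, sources):
    """Symlink each existing source into target_dir, skipping names already present."""
    target_dir = Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    created = []
    for src in sources:
        src = Path(src).resolve()
        if not src.exists():
            print("skipping missing source:", src)
            continue
        link = target_dir / src.name
        if link.exists() or link.is_symlink():
            continue  # do not overwrite an existing file or link
        os.symlink(src, link, target_is_directory=src.is_dir())
        created.append(link)
    return created


if __name__ == "__main__":
    # Assumed source locations; change these to match your own setup.
    link_into(
        "data/vg200",
        [
            "data/stanford_filtered/Images",
            "data/stanford_filtered/saliency_512.h5",
        ],
    )
```

Sources are resolved to absolute paths so the links stay valid regardless of the directory they are read from. On Windows, creating symlinks may require elevated privileges.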