Data

November 15, 2024

Data Overview

| Data name | JSON file | JSON size | Images size |
| --- | --- | --- | --- |
| VersaD | list_pretrain.json | 1.02 GB | 140 GB |
| VHM_SFT | list_sft.json | 1.5 GB | 124 GB |
| VHM_Eval | - | - | 7.35 GB |

VersaD Dataset

This dataset is curated from crowdai, cvact, cvusa, fmow, loveda, millionAID, and other sources, yielding a total of 140M high-quality image-text pairs produced with the help of Gemini-Vision. It is used for pretraining VHM.

SFT Dataset

This dataset contains the VersaD-Instruct, HnstD, and VariousRS-Instruct sub-datasets. The images come from public datasets such as BANDON, DOIR, DOTA, FBP, METER-ML, MSAR, Mts-WH, NWPU-RESISC45, RSITMD, RSVQA, UCM, crowdAI, deepglobe, fair1M, and fmow; the instructions are built from their original labels.

VHM_Eval Dataset

This dataset collects all evaluation data used in the paper, covering Tables 5-9 and Table 11. The tasks map to files as follows:

| Task name | Question type | JSON file |
| --- | --- | --- |
| Honst Tasks | | |
| Honst-presence | open-end | presence_mo_dota.json |
| Honst-object absolute position | multi-choice | abspos_dota-test_mc.json |
| Honst-object absolute position-false | multi-choice | abspos_false_dota-test_adversarial_mc.json |
| | | abspos_false_dota-test_popular_mc.json |
| | | abspos_false_dota-test_random_mc.json |
| Honst-object relative position | multi-choice | relpos_dota-test_mc.json |
| Honst-object relative position-false | multi-choice | relpos_false_dota-test_adversarial_mc.json |
| | | relpos_false_dota-test_popula_mc.json |
| | | relpos_false_dota-test_random_mc.json |
| Honst-color | open-end | color_dota-test_fair1m-val_open.json |
| Honst-color-false | open-end | color_false_dota-test_random_open.json |
| | | color_false_dota-test_popular_open.json |
| | | color_false_dota-test_adversarial_open.json |
| Honst-color-pan false | open-end | color_false_pan_dota-test.json |
| VariousRS Tasks | | |
| Scene Classification | open-end | cls_WHU_RS19.json |
| | | cls_SIRI_WHU.json |
| | | cls_NWPU_RESISC45.json |
| | | cls_METER_ML.json |
| | | cls_AID.json |
| Building footprint vectorization | open-end | bfv_crowdai_val.json |
| Counting | open-end | counting_dota-test_open.json |
| Image Resolution | open-end | gsd_dota_fbp.json |
| Image Modality | multi-choice | imgType_mcq.json |
| Multi-label Classification | open-end | mlc_fbp_test.json |
| | | mlc_gid_test.json |
| Geometric Measurement | open-end | obj_meas_dota_test.json |
| RSVQA-HR* | open-end | RSVQA_HR-comp_RSVQA.json |
| | | RSVQA_HR-presence_RSVQA.json |
| RSVQA-LR* | open-end | RSVQA_LR-presence_RSVQA.json |
| | | RSVQA_LR-presence_RSVQA.json |
| | | RSVQA_LR-rural_urban_RSVQA.json |
| Visual Grounding | open-end | VG_DOIR_RSVG_test.json |

* means that the dataset is a randomly sampled subset; you need to download the entire dataset yourself.
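
To confirm that a local copy of VHM_Eval is complete, the files listed in the table above can be checked against a directory. The following is a minimal sketch, not part of the VHM codebase: it assumes all JSON files sit directly in one directory (the layout of the download is not described here), `missing_eval_files` is a hypothetical helper name, and `EVAL_FILES` copies only a few rows of the table for brevity.

```python
from pathlib import Path

# A few task -> file entries copied from the table above; extend as needed.
EVAL_FILES = {
    "Honst-presence": ["presence_mo_dota.json"],
    "Counting": ["counting_dota-test_open.json"],
    "Image Modality": ["imgType_mcq.json"],
    "Multi-label Classification": ["mlc_fbp_test.json", "mlc_gid_test.json"],
}


def missing_eval_files(eval_dir, expected=EVAL_FILES):
    """Return {task: [missing file names]} for tasks with absent files.

    Assumption: every JSON file sits directly inside eval_dir.
    """
    root = Path(eval_dir)
    missing = {}
    for task, names in expected.items():
        absent = [n for n in names if not (root / n).is_file()]
        if absent:
            missing[task] = absent
    return missing
```

An empty dictionary from `missing_eval_files("/path/to/VHM_Eval")` would indicate that every listed file was found; the path here is a placeholder.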

Data preparation

Pretrain stage dataset preparation

  1. Please download the VersaD dataset.
  2. Prepare the datasets according to the file structure shown below, where pretrain_base denotes the root directory of the entire pretrain dataset.
{pretrain_base}/
    # image dirs
    crowdai/
        image0.jpg
        image1.jpg
        ...
        imagexx.jpg
    cvusa/
    ...

    # json files
    list_pretrain.json
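
Once the files are in place, the layout can be sanity-checked before training. The sketch below is illustrative only: `summarize_dataset_root` is a hypothetical helper, and the set of image extensions is an assumption (the tree above shows only `.jpg`). It reports whether the JSON file exists at the root and how many image files sit directly inside each subdirectory.

```python
from pathlib import Path

# Assumption: images use one of these common extensions.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def summarize_dataset_root(base_dir, json_name):
    """Check a dataset root and count images per subdirectory.

    Returns (json_present, {subdir_name: image_count}).
    Only files directly inside each subdirectory are counted.
    """
    root = Path(base_dir)
    json_present = (root / json_name).is_file()
    counts = {}
    for sub in sorted(p for p in root.iterdir() if p.is_dir()):
        counts[sub.name] = sum(
            1
            for f in sub.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
        )
    return json_present, counts
```

For the pretrain stage this would be called as `summarize_dataset_root("/path/to/pretrain_base", "list_pretrain.json")`, where the path is a placeholder. A subdirectory with a count of zero likely means an archive was not extracted or was extracted one level too deep.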

SFT stage dataset preparation

  1. Please download the VHM_SFT dataset.
  2. Prepare the datasets according to the file structure shown below, where sft_base denotes the root directory of the entire SFT dataset.
{sft_base}/
    # image dirs
    BANDON/
        image0.jpg
        image1.jpg
        ...
        imagexx.jpg
    DOTA-train/
    ...

    # json files
    list_sft.json
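
The record schema of `list_sft.json` is not documented in this section, so it can help to inspect the file before writing a loader. The following sketch makes no assumption about field names: `describe_json` is a hypothetical helper that reports the top-level type, its length, and the keys it finds (the first record's keys for a list of objects, or up to ten top-level keys for an object).

```python
import json


def describe_json(path):
    """Load a JSON file and describe its top-level structure.

    Note: json.load reads the whole file into memory, which can be
    substantial for a file larger than 1 GB.
    """
    with open(path, "r", encoding="utf-8") as f:
        data = json.load(f)

    info = {"type": type(data).__name__, "length": None, "keys": None}
    if isinstance(data, (list, dict)):
        info["length"] = len(data)
    if isinstance(data, dict):
        # Top-level object: report up to ten of its keys.
        info["keys"] = sorted(data)[:10]
    elif isinstance(data, list) and data and isinstance(data[0], dict):
        # List of records: report the keys of the first record.
        info["keys"] = sorted(data[0])
    return info
```

Calling `describe_json("/path/to/sft_base/list_sft.json")` (placeholder path) returns a small dictionary that can be printed to see how the records are organized.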

Important notice: For convenience, we provide a zip file of the web data. These images may be used for academic purposes only.