Data Preparation

June 24, 2022

1. SSOD: Semi-Supervised Object Detection

We support 5 popular settings in SSOD research as listed below:

| Labeled Data | Unlabeled Data | Test Data |
| --- | --- | --- |
| COCO2017-train-1% | COCO2017-train-99% | COCO2017-test |
| COCO2017-train-5% | COCO2017-train-95% | COCO2017-test |
| COCO2017-train-10% | COCO2017-train-90% | COCO2017-test |
| COCO2017-train | COCO2017-unlabeled | COCO2017-test |
| VOC07-trainval | VOC12-trainval | VOC07-test |

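
The percentage settings above split COCO2017-train into a small labeled part and a large unlabeled remainder. A minimal sketch of such a split is shown below. It assumes random sampling by image with a fixed seed; the function name `split_coco_images` and the toy data are hypothetical, and the repository's own preprocessing script may select images differently.

```python
import random


def split_coco_images(coco, percent, seed=0):
    """Partition a COCO-style dict into labeled and unlabeled parts by image."""
    # Sort first so the shuffle is reproducible regardless of input order.
    images = sorted(coco["images"], key=lambda im: im["id"])
    random.Random(seed).shuffle(images)
    n_labeled = max(1, round(len(images) * percent / 100))
    labeled_ids = {im["id"] for im in images[:n_labeled]}

    def subset(keep_labeled):
        return {
            "images": [im for im in coco["images"]
                       if (im["id"] in labeled_ids) == keep_labeled],
            "annotations": [a for a in coco["annotations"]
                            if (a["image_id"] in labeled_ids) == keep_labeled],
            "categories": coco["categories"],
        }

    return subset(True), subset(False)


# Toy data standing in for instances_train2017.json (hypothetical contents).
toy = {
    "images": [{"id": i} for i in range(200)],
    "annotations": [{"id": i, "image_id": i % 200} for i in range(500)],
    "categories": [{"id": 1, "name": "person"}],
}
labeled, unlabeled = split_coco_images(toy, percent=10)
print(len(labeled["images"]), len(unlabeled["images"]))
```

With a real annotation file, the dict would come from `json.load` on `instances_train2017.json`, and each part could be written back with `json.dump`.
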
  1. Download VOC and COCO from their official websites and organize them as follows:

    # ====coco====                        |    # ====voc====
    /data/coco/                           |    /data/voc/
      - images                            |      - 12
        - train2017                       |        - VOCdevkit
        - unlabeled2017                   |          - VOC2012
        - ...                             |            - ...
      - annotations                       |      - 07
        - instances_train2017.json        |        - VOCdevkit
        - image_info_unlabeled2017.json   |          - VOC2007
        - ...                             |            - ...
    
  2. Run the script to create the symbolic links:

    # * Change "prefix_coco", "prefix_coco_ul", and "prefix_voc" in the script to match your environment.
    # * Alternatively, create the symlinks manually.
    cd tools/datasets
    xonsh create_dataset_link.sh
    
  3. Create coco-standard, coco-additional, and voc (this takes several minutes):

    cd tools/datasets
    xonsh preprocess_dataset.sh
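
For readers who prefer to create the symlinks by hand, as step 2 allows, here is a minimal Python sketch. The helper `link_dataset` and the link location `dataset/coco/train2017` are hypothetical. The paths that the training configs actually expect are defined by `create_dataset_link.sh`, so check that script before reusing this.

```python
import os
import tempfile


def link_dataset(src, dst):
    """Create a symlink dst -> src after checking that src exists."""
    if not os.path.isdir(src):
        raise FileNotFoundError(f"dataset directory not found: {src}")
    os.makedirs(os.path.dirname(dst) or ".", exist_ok=True)
    if os.path.islink(dst):
        os.remove(dst)  # replace a stale link from an earlier run
    os.symlink(os.path.abspath(src), dst, target_is_directory=True)
    return dst


# Demo in a temporary directory so nothing on the real filesystem is touched.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "data", "coco", "images", "train2017")
    os.makedirs(src)
    # Hypothetical link location, used only for illustration.
    dst = link_dataset(src, os.path.join(tmp, "dataset", "coco", "train2017"))
    linked_ok = os.path.islink(dst) and os.path.samefile(src, dst)
    print(linked_ok)
```

Checking that the source directory exists before linking avoids creating a dangling symlink, which would otherwise only surface later as a missing-file error during preprocessing.
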
    

2. DAOD: Domain Adaptive Object Detection

We support 4 popular settings in DAOD research as listed below:

| Setting | Labeled Data | Unlabeled Data | Test Data |
| --- | --- | --- | --- |
| normal → foggy (C2F) | cityscapes (train) | cityscapes-foggy (train) | cityscapes-foggy (val) |
| small → large (C2B) | cityscapes (train) | BDD100K (train) | BDD100K (val) |
| across cameras (K2C) | KITTI (train) | cityscapes (train) | cityscapes (val) |
| synthetic → real (S2C) | Sim10K | cityscapes (train) | cityscapes (val) |

  1. Download cityscapes, cityscapes-foggy, KITTI, Sim10K, and BDD100K from their official websites and organize them as follows:

    # cityscapes          |    # cityscapes-foggy      |   # BDD
    /data/city            |    /data/foggycity         |   /data/BDD
      - VOC2007_citytrain |      - VOC2007_foggytrain  |     - VOC2007_bddtrain
        - ImageSets       |        - ImageSets         |       - ImageSets
        - JPEGImages      |        - JPEGImages        |       - JPEGImages
        - Annotations     |        - Annotations       |       - Annotations 
      - VOC2007_cityval   |      - VOC2007_foggyval    |     - VOC2007_bddval 
        - ImageSets       |        - ImageSets         |       - ImageSets
        - JPEGImages      |        - JPEGImages        |       - JPEGImages
        - Annotations     |        - Annotations       |       - Annotations 
    # =========================================================================
    # KITTI               |   # Sim10K
    /data/kitti           |   /data/sim
       - ImageSets        |     - ImageSets
       - JPEGImages       |     - JPEGImages
       - Annotations      |     - Annotations
    

    Note: refer to ProbabilisticTeacher for the detailed dataset pre-processing steps.

  2. Run the script to create the symbolic links:

    cd tools/datasets_uda
    xonsh create_dataset_link.sh
    
  3. Convert the datasets to COCO format:

    cd tools/datasets_uda
    xonsh preprocess_dataset.sh
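
To illustrate what the conversion in step 3 involves, the sketch below turns Pascal VOC style XML annotations into a COCO-style dict. It is a simplified illustration, not the repository's `preprocess_dataset.sh` logic: the function name `voc_xml_to_coco`, the sample XML, and the class list are hypothetical, and details such as 1-based pixel offsets and the `difficult` flag are ignored.

```python
import xml.etree.ElementTree as ET


def voc_xml_to_coco(xml_strings, class_names):
    """Convert Pascal VOC annotation XML strings into one COCO-style dict."""
    cat_ids = {name: i + 1 for i, name in enumerate(class_names)}
    coco = {
        "images": [],
        "annotations": [],
        "categories": [{"id": i, "name": n} for n, i in cat_ids.items()],
    }
    for image_id, xml_text in enumerate(xml_strings, start=1):
        root = ET.fromstring(xml_text)
        size = root.find("size")
        coco["images"].append({
            "id": image_id,
            "file_name": root.findtext("filename"),
            "width": int(size.findtext("width")),
            "height": int(size.findtext("height")),
        })
        for obj in root.iter("object"):
            name = obj.findtext("name")
            if name not in cat_ids:
                continue  # skip classes outside the target label set
            box = obj.find("bndbox")
            x1, y1, x2, y2 = (float(box.findtext(k))
                              for k in ("xmin", "ymin", "xmax", "ymax"))
            w, h = x2 - x1, y2 - y1
            coco["annotations"].append({
                "id": len(coco["annotations"]) + 1,
                "image_id": image_id,
                "category_id": cat_ids[name],
                "bbox": [x1, y1, w, h],  # COCO boxes are [x, y, width, height]
                "area": w * h,
                "iscrowd": 0,
            })
    return coco


# Hypothetical annotation in the VOC layout (one file per image in Annotations/).
sample = """
<annotation>
  <filename>000001.jpg</filename>
  <size><width>2048</width><height>1024</height></size>
  <object>
    <name>car</name>
    <bndbox><xmin>100</xmin><ymin>200</ymin><xmax>300</xmax><ymax>260</ymax></bndbox>
  </object>
</annotation>
"""
result = voc_xml_to_coco([sample], class_names=["car"])
print(result["annotations"][0]["bbox"])  # [100.0, 200.0, 200.0, 60.0]
```

The main structural difference is that VOC stores one XML file per image with corner coordinates, while COCO stores a single JSON with separate `images` and `annotations` lists and boxes given as corner plus width and height.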