SPADE/GauGAN: Semantic Image Synthesis with Spatially-Adaptive Normalization

October 13, 2021

Project | Paper | GTC Video (2m) | Video (2m) | Demo | Previous Implementation | Two Minute Papers Video


License

Imaginaire is released under the NVIDIA Software License. For commercial use, please contact researchinquiries@nvidia.com.

Software Installation

For installation instructions, please check out INSTALL.md.

Hardware Requirement

We trained our model using an NVIDIA DGX1 with 8 V100 32GB GPUs. Training took about 2-3 weeks.

Training

SPADE prefers the following directory layout for the training data.

${TRAINING_DATASET_ROOT_FOLDER}
└───images
    └───0001.jpg
    └───0002.jpg
    └───0003.jpg
    ...
└───seg_maps
    └───0001.png
    └───0002.png
    └───0003.png
    ...
└───edge_maps
    └───0001.png
    └───0002.png
    └───0003.png
    ...
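
Before building the LMDBs, it can help to confirm that every sample appears in all three folders. The sketch below is not part of Imaginaire; `find_unpaired` is a hypothetical helper, and it assumes that files are paired by file name stem (`0001.jpg` with `0001.png`), as the listing above suggests.

```python
from pathlib import Path


def find_unpaired(root, folders=("images", "seg_maps", "edge_maps")):
    """Return {folder: sorted stems that other folders have but this one lacks}.

    Hypothetical helper: assumes samples are matched by file name stem,
    e.g. images/0001.jpg <-> seg_maps/0001.png <-> edge_maps/0001.png.
    """
    root = Path(root)
    stems = {}
    for name in folders:
        folder = root / name
        if folder.is_dir():
            stems[name] = {p.stem for p in folder.iterdir() if p.is_file()}
        else:
            # A missing folder is treated as empty so it shows up in the report.
            stems[name] = set()
    all_stems = set().union(*stems.values())
    report = {}
    for name, found in stems.items():
        missing = all_stems - found
        if missing:
            report[name] = sorted(missing)
    return report
```

An empty dictionary means every stem is present in every folder. For a test set, the same check can be run with `folders=("seg_maps", "edge_maps")`.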

Training data preparation

  • Download the COCO training images, the COCO validation images, and the [Stuff-Things map](http://calvin.inf.ed.ac.uk/wp-content/uploads/data/cocostuffdataset/stuffthingmaps_trainval2017.zip) dataset, then unzip the files. Extract the images, the segmentation masks, and the object boundaries for the edge maps. Organize them according to the data structure above.

  • Build the lmdbs

for f in train val; do
python scripts/build_lmdb.py \
--config configs/projects/spade/cocostuff/base128_bs4.yaml \
--data_root dataset/cocostuff_raw/${f} \
--output_root dataset/cocostuff/${f} \
--overwrite \
--paired
done
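
The data preparation step above asks for object boundaries as edge maps but does not say how to compute them. One common approach, shown here as a minimal sketch and not necessarily the one the authors used, is to mark every pixel whose 4-connected neighbor carries a different instance ID. The function name `instance_edges` and the per-pixel instance ID map it takes as input are assumptions for illustration; how you obtain that map from the COCO annotations is up to you.

```python
import numpy as np


def instance_edges(instance_map):
    """Return a uint8 edge map (255 on boundaries, 0 elsewhere).

    Hypothetical helper: a pixel is a boundary pixel if its left, right,
    upper, or lower neighbor has a different instance ID.
    """
    inst = np.asarray(instance_map)
    edges = np.zeros(inst.shape, dtype=bool)

    # Horizontal neighbors: compare each column with the one to its left,
    # then mark both pixels of every differing pair.
    diff_h = inst[:, 1:] != inst[:, :-1]
    edges[:, 1:] |= diff_h
    edges[:, :-1] |= diff_h

    # Vertical neighbors: compare each row with the one above it.
    diff_v = inst[1:, :] != inst[:-1, :]
    edges[1:, :] |= diff_v
    edges[:-1, :] |= diff_v

    return edges.astype(np.uint8) * 255
```

The resulting array can be written to `edge_maps/` as a single-channel PNG with any image library, using the same file name stem as the matching image.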

Training command

python -m torch.distributed.launch --nproc_per_node=8 train.py \
--config configs/projects/spade/cocostuff/base128_bs4.yaml \
--logdir logs/projects/spade/cocostuff/base128_bs4.yaml

Inference

SPADE prefers the following file arrangement for testing.

${TEST_DATASET_ROOT_FOLDER}
└───seg_maps
    └───0001.png
    └───0002.png
    └───0003.png
    ...
└───edge_maps
    └───0001.png
    └───0002.png
    └───0003.png
    ...

  • Download sample test data by running

python scripts/download_test_data.py --model_name spade

  • Run inference

python -m torch.distributed.launch --nproc_per_node=1 inference.py \
--config configs/projects/spade/cocostuff/base128_bs4.yaml \
--output_dir projects/spade/output/cocostuff

The results are stored in projects/spade/output/cocostuff.

Below we show the expected output images.

[Image table: Ground truth | Segmentation | Edge | Synthesis result]

Citation

If you use this code for your research, please cite our paper.

@inproceedings{park2019SPADE,
  title={Semantic Image Synthesis with Spatially-Adaptive Normalization},
  author={Park, Taesung and Liu, Ming-Yu and Wang, Ting-Chun and Zhu, Jun-Yan},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  year={2019}
}