pix2pixHD: High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs

October 13, 2021 · View on GitHub

Project | Paper | Video (5m) | Previous Implementation | Two Minute Papers Video

License

Imaginaire is released under NVIDIA Software license. For commercial use, please consult researchinquiries@nvidia.com

Software Installation

For installation, please checkout INSTALL.md.

Hardware Requirement

We trained our model using an NVIDIA DGX1 with 8 V100 16GB GPUs, which takes about 10 hours.

Training

pix2pixHD prefers the following data structure.

${TRAINING_DATASET_ROOT_FOLDER}
└───images
    └───0001.jpg
    └───0002.jpg
    └───0003.jpg
    ...
└───seg_maps
    └───0001.png
    └───0002.png
    └───0003.png
    ...
└───instance_maps
    └───0001.png
    └───0002.png
    └───0003.png
    ...

### Training data preparation

- Download [the Cityscapes dataset](https://www.cityscapes-dataset.com/).
Extract images, segmentation masks, and object instance maks. Organize them
based on the above data structure. Please check out the original pix2pixHD repo
for converting the Cityscapes segmentation mask to a more compact format.

- Build the lmdbs
```bash
for f in train val; do
python scripts/build_lmdb.py \
--config configs/projects/pix2pixhd/cityscapes/ampO1.yaml \
--data_root dataset/cityscapes_raw/${f} \
--output_root dataset/cityscapes/${f} \
--overwrite \
--paired
done

Training command

python -m torch.distributed.launch --nproc_per_node=8 train.py \
--config configs/projects/pix2pixhd/cityscapes/ampO1.yaml

Inference

pix2pixHD prefers the following file arrangement for testing.

${TEST_DATASET_ROOT_FOLDER}
└───seg_maps
    └───0001.png
    └───0002.png
    └───0003.png
    ...
└───instance_maps
    └───0001.png
    └───0002.png
    └───0003.png
    ...
    ...

Download sample test data by running

python scripts/download_test_data.py --model_name pix2pixhd

python inference.py --single_gpu \
--config configs/projects/pix2pixhd/cityscapes/ampO1.yaml \
--output_dir projects/pix2pixhd/output/cityscapes

The results are stored in projects/pix2pixhd/output/cityscapes

Below we show the expected output images.

Ground truths	Segmentation maps	Synthesized results

Citation

If you use this code for your research, please cite our papers.

@inproceedings{wang2018pix2pixHD,
   title={High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs},
   author={Ting-Chun Wang and Ming-Yu Liu and Jun-Yan Zhu and Andrew Tao and Jan Kautz and Bryan Catanzaro},  
   booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
   year={2018}
}