pix2pixHD: High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs

October 13, 2021

Project | Paper | Video (5m) | Previous Implementation | Two Minute Papers Video


License

Imaginaire is released under the NVIDIA Software license. For commercial use, please consult researchinquiries@nvidia.com.

Software Installation

For installation, please check out INSTALL.md.

Hardware Requirement

We trained our model on an NVIDIA DGX1 with 8 V100 16GB GPUs, which took about 10 hours.

Training

pix2pixHD prefers the following data structure.

${TRAINING_DATASET_ROOT_FOLDER}
└───images
    └───0001.jpg
    └───0002.jpg
    └───0003.jpg
    ...
└───seg_maps
    └───0001.png
    └───0002.png
    └───0003.png
    ...
└───instance_maps
    └───0001.png
    └───0002.png
    └───0003.png
    ...
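
Before building anything from this layout, it can help to confirm that every sample has a file in each subfolder. The following is a minimal sketch, not part of Imaginaire: the helper name `check_layout` is hypothetical, and it assumes that files are matched across subfolders by their name without extension (as `0001.jpg` and `0001.png` in the tree above suggest).

```python
from pathlib import Path


def check_layout(root, subfolders=("images", "seg_maps", "instance_maps")):
    """Report, per subfolder, which sample names are missing from it.

    Hypothetical helper: samples are assumed to be matched by file stem,
    e.g. images/0001.jpg <-> seg_maps/0001.png <-> instance_maps/0001.png.
    """
    root = Path(root)
    stems = {}
    for name in subfolders:
        folder = root / name
        if not folder.is_dir():
            raise FileNotFoundError(f"missing folder: {folder}")
        # Collect file names without their extension.
        stems[name] = {p.stem for p in folder.iterdir() if p.is_file()}

    # Every stem seen anywhere should appear in every subfolder.
    all_stems = set().union(*stems.values())
    return {name: sorted(all_stems - found) for name, found in stems.items()}
```

An empty list for every subfolder means the folders are consistent. A folder that only holds label maps could be checked the same way by passing `subfolders=("seg_maps", "instance_maps")`.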

### Training data preparation

- Download [the Cityscapes dataset](https://www.cityscapes-dataset.com/).
Extract images, segmentation masks, and object instance masks. Organize them
based on the above data structure. Please check out the original pix2pixHD repo
for converting the Cityscapes segmentation mask to a more compact format.

- Build the LMDBs
```bash
for f in train val; do
    python scripts/build_lmdb.py \
        --config configs/projects/pix2pixhd/cityscapes/ampO1.yaml \
        --data_root dataset/cityscapes_raw/${f} \
        --output_root dataset/cityscapes/${f} \
        --overwrite \
        --paired
done
```

### Training command

```bash
python -m torch.distributed.launch --nproc_per_node=8 train.py \
    --config configs/projects/pix2pixhd/cityscapes/ampO1.yaml
```

Inference

pix2pixHD prefers the following file arrangement for testing.

${TEST_DATASET_ROOT_FOLDER}
└───seg_maps
    └───0001.png
    └───0002.png
    └───0003.png
    ...
└───instance_maps
    └───0001.png
    └───0002.png
    └───0003.png
    ...

- Download sample test data by running
```bash
python scripts/download_test_data.py --model_name pix2pixhd
```

- Run inference
```bash
python inference.py --single_gpu \
    --config configs/projects/pix2pixhd/cityscapes/ampO1.yaml \
    --output_dir projects/pix2pixhd/output/cityscapes
```

The results are stored in `projects/pix2pixhd/output/cityscapes`.

Below we show the expected output images.

| Ground truths | Segmentation maps | Synthesized results |
|---|---|---|
| gt | seg | result |
| gt | seg | result |
| gt | seg | result |

Citation

If you use this code for your research, please cite our papers.

@inproceedings{wang2018pix2pixHD,
   title={High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs},
   author={Ting-Chun Wang and Ming-Yu Liu and Jun-Yan Zhu and Andrew Tao and Jan Kautz and Bryan Catanzaro},  
   booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
   year={2018}
}