Self-Supervised Visibility Learning for Novel View Synthesis

June 28, 2021 · View on GitHub

This contains the codes for cross-view geo-localization method described in: Self-Supervised Visibility Learning for Novel View Synthesis, CVPR2021. alt text

Abstract

We address the problem of novel view synthesis (NVS) from a few sparse source view images. Conventional image-based rendering methods estimate scene geometry and synthesize novel views in two separate steps. However, erroneous geometry estimation will decrease NVS performance as view synthesis highly depends on the quality of estimated scene geometry. In this paper, we propose an end-to-end NVS framework to eliminate the error propagation issue. To be specific, we construct a volume under the target view and design a source-view visibility estimation (SVE) module to determine the visibility of the target-view voxels in each source view. Next, we aggregate the visibility of all source views to achieve a consensus volume. Each voxel in the consensus volume indicates a surface existence probability. Then, we present a soft ray-casting (SRC) mechanism to find the most front surface in the target view (\ie, depth). Specifically, our SRC traverses the consensus volume along viewing rays and then estimates a depth probability distribution. We then warp and aggregate source view pixels to synthesize a novel view based on the estimated source-view visibility and target-view depth. At last, our network is trained in an end-to-end self-supervised fashion, thus significantly alleviating error accumulation in view synthesis.
Experimental results demonstrate that our method generates novel views in higher quality compared to the state-of-the-art.

Experiment Dataset

We use two existing dataset to do the experiments

Codes

The codes borrow heavily from https://github.com/YoYo000/MVSNet and https://github.com/yhw-yhw/D2HC-RMVSNet.

We use Tensorflow==1.13.1, cuda==10.0.

Please first download the pretrained weights of VGG19 "imagenet-vgg-verydeep-19.mat" here, and put it under the folder of "Perloss". This is for the perceptual loss.

To train the model, please run:

python trainNVS.py --data_root_dir YOURDATAROOTDIRdatasetnameTanksandTemplesviewnumYOUR_DATA_ROOT_DIR --dataset_name TanksandTemples --view_num VIEW_NUM --max_d $MAX_D --max_w 448 --max_h 256

For testing on the Tanks and Temples:

python testNVS.py --data_root_dir YOURDATAROOTDIRdatasetnameTanksandTemplesviewnumYOUR_DATA_ROOT_DIR --dataset_name TanksandTemples --view_num VIEW_NUM --max_d MAXDmaxw448maxh256outputdirMAX_D --max_w 448 --max_h 256 --output_dir YOUR_OUTPUT_DIR --dataset Truck

python testNVS.py --data_root_dir YOURDATAROOTDIRdatasetnameTanksandTemplesviewnumYOUR_DATA_ROOT_DIR --dataset_name TanksandTemples --view_num VIEW_NUM --max_d MAXDmaxw448maxh256outputdirMAX_D --max_w 448 --max_h 256 --output_dir YOUR_OUTPUT_DIR --dataset Train

python testNVS.py --data_root_dir YOURDATAROOTDIRdatasetnameTanksandTemplesviewnumYOUR_DATA_ROOT_DIR --dataset_name TanksandTemples --view_num VIEW_NUM --max_d MAXDmaxw448maxh256outputdirMAX_D --max_w 448 --max_h 256 --output_dir YOUR_OUTPUT_DIR --dataset M60

python testNVS.py --data_root_dir YOURDATAROOTDIRdatasetnameTanksandTemplesviewnumYOUR_DATA_ROOT_DIR --dataset_name TanksandTemples --view_num VIEW_NUM --max_d MAXDmaxw448maxh256outputdirMAX_D --max_w 448 --max_h 256 --output_dir YOUR_OUTPUT_DIR --dataset Playground

For testing on the DTU dataset:

python testNVS.py --data_root_dir YOURDATAROOTDIRdatasetnameDTUviewnumYOUR_DATA_ROOT_DIR --dataset_name DTU --view_num VIEW_NUM --max_d MAXDmaxw320maxh256outputdirMAX_D --max_w 320 --max_h 256 --output_dir YOUR_OUTPUT_DIR

SSIM and PSNR

For evaluation on the Tanks and Temples:

python metrics_tf.py --dataset_name TanksandTemples --view_num VIEWNUMmaxdVIEW_NUM --max_d MAX_D --max_w 448 --max_h 256 --output_dir $YOUR_OUTPUT_DIR --dataset Truck

python metrics_tf.py --dataset_name TanksandTemples --view_num VIEWNUMmaxdVIEW_NUM --max_d MAX_D --max_w 448 --max_h 256 --output_dir $YOUR_OUTPUT_DIR --dataset Train

python metrics_tf.py --dataset_name TanksandTemples --view_num VIEWNUMmaxdVIEW_NUM --max_d MAX_D --max_w 448 --max_h 256 --output_dir $YOUR_OUTPUT_DIR --dataset M60

python metrics_tf.py --dataset_name TanksandTemples --view_num VIEWNUMmaxdVIEW_NUM --max_d MAX_D --max_w 448 --max_h 256 --output_dir $YOUR_OUTPUT_DIR --dataset Playground

For evaluation on the DTU dataset:

python metrics_tf.py --dataset_name DTU --view_num VIEWNUMmaxdVIEW_NUM --max_d MAX_D --max_w 320 --max_h 256 --output_dir $YOUR_OUTPUT_DIR

LIPIS

The codes for LIPIS evaluation are from LIPIS. Please install the package first: pip install lpips we use torch==1.4.0, torchvision==0.5.0

For evaluation on the Tanks and Temples:

python metrics_LIPIS.py --dataset_name TanksandTemples --view_num VIEWNUMmaxdVIEW_NUM --max_d MAX_D --max_w 448 --max_h 256 --output_dir $YOUR_OUTPUT_DIR --dataset Truck

python metrics_LIPIS.py --dataset_name TanksandTemples --view_num VIEWNUMmaxdVIEW_NUM --max_d MAX_D --max_w 448 --max_h 256 --output_dir $YOUR_OUTPUT_DIR --dataset Train

python metrics_LIPIS.py --dataset_name TanksandTemples --view_num VIEWNUMmaxdVIEW_NUM --max_d MAX_D --max_w 448 --max_h 256 --output_dir $YOUR_OUTPUT_DIR --dataset M60

python metrics_LIPIS.py --dataset_name TanksandTemples --view_num VIEWNUMmaxdVIEW_NUM --max_d MAX_D --max_w 448 --max_h 256 --output_dir $YOUR_OUTPUT_DIR --dataset Playground

For evaluation on the DTU dataset:

python metrics_LIPIS.py --dataset_name DTU --view_num VIEWNUMmaxdVIEW_NUM --max_d MAX_D --max_w 320 --max_h 256 --output_dir $YOUR_OUTPUT_DIR

Models:

Our trained model is available in here.

Tensorflow 2.0

If you are using tensorflow>=2.0, please refer to here to update the codes.

Publications

This work is published in CVPR 2021.
[Self-Supervised Visibility Learning for Novel View Synthesis]

If you are interested in our work and use our code, we are pleased that you can cite the following publication:
Yujiao Shi, Hongdong Li, Xin Yu. Self-Supervised Visibility Learning for Novel View Synthesis.

@inproceedings{shi2021selfsupervised, title={Self-Supervised Visibility Learning for Novel View Synthesis.*}, author={Yujiao Shi and Hongdong Li and Xin Yu}, booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition}, year={2021} }