PatchMatch-RL: Deep MVS with Pixelwise Depth, Normal, and Visibility
October 12, 2021
Jae Yong Lee, Joseph DeGol, Chuhang Zou, Derek Hoiem
Installation
To install the necessary Python packages for our work:
conda install pytorch torchvision numpy matplotlib pandas tqdm tensorboard cudatoolkit=11.1 -c pytorch -c conda-forge
pip install opencv-python tabulate moviepy openpyxl pyntcloud open3d==0.9 pytorch-lightning==1.4.9
To set up the datasets for training (BlendedMVS or DTU; see `--dataset` below), please download:
To set up the datasets for testing, please use:
- ETH3D High-Res (PatchMatchNet pre-processed sets)
- NOTE: We use our own pre-processing script. We are currently preparing the code for it and will post an update once it is available.
- Tanks and Temples (MVSNet pre-processed sets)
Training
To train our method:
python bin/train.py --experiment_name=EXPERIMENT_NAME \
--log_path=TENSORBOARD_LOG_PATH \
--checkpoint_path=CHECKPOINT_PATH \
--dataset_path=ROOT_PATH_TO_DATA \
--dataset={BlendedMVS,DTU} \
--resume=True # if you want to resume training with the same experiment_name
Testing
Testing our method requires two scripts: the first generates the geometry, and the second fuses the geometry into a point cloud. Geometry generation:
python bin/generate.py --experiment_name=EXPERIMENT_USED_FOR_TRAINING \
--checkpoint_path=CHECKPOINT_PATH \
--epoch_id=EPOCH_ID \
--num_views=NUMBER_OF_VIEWS \
--dataset_path=ROOT_PATH_TO_DATA \
--output_path=PATH_TO_OUTPUT_GEOMETRY \
--width=(optional)WIDTH \
--height=(optional)HEIGHT \
--dataset={ETH3DHR, TanksAndTemples} \
--device=DEVICE
This will generate depth maps, normal maps, and camera data in the folder specified by --output_path, laid out as follows:
OUTPUT_PATH/
EXPERIMENT_NAME/
CHECKPOINT_FILE_NAME/
SCENE_NAME/
000000_camera.pth <-- contains intrinsics / extrinsics
000000_depth_map.pth
000000_normal_map.pth
000000_meta.pth <-- contains src_image ids
...
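As a sketch of how this layout could be navigated downstream, the helper below builds the per-view file paths following the tree above. The function and argument names are hypothetical (not part of the repository's API), and the `.pth` files themselves would be read with `torch.load` in a real pipeline:

```python
from pathlib import Path

def view_output_paths(output_path, experiment_name, checkpoint_name,
                      scene_name, view_idx):
    """Build per-view output file paths following the documented layout.

    Hypothetical helper: mirrors the directory tree above. The .pth files
    are torch-serialized, so they would be opened with torch.load(...).
    """
    scene_dir = Path(output_path) / experiment_name / checkpoint_name / scene_name
    prefix = f"{view_idx:06d}"  # views are zero-padded to six digits
    return {
        "camera": scene_dir / f"{prefix}_camera.pth",        # intrinsics / extrinsics
        "depth_map": scene_dir / f"{prefix}_depth_map.pth",
        "normal_map": scene_dir / f"{prefix}_normal_map.pth",
        "meta": scene_dir / f"{prefix}_meta.pth",            # src image ids
    }
```

For example, `view_output_paths("out", "exp1", "last", "courtyard", 0)["depth_map"]` yields `out/exp1/last/courtyard/000000_depth_map.pth`.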
Once the geometries are generated, we can use the GPU fusion code to fuse them into a point cloud:
python bin/fuse_output.py --output_path=OUTPUT_PATH_USED_IN_GENERATE \
--experiment_name=EXPERIMENT_NAME \
--epoch_id=EPOCH_ID \
--dataset=DATASET \
# fusion related args
--proj_th=PROJECTION_DISTANCE_THRESHOLD \
--dist_th=DISTANCE_THRESHOLD \
--angle_th=ANGLE_THRESHOLD \
--num_consistent=NUM_CONSISTENT_IMAGES \
--target_width=(optional)TARGET_WIDTH \
--target_height=(optional)TARGET_HEIGHT \
--device=DEVICE
The target width / height are useful for fusing depth / normal maps after upsampling.
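To make the fusion thresholds concrete, here is a pure-Python sketch of the kind of per-pixel consistency test they control. This is only an illustration of each threshold's role, not the actual fusion implementation; all names and default values below are hypothetical placeholders:

```python
import math

def is_consistent(proj_dist, depth_ref, depth_src, normal_ref, normal_src,
                  proj_th=2.0, dist_th=0.01, angle_th=15.0):
    """Illustrative consistency check between a reference view's estimate
    and a source view's reprojected estimate.

    proj_dist:  reprojection distance in pixels, compared to --proj_th
    depths:     compared relatively, against --dist_th
    normals:    unit vectors; their angle (degrees) is compared to --angle_th
    Threshold defaults are placeholders, not the paper's values.
    """
    if proj_dist > proj_th:
        return False
    if abs(depth_ref - depth_src) / max(depth_ref, 1e-8) > dist_th:
        return False
    dot = sum(a * b for a, b in zip(normal_ref, normal_src))
    angle = math.degrees(math.acos(max(-1.0, min(1.0, dot))))
    return angle <= angle_th

def enough_support(consistency_flags, num_consistent=3):
    """A point is kept only if at least --num_consistent source views agree."""
    return sum(consistency_flags) >= num_consistent
```

For example, a pixel whose reprojection lands 1 px away with matching depth and normal passes, while one reprojecting 5 px away fails regardless of the other terms.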
We also provide an ETH3D evaluation script:
python bin/evaluate_eth3d.py --eth3d_binary_path=PATH_TO_BINARY_EXE \
--eth3d_gt_path=PATH_TO_GT_MLP_FOLDER \
--output_path=PATH_TO_FOLDER_WITH_POINTCLOUDS \
--experiment_name=NAME_OF_EXPERIMENT \
--epoch_id=EPOCH_OF_CHECKPOINT_TO_LOAD (default last.ckpt)
Resources
Citation
If you use our work in your project, please cite:
@InProceedings{lee2021patchmatchrl,
author = {Lee, Jae Yong and DeGol, Joseph and Zou, Chuhang and Hoiem, Derek},
title = {PatchMatch-RL: Deep MVS with Pixelwise Depth, Normal, and Visibility},
booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision},
month = {October},
year = {2021}
}