Learning Depth from Monocular Videos using Direct Methods

June 19, 2018

Implementation of the methods in "Learning Depth from Monocular Videos using Direct Methods". If you find this code useful, please cite our paper:

@InProceedings{Wang_2018_CVPR,
author = {Wang, Chaoyang and Miguel Buenaposada, José and Zhu, Rui and Lucey, Simon},
title = {Learning Depth From Monocular Videos Using Direct Methods},
booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2018}
}

Dependencies

  • Python 3.6

  • PyTorch 0.3.1 (later or earlier versions of PyTorch are not compatible.)

  • visdom, dominate
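Because the code is tied to one PyTorch release, it can help to check the installed version before training. The snippet below is a minimal sketch and not part of the repository; the helper names `parse_version` and `is_supported` are made up for illustration, and treating builds such as `0.3.1.post2` as acceptable is an assumption.

```python
import re

REQUIRED = (0, 3, 1)  # release named in the dependency list above


def parse_version(version):
    """Return the leading (major, minor, patch) numbers of a version string."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unrecognized version string: %r" % version)
    return tuple(int(part) for part in match.groups())


def is_supported(version, required=REQUIRED):
    """True only when the version matches the required release exactly."""
    return parse_version(version) == required


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed")
    else:
        status = "ok" if is_supported(torch.__version__) else "unsupported"
        print("PyTorch %s: %s" % (torch.__version__, status))
```

Only the first three numeric components are compared, so local build suffixes are ignored.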

Training

data preparation

We refer to "SfMLearner" for preparing the training data from KITTI. We assume the processed data is placed in the directory "./data_kitti/".

training with different pose prediction modules

Start the visdom server before launching training so that you can inspect the learning progress.

python -m visdom.server -port 8009
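To confirm that something is listening on port 8009 before launching a training script, a plain TCP probe is enough. This is a rough sketch using only the standard library; the function name `server_is_listening` is hypothetical, and the probe only shows that the port accepts connections, not that the listener is actually visdom.

```python
import socket


def server_is_listening(host="localhost", port=8009, timeout=1.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


if __name__ == "__main__":
    if server_is_listening():
        print("port 8009 is reachable")
    else:
        print("nothing is listening on port 8009; start the visdom server first")
```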
  1. train from scratch with PoseNet

bash run_train_posenet.sh

See run_train_posenet.sh for details.

  2. finetune with DDVO

Use the pretrained PoseNet to initialize DDVO. This corresponds to the results reported as "PoseNet+DDVO" in the paper.

bash run_train_finetune.sh

See run_train_finetune.sh for details.

Testing

CUDA_VISIBLE_DEVICES=0 nice -10 python src/testKITTI.py --dataset_root $DATAROOT --ckpt_file $CKPT --output_path $OUTPUT --test_file_list test_files_eigen.txt
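The command above expects `DATAROOT`, `CKPT` and `OUTPUT` to be set in the shell. As a sketch of one way to fail early when a variable is missing, the wrapper below assembles the same argument list in Python. The function names `build_test_command` and `command_from_env` are hypothetical; only the script path and flags are taken from the command above, and the `nice -10` prefix is left out.

```python
import os


def build_test_command(dataset_root, ckpt_file, output_path,
                       test_file_list="test_files_eigen.txt"):
    """Assemble the argument list for src/testKITTI.py."""
    return [
        "python", "src/testKITTI.py",
        "--dataset_root", dataset_root,
        "--ckpt_file", ckpt_file,
        "--output_path", output_path,
        "--test_file_list", test_file_list,
    ]


def command_from_env(env=None):
    """Read DATAROOT, CKPT and OUTPUT, raising if any of them is unset."""
    env = os.environ if env is None else env
    missing = [name for name in ("DATAROOT", "CKPT", "OUTPUT")
               if not env.get(name)]
    if missing:
        raise KeyError("unset variables: " + ", ".join(missing))
    return build_test_command(env["DATAROOT"], env["CKPT"], env["OUTPUT"])
```

The resulting list can be passed to `subprocess.run(..., check=True)` with `CUDA_VISIBLE_DEVICES` set in the `env` argument to select the GPU.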

Evaluation

We again refer to "SfMLearner" for their evaluation code.
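For orientation, the error and accuracy measures usually reported for monocular depth on KITTI can be written in a few lines. This is an illustrative pure-Python sketch, not the SfMLearner evaluation code; the function name `depth_metrics` is made up, inputs are assumed to be flat lists of paired positive depths, and steps that evaluation scripts typically apply beforehand (cropping, depth capping, scale alignment) are not shown.

```python
import math


def depth_metrics(gt, pred):
    """Compute common depth metrics over paired positive depth values."""
    if len(gt) != len(pred) or not gt:
        raise ValueError("gt and pred must be non-empty and of equal length")
    n = len(gt)
    pairs = list(zip(gt, pred))
    metrics = {
        # Mean absolute and squared error, each relative to ground truth.
        "abs_rel": sum(abs(g - p) / g for g, p in pairs) / n,
        "sq_rel": sum((g - p) ** 2 / g for g, p in pairs) / n,
        # Root mean squared error in linear and log depth.
        "rmse": math.sqrt(sum((g - p) ** 2 for g, p in pairs) / n),
        "rmse_log": math.sqrt(
            sum((math.log(g) - math.log(p)) ** 2 for g, p in pairs) / n),
    }
    # Fraction of points whose ratio max(gt/pred, pred/gt) is below 1.25^k.
    ratios = [max(g / p, p / g) for g, p in pairs]
    for k in (1, 2, 3):
        metrics["a%d" % k] = sum(r < 1.25 ** k for r in ratios) / n
    return metrics
```

Lower is better for the four error terms, and higher is better for `a1`, `a2` and `a3`.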

Acknowledgement

Part of the code structure is borrowed from "PyTorch CycleGAN".