Segmenting Moving Objects via an Object-Centric Layered Representation
March 1, 2023
Junyu Xie, Weidi Xie, Andrew Zisserman
Visual Geometry Group, Department of Engineering Science, University of Oxford
In NeurIPS, 2022.
[arXiv] [PDF] [Project Page] [Poster]
Requirements
python=3.8.8,
pytorch=1.9.1,
Pillow,
opencv,
einops (for tensor manipulation),
tensorboardX (for data logging)
Dataset preparation
- DAVIS2016 can be used directly after download.
- For DAVIS2017-motion, RGB sequences are the same as those within the DAVIS2017 dataset. The curated annotations can be downloaded from here.
- Other datasets, such as SegTrackv2, FBMS-59 and MoCA_filter, need to be preprocessed. We follow the same preprocessing protocol as motiongrouping.
- Synthetic datasets (Syn-Train and Syn-Val) can be downloaded from here. Modal annotations are not provided, as they can be generated from the amodal annotations during data loading.
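As an illustration of how modal masks can be derived from amodal ones, here is a minimal sketch. It assumes the amodal masks are stacked front-to-back (layer 0 nearest the camera); that ordering and the function name amodal_to_modal are assumptions for this example, not taken from the repository's dataloader.

```python
import numpy as np

def amodal_to_modal(amodal_masks):
    """Convert amodal masks of shape (K, H, W) into modal (visible-only) masks.

    Assumption: layers are ordered front-to-back, so layer k is occluded
    by every layer with a smaller index.
    """
    amodal = np.asarray(amodal_masks).astype(bool)
    modal = np.zeros_like(amodal)
    # Pixels already claimed by nearer layers.
    occupied = np.zeros(amodal.shape[1:], dtype=bool)
    for k in range(amodal.shape[0]):
        modal[k] = amodal[k] & ~occupied  # keep only the unoccluded part
        occupied |= amodal[k]
    return modal

# Two overlapping objects on a 1x4 image: the front one hides part of the back one.
amodal = np.array([[[1, 1, 0, 0]],
                   [[0, 1, 1, 0]]])
modal = amodal_to_modal(amodal)
```

If the stored layer order is back-to-front instead, the loop would need to run in reverse.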
Optical flow is estimated with the RAFT method. Flow estimation code is also provided in the flow folder.
Once finished, in config.py, modify the dataset paths in setup_dataset and set the corresponding logging paths in setup_path.
To set up your own data:
- Add your own dataset information in setup_dataset in config.py.
- Add your dataset name to the choices in parser.add_argument('--dataset') in train.py and eval.py.
- Add colour palette information for input/output annotations to data/colour_palette.json.
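The second step above can be sketched as follows. The dataset name MyDataset is hypothetical, and the other entries in the list are only the names that appear in this README's example commands; the actual choices in train.py and eval.py may differ.

```python
import argparse

# 'Syn' and 'DAVIS17m' come from the example commands in this README;
# 'MyDataset' is a hypothetical new entry. The real list in train.py / eval.py
# may contain other names.
DATASET_CHOICES = ['Syn', 'DAVIS17m', 'MyDataset']

def build_parser():
    parser = argparse.ArgumentParser()
    # Restricting choices makes a mistyped dataset name fail at parse time.
    parser.add_argument('--dataset', type=str, default='Syn', choices=DATASET_CHOICES)
    return parser

args = build_parser().parse_args(['--dataset', 'MyDataset'])
print(args.dataset)
```

The same name would then need a matching branch in setup_dataset and a palette entry in data/colour_palette.json.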
Training
python train.py --queries 3 --gaps 1,-1 --batch_size 2 --frames 30 --dataset Syn
The flow-only OCLR model pretrained on our synthetic dataset (Syn-Train) can be downloaded from here.
Inference
python eval.py --queries 3 --gaps 1,-1 --batch_size 1 --frames 30 --dataset DAVIS17m \
--resume_path /path/to/ckpt --save_path /path/to/savepath
Here, --resume_path is the checkpoint path, and --save_path is the path where the segmentation results are saved.
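For readers unfamiliar with the --gaps flag, the sketch below shows one plausible reading of a value such as 1,-1: a comma-separated list of frame gaps, where gap g pairs frame t with frame t + g for optical flow. This interpretation and the helper names parse_gaps and frame_pairs are assumptions for illustration; check train.py and the dataloader for the actual behaviour.

```python
def parse_gaps(gaps):
    """Turn a string such as '1,-1' into a list of integer frame gaps."""
    return [int(g) for g in gaps.split(',')]

def frame_pairs(num_frames, gap):
    """List (source, target) frame indices for one gap, dropping out-of-range targets."""
    return [(t, t + gap) for t in range(num_frames) if 0 <= t + gap < num_frames]

gaps = parse_gaps('1,-1')
# Forward and backward pairs for a 3-frame clip.
pairs = {g: frame_pairs(3, g) for g in gaps}
```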
Our segmentation results on several datasets (DAVIS2016, DAVIS2017-motion, SegTrackv2, FBMS-59, MoCA) can also be found here.
Evaluation benchmarks:
- For DAVIS2016, use the DAVIS2016 official evaluator.
- For DAVIS2017-motion, once our curated annotations are downloaded from here, simply replace the Annotations_unsupervised folder in the DAVIS2017 dataset. The DAVIS2017 official evaluator can then be used to evaluate unsupervised VOS performance.
- For MoCA, use the evaluator provided in motiongrouping.
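For a quick sanity check of saved masks before running the official evaluators, the region similarity J (Jaccard index, i.e. intersection over union) can be computed per frame. The sketch below is a stand-alone approximation for debugging only; reported numbers should come from the official evaluators listed above.

```python
import numpy as np

def jaccard(pred, gt):
    """Intersection over union between two binary masks of the same shape."""
    pred = np.asarray(pred).astype(bool)
    gt = np.asarray(gt).astype(bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        # Both masks empty: treated here as perfect agreement.
        return 1.0
    return float(np.logical_and(pred, gt).sum() / union)

pred = np.array([[1, 1, 0, 0]])
gt = np.array([[0, 1, 1, 0]])
score = jaccard(pred, gt)  # 1 overlapping pixel out of 3 in the union
```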
Test-time adaptation
The test-time adaptation process refines the flow-predicted masks with an RGB-based mask propagation process built on DINO features. More information can be found in the dino folder.
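To give a rough idea of feature-based mask propagation, here is a generic sketch: each target-frame location takes a softmax-weighted average of the labels of its top-k most similar reference-frame locations. This is not the code in the dino folder; the function name, the top-k/temperature scheme and the use of plain NumPy arrays in place of real DINO features are all assumptions for illustration.

```python
import numpy as np

def propagate_mask(ref_feats, ref_mask, tgt_feats, topk=5, temperature=0.1):
    """Propagate soft labels from a reference frame to a target frame.

    ref_feats: (N, C) features at N reference locations
    ref_mask:  (N, K) soft labels over K classes at those locations
    tgt_feats: (M, C) features at M target locations
    Returns (M, K) propagated soft labels.
    """
    # Cosine similarity between every target and reference location.
    ref = ref_feats / np.linalg.norm(ref_feats, axis=1, keepdims=True)
    tgt = tgt_feats / np.linalg.norm(tgt_feats, axis=1, keepdims=True)
    affinity = tgt @ ref.T                                   # (M, N)
    # Keep only the top-k most similar reference locations per target.
    idx = np.argsort(-affinity, axis=1)[:, :topk]            # (M, topk)
    top = np.take_along_axis(affinity, idx, axis=1)
    # Numerically stable softmax over the retained similarities.
    w = np.exp((top - top.max(axis=1, keepdims=True)) / temperature)
    w /= w.sum(axis=1, keepdims=True)
    return np.einsum('mk,mkc->mc', w, ref_mask[idx])

# Toy check: identical, mutually orthogonal features reproduce the reference labels.
feats = np.eye(3)
labels = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]])
out = propagate_mask(feats, labels, feats, topk=1)
```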
Citation
If you find the code helpful in your research, please consider citing our work:
@inproceedings{xie2022segmenting,
title = {Segmenting Moving Objects via an Object-Centric Layered Representation},
author = {Junyu Xie and Weidi Xie and Andrew Zisserman},
booktitle = {Advances in Neural Information Processing Systems},
year = {2022}
}