DepthFlow
October 22, 2025
This is the official PyTorch implementation of our paper:
DepthFlow: Exploiting Depth-Flow Structural Correlations for Unsupervised Video Object Segmentation, ICCVW 2025
Suhwan Cho, Minhyeok Lee, Jungho Lee, Donghyeong Kim, Sangyoun Lee
Link: [ICCVW] [arXiv]
You can also find other related papers at awesome-video-object-segmentation.
Abstract
In unsupervised VOS, the scarcity of training data has been a significant bottleneck in achieving high segmentation accuracy. Inspired by observations on two-stream approaches, we introduce a novel data generation method based on the depth-to-flow conversion process. With our flow synthesis protocol, large-scale image-flow-mask triplets can be leveraged during network training. To facilitate future research, we also prepare the DUTSv2 dataset that includes pairs of original images and their corresponding synthetic flow maps.
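The paper's exact depth-to-flow conversion protocol is not reproduced here, but the core idea can be sketched as synthesizing a parallax flow map from a depth map: under a small virtual camera translation, nearby pixels move farther than distant ones. The function below is a minimal illustrative sketch under that assumption; the translation vector `t` and the normalization are hypothetical choices, not the paper's settings.

```python
import numpy as np

def depth_to_flow(depth, t=(8.0, 0.0), eps=1e-6):
    """Synthesize a parallax flow map from a depth map.

    Assumes a purely translational virtual camera motion `t` (in pixels,
    scaled by inverse depth), so closer pixels move farther than distant
    ones. Illustrative sketch only, not the paper's exact protocol.
    """
    inv_depth = 1.0 / (depth + eps)           # closer -> larger motion
    inv_depth = inv_depth / inv_depth.max()   # normalize to [0, 1]
    # Horizontal and vertical flow components, shape (H, W, 2)
    flow = np.stack([t[0] * inv_depth, t[1] * inv_depth], axis=-1)
    return flow

# Toy example: a near "object" on a far background
depth = np.full((4, 4), 10.0)
depth[1:3, 1:3] = 1.0                         # near-object region
flow = depth_to_flow(depth)
```

Pairing such synthetic flow maps with the original images and saliency masks yields the image-flow-mask triplets used for training.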
Setup
1. Download the datasets: DUTS, DAVIS, FBMS, YouTube-Objects, Long-Videos.
2. Estimate and save optical flow maps from the videos using RAFT.
3. For DUTS, simulate optical flow maps from DPT-estimated depth maps.
4. Alternatively, pre-processed versions are provided: DUTSv2, DAVIS, FBMS, YouTube-Objects, Long-Videos.
Running
Training
Start DepthFlow training with:
python run.py --train
Verify the following before running:
✅ Training dataset selection and configuration
✅ GPU availability and configuration
✅ Backbone network selection
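Since training consumes image-flow-mask triplets, a loader along the following lines is one way to organize them. The directory layout, file naming, and `TripletDataset` class here are hypothetical illustrations, not the repo's actual data pipeline.

```python
import os
import torch
from torch.utils.data import Dataset

class TripletDataset(Dataset):
    """Illustrative image-flow-mask triplet loader (hypothetical layout).

    Assumes parallel `image/`, `flow/`, and `mask/` subfolders holding
    pre-saved tensors with matching filenames; the repository's real
    loader may differ.
    """
    def __init__(self, root):
        self.root = root
        self.names = sorted(os.listdir(os.path.join(root, "image")))

    def __len__(self):
        return len(self.names)

    def __getitem__(self, i):
        name = self.names[i]
        image = torch.load(os.path.join(self.root, "image", name))
        flow = torch.load(os.path.join(self.root, "flow", name))
        mask = torch.load(os.path.join(self.root, "mask", name))
        return image, flow, mask
```

Wrapping this in a standard `torch.utils.data.DataLoader` then yields batched triplets for the training loop.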
Testing
Run DepthFlow testing with:
python run.py --test
Verify the following before running:
✅ Testing dataset selection
✅ GPU availability and configuration
✅ Backbone network selection
✅ Pre-trained model path
Attachments
Pre-trained model (mitb0)
Pre-trained model (mitb1)
Pre-trained model (mitb2)
Pre-computed results
Contact
Code and models are only available for non-commercial research purposes.
For questions or inquiries, feel free to contact:
E-mail: suhwanx@gmail.com