DepthFlow
October 22, 2025
This is the official PyTorch implementation of our paper:
DepthFlow: Exploiting Depth-Flow Structural Correlations for Unsupervised Video Object Segmentation, ICCVW 2025
Suhwan Cho, Minhyeok Lee, Jungho Lee, Donghyeong Kim, Sangyoun Lee
Link: [ICCVW] [arXiv]
You can also find other related papers at awesome-video-object-segmentation.
Abstract
In unsupervised VOS, the scarcity of training data has been a significant bottleneck in achieving high segmentation accuracy. Inspired by observations on two-stream approaches, we introduce a novel data generation method based on the depth-to-flow conversion process. With our flow synthesis protocol, large-scale image-flow-mask triplets can be leveraged during network training. To facilitate future research, we also prepare the DUTSv2 dataset that includes pairs of original images and their corresponding synthetic flow maps.
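The paper's exact depth-to-flow conversion protocol is not reproduced here, but the core idea can be sketched as synthesizing a parallax flow map from a depth map: under a small virtual camera translation, nearby pixels move farther than distant ones. The function below is a minimal illustrative sketch under that assumption; the translation vector `t` and the normalization are hypothetical choices, not the paper's settings.

```python
import numpy as np

def depth_to_flow(depth, t=(8.0, 0.0), eps=1e-6):
    """Synthesize a parallax flow map from a depth map.

    Assumes a purely translational virtual camera motion `t` (in pixels,
    scaled by inverse depth), so closer pixels move farther than distant
    ones. Illustrative sketch only, not the paper's exact protocol.
    """
    inv_depth = 1.0 / (depth + eps)           # closer -> larger motion
    inv_depth = inv_depth / inv_depth.max()   # normalize to [0, 1]
    # Horizontal and vertical flow components, shape (H, W, 2)
    flow = np.stack([t[0] * inv_depth, t[1] * inv_depth], axis=-1)
    return flow

# Toy example: a near "object" on a far background
depth = np.full((4, 4), 10.0)
depth[1:3, 1:3] = 1.0                         # near-object region
flow = depth_to_flow(depth)
```

Pairing such synthetic flow maps with the original images and saliency masks yields the image-flow-mask triplets used for training.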
Setup
1. Download the datasets: DUTS, DAVIS, FBMS, YouTube-Objects, Long-Videos.
2. Estimate and save optical flow maps from the videos using RAFT.
3. For DUTS, simulate optical flow maps from DPT-estimated depth maps.
4. Alternatively, pre-processed versions are provided: DUTSv2, DAVIS, FBMS, YouTube-Objects, Long-Videos.
Running
Training
Start DepthFlow training with:
python run.py --train
Verify the following before running:
✅ Training dataset selection and configuration
✅ GPU availability and configuration
✅ Backbone network selection
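Since training consumes image-flow-mask triplets, a loader along the following lines is one way to organize them. The directory layout, file naming, and `TripletDataset` class here are hypothetical illustrations, not the repo's actual data pipeline.

```python
import os
import torch
from torch.utils.data import Dataset

class TripletDataset(Dataset):
    """Illustrative image-flow-mask triplet loader (hypothetical layout).

    Assumes parallel `image/`, `flow/`, and `mask/` subfolders holding
    pre-saved tensors with matching filenames; the repository's real
    loader may differ.
    """
    def __init__(self, root):
        self.root = root
        self.names = sorted(os.listdir(os.path.join(root, "image")))

    def __len__(self):
        return len(self.names)

    def __getitem__(self, i):
        name = self.names[i]
        image = torch.load(os.path.join(self.root, "image", name))
        flow = torch.load(os.path.join(self.root, "flow", name))
        mask = torch.load(os.path.join(self.root, "mask", name))
        return image, flow, mask
```

Wrapping this in a standard `torch.utils.data.DataLoader` then yields batched triplets for the training loop.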
Testing
Run DepthFlow testing with:
python run.py --test
Verify the following before running:
✅ Testing dataset selection
✅ GPU availability and configuration
✅ Backbone network selection
✅ Pre-trained model path
Attachments
Pre-trained model (mitb0)
Pre-trained model (mitb1)
Pre-trained model (mitb2)
Pre-computed results
Contact
Code and models are only available for non-commercial research purposes.
For questions or inquiries, feel free to contact:
E-mail: suhwanx@gmail.com