[TNNLS 2024] Learning Motion and Temporal Cues for Unsupervised Video Object Segmentation
April 26, 2025
Updates
- [2025-04-26] The training code is now available.
- [2024-06-20] Our paper is accepted by TNNLS 2024.
- [2024-04-16] Repository initialized; test code and trained model released.
Demo

Get Started
Environment
- python == 3.8.15
- torch == 1.10.0
- torchvision == 0.11.0
- cuda == 11.4
- opencv == 4.6.0
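The environment above can be set up with conda and pip, for example. This is a minimal sketch: the environment name `mtnet` and the exact pip package names are assumptions, while the versions come from the list above.

```shell
# Create an isolated environment (the name "mtnet" is an assumption)
conda create -n mtnet python=3.8.15 -y
conda activate mtnet

# Install the pinned dependencies. PyTorch ships cu113 wheels for 1.10.0,
# which is the closest official build to the listed CUDA 11.4
pip install torch==1.10.0+cu113 torchvision==0.11.0+cu113 \
    -f https://download.pytorch.org/whl/cu113/torch_stable.html
pip install opencv-python==4.6.0.66
```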
Datasets
Please download the following datasets:
UVOS datasets:
- YouTube-VOS
- DAVIS
- YouTube-Objects
- FBMS
- LongVideos
VSOD datasets:
- DAVIS: same as UVOS.
- DAVSOD
- SegTrack-V2
- ViSal
To quickly reproduce our results, we have uploaded the processed data to Google Drive and Baidu Disk (code: qcbh).
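A plausible directory layout for the processed data, inferred from the `../data/DAVIS16/val/mask` path used by the evaluation script below; the exact structure of the released archives may differ.

```
data/
└── DAVIS16/
    ├── train/
    └── val/
        ├── image/   # RGB frames (subfolder name is an assumption)
        └── mask/    # ground-truth annotation masks
```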
Models
| stage | model link |
|---|---|
| pre-train | Google Drive, Baidu Disk (code: qcbh) |
| fine-tuning | Google Drive, Baidu Disk (code: qcbh) |
To reproduce the results reported in the paper, please download the corresponding models and run the test script.
Training
Distributed Training.
sh train_m.sh
Single-GPU Training.
sh train_s.sh
Testing
Download the trained MTNet model and place it in ./saves.
python test.py [test_model] [task_name] [test_dataset] [output_dir]
Testing for UVOS task:
python test.py --test_model ./saves/mtnet.pth --task_name UVOS --test_dataset DAVIS16 --output_dir output
Testing for VSOD task:
python test.py --test_model ./saves/mtnet.pth --task_name VSOD --test_dataset DAVIS16 --output_dir output
Results
Baidu Disk (code: qcbh)
Evaluation
Evaluation for UVOS results:
python test_scripts/test_for_davis.py --gt_path ../data/DAVIS16/val/mask --result_path output/MTNet/UVOS/DAVIS16/
Evaluation for VSOD results:
python test_scripts/test_vsod/main.py --method MTNet --dataset DAVIS16 --gt_dir test_scripts/test_vsod/gt/ --pred_dir test_scripts/test_vsod/results/
Visualization
Specify the dataset in visualize.py, then run:
python visualize.py

References
This repository builds upon the excellent work of the following projects:
- STCN: https://github.com/hkchengrex/STCN
- AOT: https://github.com/yoxu515/aot-benchmark
- HFAN: https://github.com/NUST-Machine-Intelligence-Laboratory/HFAN
- FSNet: https://github.com/GewelsJI/FSNet
- AMCNet: https://github.com/isyangshu/AMC-Net
- DAVSOD: https://github.com/DengPingFan/DAVSOD
Many thanks to their invaluable contributions.
BibTeX
@article{zhuge2024learning,
  title={Learning Motion and Temporal Cues for Unsupervised Video Object Segmentation},
  author={Zhuge, Yunzhi and Gu, Hongyu and Zhang, Lu and Qi, Jinqing and Lu, Huchuan},
  journal={IEEE Transactions on Neural Networks and Learning Systems},
  year={2024},
  publisher={IEEE}
}