MOSE: Complex Video Object Segmentation Dataset
April 14, 2026
Quick Links
MOSEv2: A More Challenging Dataset for Video Object Segmentation in Complex Scenes
If you want to test your VOS model's performance in real-world complex scenarios, MOSEv2 is the right choice. Here are some cases from MOSEv2.
- Download Dataset
- Homepage
- MOSEv2 Paper (arXiv)
- Evaluation Server
- Baseline Model: SAM2RCMS
- Download Baseline Model
MOSEv1: A New Dataset for Video Object Segmentation in Complex Scenes
News
- [2026/04/01] MOSEv2 is now supported by FiftyOne for visualization and downloading. Thanks!
- [2025/08/07] MOSEv2 dataset has been released!
- [2023/02/09] MOSEv1 dataset has been released!
Download
MOSEv2 Dataset
- Hugging Face
- Baidu Pan (pwd: p2m6)
- Google Drive
- OneDrive
MOSEv1 Dataset
- Hugging Face
- OneDrive
- Google Drive
- Baidu Pan (pwd: MOSE)
File Structure
The dataset follows a structure similar to DAVIS and YouTube-VOS. It consists of two parts: JPEGImages, which holds the frame images, and Annotations, which contains the corresponding segmentation masks. Frame images are named with five-digit, zero-padded numbers. Annotations are saved as color-palette mode PNGs, as in DAVIS.
Please note that while annotations for all frames in the training set are provided, annotations for the validation set will only include the first frame.
<train/valid.tar>
│
├── Annotations
│   │
│   ├── <video_name_1>
│   │   ├── 00000.png
│   │   ├── 00001.png
│   │   └── ...
│   │
│   ├── <video_name_2>
│   │   ├── 00000.png
│   │   ├── 00001.png
│   │   └── ...
│   │
│   └── <video_name_...>
│
└── JPEGImages
    │
    ├── <video_name_1>
    │   ├── 00000.jpg
    │   ├── 00001.jpg
    │   └── ...
    │
    ├── <video_name_2>
    │   ├── 00000.jpg
    │   ├── 00001.jpg
    │   └── ...
    │
    └── <video_name_...>
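As a rough illustration of how this layout can be walked, the sketch below pairs each frame with its mask using only the Python standard library. The helper names `list_videos` and `list_frame_pairs` are hypothetical and not part of any official MOSE tooling. The code assumes the archive has already been extracted to a local directory.

```python
from pathlib import Path


def list_videos(split_root):
    """Return the sorted video names found under <split_root>/JPEGImages."""
    image_root = Path(split_root) / "JPEGImages"
    return sorted(p.name for p in image_root.iterdir() if p.is_dir())


def list_frame_pairs(split_root, video_name):
    """Return (frame_path, mask_path_or_None) tuples for one video, in frame order.

    Assumes the layout described above:
      <split_root>/JPEGImages/<video_name>/00000.jpg
      <split_root>/Annotations/<video_name>/00000.png
    """
    split_root = Path(split_root)
    frame_dir = split_root / "JPEGImages" / video_name
    mask_dir = split_root / "Annotations" / video_name

    pairs = []
    # Zero-padded names sort correctly as plain strings.
    for frame_path in sorted(frame_dir.glob("*.jpg")):
        mask_path = mask_dir / (frame_path.stem + ".png")
        # Validation videos only ship a first-frame mask, so a missing
        # mask is treated as expected (None) rather than as an error.
        pairs.append((frame_path, mask_path if mask_path.exists() else None))
    return pairs
```

For a training video, every pair should carry a mask path. For a validation video, only the first pair is expected to. If Pillow is installed, opening a mask with `Image.open(mask_path)` should yield a palette-mode image whose pixel values are object indices. Treating index 0 as background follows the DAVIS convention and is an assumption here, not something stated above.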
BibTeX
Please consider citing MOSE if it helps your research.
@article{MOSEv2,
title={{MOSEv2}: A More Challenging Dataset for Video Object Segmentation in Complex Scenes},
author={Ding, Henghui and Ying, Kaining and Liu, Chang and He, Shuting and Jiang, Xudong and Jiang, Yu-Gang and Torr, Philip HS and Bai, Song},
journal={arXiv preprint arXiv:2508.05630},
year={2025}
}
@inproceedings{MOSE,
title={{MOSE}: A New Dataset for Video Object Segmentation in Complex Scenes},
author={Ding, Henghui and Liu, Chang and He, Shuting and Jiang, Xudong and Torr, Philip HS and Bai, Song},
booktitle={ICCV},
year={2023}
}
License
MOSE is licensed under a CC BY-NC-SA 4.0 License. The MOSE data is released for non-commercial research purposes only.