MOSE: Complex Video Object Segmentation Dataset

April 14, 2026


🔥 MOSEv2: A More Challenging Dataset for Video Object Segmentation in Complex Scenes

If you want to test your VOS model's performance in real-world complex scenarios, MOSEv2 is the right choice. Here are some cases from MOSEv2.

MOSEv1: A New Dataset for Video Object Segmentation in Complex Scenes

News

[2026/04/01] MOSEv2 is now supported by FiftyOne for visualization and downloading. Thanks!
[2025/08/07] MOSEv2 dataset has been released! 🔥🎉🚀✨🎊🌟💫🎈
[2023/02/09] MOSEv1 dataset has been released!

Download

MOSEv2 Dataset

🤗 Hugging Face
☁️ Baidu Pan (pwd: p2m6)
☁️ Google Drive
☁️ OneDrive

MOSEv1 Dataset

🤗 Hugging Face
☁️ OneDrive
☁️ Google Drive
☁️ Baidu Pan (pwd: MOSE)

The dataset follows a structure similar to DAVIS and YouTube-VOS. It consists of two parts: JPEGImages, which holds the frame images, and Annotations, which contains the corresponding segmentation masks. Frames are named with five-digit numbers (e.g., 00000.jpg). Annotations are saved as color-palette mode PNGs, as in DAVIS.

Please note that the training set provides annotations for all frames, while the validation set provides annotations for the first frame only.

<train/valid.tar>
│
├── Annotations
│ │ 
│ ├── <video_name_1>
│ │ ├── 00000.png
│ │ ├── 00001.png
│ │ └── ...
│ │ 
│ ├── <video_name_2>
│ │ ├── 00000.png
│ │ ├── 00001.png
│ │ └── ...
│ │ 
│ ├── <video_name_...>
│ 
└── JPEGImages
  │ 
  ├── <video_name_1>
  │ ├── 00000.jpg
  │ ├── 00001.jpg
  │ └── ...
  │ 
  ├── <video_name_2>
  │ ├── 00000.jpg
  │ ├── 00001.jpg
  │ └── ...
  │ 
  └── <video_name_...>
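
The layout above can be walked with a few lines of standard-library Python. The following is a minimal sketch, not part of the official MOSE tooling: the function name `pair_frames_with_masks` and the assumption that the archive has been extracted into a folder containing `JPEGImages` and `Annotations` are ours. It pairs each frame with its mask when one exists and uses `None` otherwise, which covers the validation split where only the first frame is annotated.

```python
from pathlib import Path
from typing import Dict, List, Optional, Tuple

# One (frame, mask) pair; mask is None when no annotation file exists.
FramePair = Tuple[Path, Optional[Path]]


def pair_frames_with_masks(split_root: Path) -> Dict[str, List[FramePair]]:
    """Map each video name to its frames, each paired with a mask path or None.

    `split_root` is assumed to be the extracted train/valid folder that
    contains the `JPEGImages` and `Annotations` directories.
    """
    images_dir = split_root / "JPEGImages"
    masks_dir = split_root / "Annotations"
    pairs: Dict[str, List[FramePair]] = {}

    for video_dir in sorted(p for p in images_dir.iterdir() if p.is_dir()):
        video_pairs: List[FramePair] = []
        # Zero-padded five-digit names sort correctly as plain strings.
        for frame in sorted(video_dir.glob("*.jpg")):
            mask = masks_dir / video_dir.name / (frame.stem + ".png")
            video_pairs.append((frame, mask if mask.is_file() else None))
        pairs[video_dir.name] = video_pairs

    return pairs
```

For example, `pair_frames_with_masks(Path("MOSE/valid"))` (a hypothetical extraction path) returns a dictionary keyed by video name. In the validation split, only the first pair of each video would be expected to carry a mask path.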

BibTeX

Please consider citing MOSE if it helps your research.

@article{MOSEv2,
    title={{MOSEv2}: A More Challenging Dataset for Video Object Segmentation in Complex Scenes},
    author={Ding, Henghui and Ying, Kaining and Liu, Chang and He, Shuting and Jiang, Xudong and Jiang, Yu-Gang and Torr, Philip HS and Bai, Song},
    journal={arXiv preprint arXiv:2508.05630},
    year={2025}
}

@inproceedings{MOSE,
    title={{MOSE}: A New Dataset for Video Object Segmentation in Complex Scenes},
    author={Ding, Henghui and Liu, Chang and He, Shuting and Jiang, Xudong and Torr, Philip HS and Bai, Song},
    booktitle={ICCV},
    year={2023}
}

License

MOSE is licensed under a CC BY-NC-SA 4.0 License. The data is released for non-commercial research purposes only.