June 6, 2025

Data

The tables below list the datasets that provide the training annotations, along with download links for their corresponding image and video sources:

Images (Label)

Dataset	Link
LVIS	https://cocodataset.org/#download (train2017)
obj365	https://www.objects365.org/overview.html
openimages	https://storage.googleapis.com/openimages/web/index.html
PACO	https://cocodataset.org/#download (train2017)
V3Det	https://v3det.openxlab.org.cn/

Images (Caption)

Dataset	Link
RefCOCO	https://bvisionweb1.cs.unc.edu/licheng/referit/data/refcoco.zip
RefCOCO+	https://bvisionweb1.cs.unc.edu/licheng/referit/data/refcoco+.zip
RefCOCOg	https://bvisionweb1.cs.unc.edu/licheng/referit/data/refcocog.zip
RefText	https://github.com/Buki2/STAN
Visual Genome	https://homes.cs.washington.edu/~ranjay/visualgenome/index.html
GRES	https://cocodataset.org/#download (train2014)
Google_Refexp	https://cocodataset.org/#download (train2014)
Rexverse-2M	https://huggingface.co/datasets/IDEA-Research/Rexverse-2M

Videos

The table below lists the video datasets that provide training annotations, along with download links for their corresponding video sources. Note: for each video source distributed as .mp4, first use extract_mp4_frames.py to extract frames.

Dataset	Link
A2D	https://kgavrilyuk.github.io/publication/actor_action/
BenSMOT	https://github.com/HengLan/SMOT
DAVIS17	https://davischallenge.org/davis2017/code.html
HC-STVG	https://github.com/tzhhhh123/HC-STVG
LV-VIS	https://github.com/haochenheheda/LVVIS
SA-V	https://ai.meta.com/datasets/segment-anything-video/
VidSTG	https://github.com/Guaranteer/VidSTG-Dataset
YoutubeVOS	https://youtube-vos.org/
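
For orientation, the frame-extraction step mentioned above can be sketched as follows. This is not the repository's extract_mp4_frames.py, whose options and output layout are not described here. It is a minimal illustration that assumes ffmpeg is installed and on PATH. The function names, the one-folder-per-video layout, the `videos` and `frames` directories, and the zero-padded JPEG naming are all hypothetical choices.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(video_path, frames_dir, quality=2):
    """Build an ffmpeg command that writes the frames of one video as JPEGs."""
    video_path = Path(video_path)
    # Assumed layout: one sub-folder per video, named after the file stem.
    out_dir = Path(frames_dir) / video_path.stem
    return [
        "ffmpeg",
        "-i", str(video_path),
        "-q:v", str(quality),    # JPEG quality scale; lower values mean higher quality
        "-start_number", "0",    # first frame is written as 00000.jpg
        str(out_dir / "%05d.jpg"),
    ]


def extract_frames(video_path, frames_dir):
    """Create the output folder and run ffmpeg (requires ffmpeg on PATH)."""
    cmd = build_ffmpeg_command(video_path, frames_dir)
    Path(cmd[-1]).parent.mkdir(parents=True, exist_ok=True)
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical paths; adjust to where the .mp4 files were downloaded.
    for mp4 in sorted(Path("videos").glob("*.mp4")):
        extract_frames(mp4, "frames")
```

If the annotations expect a specific frame naming or sampling rate, prefer the repository's own script, since this sketch does not attempt to match those conventions.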