Getting Started

January 13, 2026

The OpenScene dataset is a large-scale dataset for end-to-end planning, visual pretraining, and occupancy prediction in the field of autonomous driving.

Based on nuPlan, we provide bounding box, occupancy, and flow annotations in 3D space.

OpenScene v1.1

:fire: Change Log

  • We reorganized the metadata files, grouping them by their nuPlan log files to improve usability.
  • We added more logs that have sensor data and uploaded the LiDAR raw sensor data.

:exclamation: Must Read for the CVPR 2024 Challenge

  • For Track End-to-End Driving at Scale, please download the meta_data and the camera and/or LiDAR sensor data, depending on the modalities you intend to use. Note that there is no separate competition track for camera-only planners.

  • For Track Predictive World Model, please download the meta_data and both the camera and LiDAR sensor data.

  • The private test set utilized in the challenge leaderboards is exclusively provided by Motional and should not be confused with the test set.

  • The private test sets for the two tracks are distinct and do not share any data.

  • The input data (metadata, sensors) for the private test set will be accessible once the test server opens. The ground truth data will only be available on the test server operated by Motional.

Download Data

  • We recommend downloading all data from Hugging Face, or from ModelScope for users in China.

  • The sensor data for the trainval and test subsets together amounts to approximately 2 TB. We recommend initially training and validating your models on the mini set.

  • :bell: If you already have the nuPlan sensor data (over 20 TB) locally, you can link it directly to the OpenScene folder to avoid redundant downloads. We aligned the folder structure with nuPlan and only downsampled the nuPlan sensor data to improve accessibility.

  • :bell: If you already have the OpenScene v1.0 image data, you can use it for OpenScene v1.1 as well, since almost all (>98%) of the data is present. If you want to use the occupancy labels, please also download them from OpenScene v1.0. Only a few frames added in v1.1 are missing from the v1.0 data. You can temporarily ignore those frames during training.
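
For the linking option mentioned above, a minimal sketch might look like the following. It assumes the nuPlan sensor data is stored as one folder per log and that the OpenScene sensor folder expects the same per-log folders. The function name and both paths are hypothetical, so check them against the dataset stats before use.

```python
import os
from pathlib import Path


def link_sensor_data(nuplan_sensor_root, openscene_sensor_root):
    """Symlink each per-log folder from a local nuPlan copy into the OpenScene folder.

    Both arguments are hypothetical paths; the per-log layout is an assumption.
    Returns the names of the links that were newly created.
    """
    src_root = Path(nuplan_sensor_root).resolve()
    dst_root = Path(openscene_sensor_root)
    dst_root.mkdir(parents=True, exist_ok=True)

    created = []
    for src in sorted(src_root.iterdir()):
        if not src.is_dir():
            continue
        dst = dst_root / src.name
        if dst.exists() or dst.is_symlink():
            continue  # keep anything that is already downloaded or linked
        os.symlink(src, dst, target_is_directory=True)
        created.append(src.name)
    return created


# Example call with placeholder paths:
# link_sensor_data("/data/nuplan/sensor_blobs", "/data/openscene/sensor_blobs/trainval")
```

Existing folders are never overwritten, so the function can be rerun safely after a partial download.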

mini set

| File Name | Download Link | Size |
| --- | --- | --- |
| openscene_metadata_mini.tgz | ModelScope / Hugging Face | 509.6 MB |
| openscene_sensor_mini_camera | OpenXLab / ModelScope / Hugging Face | 84 GB |
| openscene_sensor_mini_lidar | OpenXLab / ModelScope / Hugging Face | 60 GB |

trainval set

| File Name | Download Link | Size |
| --- | --- | --- |
| openscene_metadata_trainval.tgz | ModelScope / Hugging Face | 6.6 GB |
| openscene_sensor_trainval_camera | OpenXLab / ModelScope / Hugging Face | 1.1 TB |
| openscene_sensor_trainval_lidar | OpenXLab / ModelScope / Hugging Face | 822 GB |

test set

| File Name | Download Link | Size |
| --- | --- | --- |
| openscene_metadata_test.tgz | ModelScope / Hugging Face | 454 MB |
| openscene_sensor_test_camera | OpenXLab / ModelScope / Hugging Face | 120 GB |
| openscene_sensor_test_lidar | OpenXLab / ModelScope / Hugging Face | 87 GB |

private test set

| File Name | Download Link | Size |
| --- | --- | --- |
| openscene_metadata_private_test_wm.tgz | ModelScope / Hugging Face | 7.3 MB |
| openscene_sensor_private_test_wm.tgz | ModelScope / Hugging Face | 15 GB |
| openscene_metadata_private_test_e2e.tgz | ModelScope / Hugging Face | 4 MB |
| openscene_sensor_private_test_e2e.tgz | ModelScope / Hugging Face | 23.6 GB |
| openscene_metadata_private_test_hard.tar.gz | ModelScope / Hugging Face | 180 KB |
| openscene_sensor_private_test_hard.tar.gz | ModelScope / Hugging Face | 636 MB |
  • private_test_hard is the private test set for the AGC 2025 NAVSIM-v2 challenge.
  • private_test_e2e is the private test set for the End-to-End Driving at Scale track.
  • private_test_wm is the private test set for the Predictive World Model track.
  • [2024-04-09] We fixed some bugs and updated the metadata of private_test_wm. Please replace your copy!

Prepare Dataset

Please follow the steps below to get familiar with the OpenScene v1.1 dataset.

  1. Download all the data manually and extract the archives.
  2. Make sure the filesystem hierarchy is the same as the dataset stats.
  3. Modify and run python DriveEngine/process_data/collect_data.py to collect the meta_data in any custom split.
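
Step 1 can be scripted. The sketch below extracts every .tgz or .tar.gz archive found in a download folder into a target folder. The function name and both directory arguments are illustrative and not part of DriveEngine. It only handles gzip-compressed tar archives, so downloads delivered in another form need separate handling.

```python
import tarfile
from pathlib import Path


def extract_archives(download_dir, output_dir):
    """Extract all .tgz / .tar.gz archives in download_dir into output_dir.

    Returns the names of the archives that were extracted, in sorted order.
    """
    download_dir = Path(download_dir)
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)

    # Use the safer "data" extraction filter when this Python version has it.
    kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}

    extracted = []
    for archive in sorted(download_dir.iterdir()):
        if not archive.name.endswith((".tgz", ".tar.gz")):
            continue
        with tarfile.open(archive, "r:gz") as tar:
            tar.extractall(output_dir, **kwargs)
        extracted.append(archive.name)
    return extracted


# Example call with placeholder paths:
# extract_archives("/data/downloads", "/data/openscene")
```

After extraction, compare the resulting folders with the dataset stats (step 2) before running the collection script.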

OpenScene v1.0

Download Data

We recommend downloading from OpenXLab (OpenDriveLab) and using the provided command line interface (CLI) for acceleration. Google Drive and Baidu Cloud (Baidu Yun) are also available. If you already have the nuPlan dataset, you only need to download the label and meta data.

| Subset | Google Drive | Baidu Cloud | Approx. Size |
| --- | --- | --- | --- |
| mini | image / label | image / label | 81.2 GB / 6.7 GB |
| trainval | image / label | image / label | 1.1 TB / 95.4 GB |
| test | image | image | 118.5 GB |
| meta data | meta file | meta file | 6.4 GB |
  • Mini and trainval data contain three parts -- sensor_blobs (images), meta_data, and occupancy (label).

To ensure the integrity of the downloaded data, we recommend verifying the file using its MD5 checksum after the download is complete.
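
A minimal way to run this check in Python is sketched below. The helper names are illustrative, and the expected checksum must be taken from the download page; none is assumed here. The file is read in chunks so that large archives do not need to fit in memory.

```python
import hashlib


def md5sum(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_md5(path, expected_md5):
    """Compare a file's MD5 digest with the published one (case-insensitive)."""
    return md5sum(path) == expected_md5.strip().lower()


# Example call with placeholder values:
# verify_md5("openscene_metadata_mini.tgz", "<checksum from the download page>")
```

If `verify_md5` returns False, the download is likely incomplete or corrupted and should be repeated.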

Train an Occupancy Prediction Model

Baseline

We provide a baseline model based on OccNet. The baseline is currently compatible with OpenScene dataset v1.0.

Train and Test

Train the model with 4 RTX 3090 GPUs

./tools/dist_train.sh ./projects/configs/bevformer/bev_tiny_occ_r50_nuplan.py 4

Evaluate the model with 4 RTX 3090 GPUs

./tools/dist_test.sh ./projects/configs/bevformer/bev_tiny_occ_r50_nuplan.py ./path/to/ckpts.pth 4

Visualization

See openscene_scenario_visualization.py