COME: Adding Scene-Centric Forecasting Control to Occupancy World Model
June 19, 2025
Demo Videos
Comparison of the ground truth, DOME generation with the official checkpoint, and COME. The task is to take a 4-frame 3D-Occ sequence as input and predict the next 6 frames (a 3-s prediction).
https://github.com/user-attachments/assets/f95890fb-ab5a-4f26-b9ec-3b5e44f45a99
Comparison of the ground truth, DOME generation with the reproduced checkpoint, and COME. The task is to take a 4-frame 3D-Occ sequence as input and predict the next 16 frames (an 8-s prediction).
https://github.com/user-attachments/assets/4d1ec897-578c-469a-a1dd-a9b74f7eb3cf
COME generation with BEV layouts. The task is to take a 2-frame 3D-Occ sequence and 8-frame BEV layouts as input and predict the next 6 frames (a 3-s prediction).
https://github.com/user-attachments/assets/e511e5df-71ad-42df-beab-c2725d0aad92
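The frame counts and durations quoted in the demo settings imply a frame rate of 2 Hz (6 frames in 3 s, 16 frames in 8 s). A minimal sketch of that arithmetic follows; `FRAME_RATE_HZ` and `horizon_seconds` are illustrative names, not identifiers from the codebase.

```python
# Frame rate implied by the demo settings (6 frames = 3 s, 16 frames = 8 s).
# This constant is an assumption derived from those numbers.
FRAME_RATE_HZ = 2.0


def horizon_seconds(num_frames: int, rate_hz: float = FRAME_RATE_HZ) -> float:
    """Convert a number of predicted frames into a horizon in seconds."""
    if rate_hz <= 0:
        raise ValueError("rate_hz must be positive")
    return num_frames / rate_hz


# (input 3D-Occ frames, predicted frames) for the three demo settings.
DEMO_SETTINGS = [(4, 6), (4, 16), (2, 6)]

for n_in, n_out in DEMO_SETTINGS:
    print(f"{n_in} input frames -> {n_out} predicted frames "
          f"= {horizon_seconds(n_out)} s")
```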
Overview
COME = Forecasting Guided Generation

Results

Setup
environment setup
conda env create --file environment.yml
pip install einops tabulate
cd occforecasting
python setup.py develop
cd ..
data preparation
- Create a soft link from data/nuscenes to your_nuscenes_path.
- Prepare the gts semantic occupancy introduced in Occ3d.
- Download the generated train/val pickle files from OccWorld or DOME.
- Prepare the train/val pickle files for scene-centric forecasting.
python -m occforecasting.datasets.nusc_occ3d_dataset
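The soft-link step in the list above can also be scripted. The following is a minimal sketch, assuming the link should live at `data/nuscenes` under the project root; `link_nuscenes` is a hypothetical helper, not part of this repository.

```python
from pathlib import Path


def link_nuscenes(nuscenes_path: str, project_root: str = ".") -> Path:
    """Create <project_root>/data/nuscenes as a symlink to nuscenes_path."""
    target = Path(nuscenes_path).expanduser().resolve()
    if not target.is_dir():
        raise FileNotFoundError(f"nuScenes directory not found: {target}")

    link = Path(project_root) / "data" / "nuscenes"
    link.parent.mkdir(parents=True, exist_ok=True)

    # Leave an existing link or directory untouched instead of overwriting it.
    if link.is_symlink() or link.exists():
        return link

    link.symlink_to(target, target_is_directory=True)
    return link
```

Calling `link_nuscenes("your_nuscenes_path")` from the project root has the same effect as `ln -s your_nuscenes_path data/nuscenes`, except that it refuses to replace an existing entry.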
The dataset should be organized as follows:
.
└── data/
    ├── nuscenes  # downloaded from www.nuscenes.org/
    │   ├── lidarseg
    │   ├── maps
    │   ├── samples
    │   ├── sweeps
    │   ├── v1.0-trainval
    │   └── gts  # download from Occ3d
    ├── nuscenes_infos_train_temporal_v3_scene.pkl
    ├── nuscenes_infos_val_temporal_v3_scene.pkl
    ├── nuscenes_train_occ3d_infos.pkl
    └── nuscenes_val_occ3d_infos.pkl
The four pickle files can also be downloaded from infos.
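Before training, it can help to confirm that the layout matches the tree above. The following is a small sketch, assuming the four pickle files sit directly under `data/` and `gts` sits under `data/nuscenes`, as the tree suggests; `EXPECTED_ENTRIES` and `missing_entries` are illustrative names.

```python
from pathlib import Path
from typing import List

# Entries copied from the dataset layout, relative to the data/ directory.
EXPECTED_ENTRIES = [
    "nuscenes/lidarseg",
    "nuscenes/maps",
    "nuscenes/samples",
    "nuscenes/sweeps",
    "nuscenes/v1.0-trainval",
    "nuscenes/gts",
    "nuscenes_infos_train_temporal_v3_scene.pkl",
    "nuscenes_infos_val_temporal_v3_scene.pkl",
    "nuscenes_train_occ3d_infos.pkl",
    "nuscenes_val_occ3d_infos.pkl",
]


def missing_entries(data_root: str = "data") -> List[str]:
    """Return the expected entries that do not exist under data_root."""
    root = Path(data_root)
    return [rel for rel in EXPECTED_ENTRIES if not (root / rel).exists()]


if __name__ == "__main__":
    missing = missing_entries()
    if missing:
        print("Missing under data/:")
        for rel in missing:
            print(f"  {rel}")
    else:
        print("All expected dataset entries are present.")
```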
optional inputs
For testing under different conditions, more inputs are needed.
- Motion planning results with yaw angles from BEVPlanner. Please put the json file in the project root directory. We simply added a yaw regression branch to the BEV-Planner project; thanks for their great work.
- BEV layouts for the training and validation sets with 2Hz labels. Please unzip the files and put them in './data/step2'. The pre-processing script is from UniScene; thanks for their great work.
- 3D occupancy prediction results from BEVDet and EFFOcc. Please unzip the files and put them in './data/occpreds'. Thanks for their open-source checkpoints.
- AE evaluation protocol from UniScene. Please download AE_checkpoint as needed and put it in './ckpts/'.
Model Zoo
We recommend downloading the checkpoints together with their folders and placing them under './work_dir'.
| Task Setting | Inputs | Method | Config | Checkpoint |
|---|---|---|---|---|
| Input-4frame-Output-6frame | 3DOcc + GT Traj | Stage1-COME-World Model | Config | CKPT |
| Input-4frame-Output-6frame | 3DOcc + GT Traj | Stage2-COME-Scene-Centric-Forecasting | Config | CKPT |
| Input-4frame-Output-6frame | 3DOcc + GT Traj | Stage3-COME-ControlNet | Config | CKPT |
| Input-4frame-Output-6frame | 3DOcc + Pred Traj | Stage3-COME-ControlNet | Config | Same As Above |
| Input-4frame-Output-6frame | BEVDet + Pred Traj | Stage3-COME-ControlNet | Config | Same As Above |
| Input-4frame-Output-6frame | BEVDet + GT Traj | Stage3-COME-ControlNet | Config | Same As Above |
| Input-4frame-Output-6frame | EFFOcc + Pred Traj | Stage3-COME-ControlNet | Config | Same As Above |
| Input-4frame-Output-6frame | EFFOcc + GT Traj | Stage3-COME-ControlNet | Config | Same As Above |
| Input-4frame-Output-16frame | 3DOcc + GT Traj | Stage1-COME-World Model | Config | CKPT |
| Input-4frame-Output-16frame | 3DOcc + GT Traj | Stage2-COME-Scene-Centric-Forecasting | Config | CKPT |
| Input-4frame-Output-16frame | 3DOcc + GT Traj | Stage3-COME-ControlNet | Config | CKPT |
| Input-2frame-Output-6frame | 3DOcc + GT Traj + BEV Layouts | Stage1-COME-World Model | Config | CKPT |
| Input-2frame-Output-6frame | 3DOcc + GT Traj + BEV Layouts | Stage2-COME-Scene-Centric-Forecasting | Config | CKPT |
| Input-2frame-Output-6frame | 3DOcc + GT Traj + BEV Layouts | Stage3-COME-ControlNet | Config | CKPT |
| Input-4frame-Output-6frame | 3DOcc + GT Traj | Stage1-COME-Small-World Model | Config | CKPT |
| Input-4frame-Output-6frame | 3DOcc + GT Traj | Stage2-COME-Scene-Centric-Forecasting | Config | Same As Above |
| Input-4frame-Output-6frame | 3DOcc + GT Traj | Stage3-COME-Small-ControlNet | Config | CKPT |
Run the code
OCC-VAE
By default, we use the VAE checkpoint provided by DOME; thanks for their great work.
# train
python tools/train_vae.py --py-config ./configs/train_occvae.py --work-dir ./work_dir/occ_vae
# eval
python tools/train_vae.py --py-config ./configs/train_occvae.py --work-dir ./work_dir/occ_vae --resume-from ckpts/occvae_latest.pth
# visualize
python tools/visualize_demo_vae.py \
--py-config ./configs/train_occvae.py \
--work-dir ./work_dir/occ_vae \
--resume-from ckpts/occvae_latest.pth \
--export_pcd \
--skip_gt
Scene-Centric Forecasting
cd occforecasting
# train
bash train.sh occforecasting/configs/unet/unet_aligned_past2s_future_3s.py
# eval
bash test.sh occforecasting/configs/unet/unet_aligned_past2s_future_3s.py
COME World Model
# train
bash tools/train_diffusion.sh --py-config ./configs/train_dome_v2.py --work-dir ./work_dir/dome_v2
# eval
python tools/eval_metric.py --py-config ./configs/train_dome_v2.py --work-dir ./work_dir/dome_v2 --resume-from ./work_dir/dome_v2/best_miou.pth --vae-resume-from ckpts/occvae_latest.pth
# visualize
python tools/visualize_demo.py --py-config ./configs/train_dome_v2.py --work-dir ./work_dir/dome_v2 --resume-from ./work_dir/dome_v2/best_miou.pth --vae-resume-from ckpts/occvae_latest.pth
COME ControlNet
# train
python tools/train_diffusion_control_ddp.py --py-config configs/train_dome_controlnet_mask_invisible_v2.py --work-dir work_dir/train_dome_controlnet_mask_invisible_v2
# eval
python tools/test_diffusion_control.py --py-config configs/train_dome_controlnet_mask_invisible_v2.py --work-dir work_dir/train_dome_controlnet_mask_invisible_v2
# visualize
python tools/visualize_demo_control_mask_invisible.py --py-config configs/train_dome_controlnet_mask_invisible_v2.py --work-dir work_dir/train_dome_controlnet_mask_invisible_v2 --vae-resume-from ckpts/occvae_latest.pth --skip_gt
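The three ControlNet commands above share the same config and work directory, so they can be assembled from one place. The following is a dry-run sketch that only prints the commands; the script paths and flags are copied from the commands above, while `controlnet_commands` and the constant names are hypothetical.

```python
import shlex
from typing import Dict, List

# Values copied from the ControlNet commands in this section.
CONFIG = "configs/train_dome_controlnet_mask_invisible_v2.py"
WORK_DIR = "work_dir/train_dome_controlnet_mask_invisible_v2"
VAE_CKPT = "ckpts/occvae_latest.pth"


def controlnet_commands(
    config: str = CONFIG,
    work_dir: str = WORK_DIR,
    vae_ckpt: str = VAE_CKPT,
) -> Dict[str, List[str]]:
    """Build the train/eval/visualize argument lists for the ControlNet stage."""
    shared = ["--py-config", config, "--work-dir", work_dir]
    return {
        "train": ["python", "tools/train_diffusion_control_ddp.py", *shared],
        "eval": ["python", "tools/test_diffusion_control.py", *shared],
        "visualize": [
            "python", "tools/visualize_demo_control_mask_invisible.py",
            *shared, "--vae-resume-from", vae_ckpt, "--skip_gt",
        ],
    }


for step, argv in controlnet_commands().items():
    # Print only; pass argv to subprocess.run(argv, check=True) to execute.
    print(f"# {step}")
    print(shlex.join(argv))
```

Keeping the arguments as lists avoids shell-quoting problems if a path ever contains spaces.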
Acknowledgements
This project is built on top of DOME and OccWorld. Thanks for the excellent open-source works!