🌍 $I^2$-World: Intra-Inter Tokenization for Efficient Dynamic 4D Scene Forecasting

September 5, 2025

https://github.com/user-attachments/assets/693ba6d3-46ba-4529-ae3f-dc8a3456e4cc

🚀 News

  • [2025-06] $I^2$-World is accepted to ICCV 2025.

🛠️ Environment

Install PyTorch 1.13 + CUDA 11.6

conda create --name ii-world python=3.8
conda activate ii-world
pip install torch==1.13.0+cu116 torchvision==0.14.0+cu116 torchaudio==0.13.0 --extra-index-url https://download.pytorch.org/whl/cu116

Install mmdet3d (v1.0.0rc4) related packages and build this project

pip install mmcv-full==1.7.0 -f https://download.openmmlab.com/mmcv/dist/cu117/torch1.13/index.html
pip install mmdet==2.28.2
pip install mmsegmentation==0.30.0
pip install mmengine
pip install -v -e .

Install other dependencies

pip install -r requirements.txt
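After installation, a quick sanity check can confirm that the pinned versions were actually picked up. The sketch below is not part of the repository; it only reads package metadata with the standard library and compares it against the versions named in the install commands above. The helper names (`base_version`, `installed_version`, `check_environment`) are made up for this example. The comparison ignores local build tags such as `+cu116`, so the installed string is printed in full for a visual check of the CUDA build.

```python
from importlib import metadata

# Versions taken from the install commands in this README.
EXPECTED = {
    "torch": "1.13.0+cu116",
    "torchvision": "0.14.0+cu116",
    "torchaudio": "0.13.0",
    "mmcv-full": "1.7.0",
    "mmdet": "2.28.2",
    "mmsegmentation": "0.30.0",
}


def base_version(version):
    """Drop a local build tag such as '+cu116' from a version string."""
    return version.split("+", 1)[0]


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_environment(expected=EXPECTED):
    """Map each package name to (expected, installed, matches)."""
    report = {}
    for package, want in expected.items():
        have = installed_version(package)
        matches = have is not None and base_version(have) == base_version(want)
        report[package] = (want, have, matches)
    return report


# Print one line per package; nothing is imported, so this is safe to run
# even in an environment where some packages are missing.
for package, (want, have, ok) in check_environment().items():
    status = "ok" if ok else "MISMATCH"
    print(f"{package:<16} expected {want:<14} installed {have!s:<14} {status}")
```

Run it inside the activated `ii-world` environment; any `MISMATCH` line points at a package to reinstall.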

🤗 Model Zoo

We train our models on 8 RTX 4090 GPUs.

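| Method | Dataset | Task | Rec. mIoU (%) | Rec. IoU (%) | Weights |
|---|---|---|---|---|---|
| II-Tokenizer | Occ3D-nus | Rec | 81.1 | 68.1 | Google-drive |
| II-Tokenizer | STCOcc-Res | Rec | 24.8 | 32.2 | - |
| II-Tokenizer | Occ3D-Waymo | Rec | 76.3 | 74.6 | - |
| II-World | Occ3D-nus | Fore | 38.4 | 49.2 | Google-drive |
| II-World | STCOcc-Res | Fore | 18.9 | 28.8 | - |
| II-World | Occ3D-Waymo | Fore | 43.7 | 60.9 | - |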

📦 Prepare Dataset

  1. Download nuScenes from nuScenes

  2. Download Occ3D-nus from Occ3D-nus

  3. (Optional) Download Occ3D-Waymo from Occ3D-Waymo and unzip it to the data/waymo folder. We only use the validation split of Occ3D-Waymo in our experiments.

  4. (Optional) Download STCOcc-Res from STCOcc-Res and unzip it to the data/nuscenes folder.

  5. Download the generated info files from Google Drive and unzip them to the data/nuscenes folder. These *.pkl files can also be generated by running tools/create_data.py.

  6. (Optional) Download the visualization car model from Google Drive.

  7. Organize your folder structure as below:

├── project
├── visualizer/
│   ├── 3d_model.obj/ (optional)
├── ckpts/
│   ├── ii_scene_tokenizer_4f.pth
│   ├── ii_generate_world.pth
├── data/
│   ├── nuscenes/
│   │   ├── samples/
│   │   ├── v1.0-trainval/
│   │   ├── gts/ (Occ3D-nus)
│   │   ├── stc-results/ (prediction from STCOcc) (optional)
│   │   ├── world-nuscenes_infos_train.pkl
│   │   ├── world-nuscenes_infos_val.pkl
│   ├── waymo(optional)/
│   │   ├── validation/
│   │   ├── cam_infos_vali.pkl/
│   │   ├── waymo_infos_val.pkl/
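Before launching a run, it may be worth checking that the layout above is in place. The following sketch is not shipped with the project; it simply tests whether the non-optional nuScenes entries from the tree exist under a given project root. The `REQUIRED` list and the `missing_entries` helper are names invented for this example; extend the list with the `ckpts/` files or the Waymo entries as needed.

```python
from pathlib import Path

# Non-optional nuScenes entries copied from the folder structure above.
REQUIRED = [
    "data/nuscenes/samples",
    "data/nuscenes/v1.0-trainval",
    "data/nuscenes/gts",
    "data/nuscenes/world-nuscenes_infos_train.pkl",
    "data/nuscenes/world-nuscenes_infos_val.pkl",
]


def missing_entries(project_root=".", required=REQUIRED):
    """Return the relative paths from `required` that do not exist under the root."""
    root = Path(project_root)
    return [rel for rel in required if not (root / rel).exists()]


# Check the current working directory, assumed here to be the project root.
missing = missing_entries(".")
if missing:
    print("Missing entries:")
    for rel in missing:
        print(f"  {rel}")
else:
    print("Dataset layout looks complete.")
```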

🎇 Training and Evaluation

Train II-Tokenizer with 8 GPUs:

bash tools/dist_train.sh configs/scene_tokenizer/ii_scene_tokenizer_4f.py 8

Evaluate II-Tokenizer with 6 GPUs:

bash tools/dist_test.sh configs/scene_tokenizer/ii_scene_tokenizer_4f.py TO/CKPTS 6

Important

Before training or evaluating II-World, first evaluate the II-Tokenizer to generate the prediction tokens. By default, the II-Tokenizer saves the prediction tokens to the data/nuscenes/save_dir/token_4f folder.

Also, when evaluating the II-Tokenizer, make sure that 150 is evenly divisible by the number of GPUs you use (for example, 6 GPUs works, while 8 does not).
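Reading the constraint above as "150 must be evenly divisible by the GPU count" (which matches the 6-GPU evaluation commands in this README), the usable GPU counts on a single machine can be listed with a few lines. This is only an illustration; `valid_gpu_counts` is a hypothetical helper, not a function in the repository.

```python
TOTAL = 150  # the value stated in the note above


def valid_gpu_counts(total=TOTAL, max_gpus=8):
    """Return every GPU count from 1 to max_gpus that divides `total` evenly."""
    return [n for n in range(1, max_gpus + 1) if total % n == 0]


print(valid_gpu_counts())  # [1, 2, 3, 5, 6]
```

On an 8-GPU machine this leaves 6 as the largest usable count, which is why the evaluation commands use 6 GPUs while training uses 8.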

You can change the test_data_config in the tokenizer config for different datasets.

To generate prediction tokens for the training set, set the ann_file in test_data_config to world-nuscenes_infos_train.pkl.

Train II-World with 8 GPUs:

bash tools/dist_train.sh configs/world_model/ii_generate_world.py 8

Evaluate II-World with 6 GPUs:

bash tools/dist_test.sh configs/world_model/ii_generate_world.py TO/CKPTS 6

🎥 Visualization

We provide a simple script to visualize high-level control of the world generation (by varying the cmd):

python tools/generate.py configs/world_model/ii_generate_world.py ckpts/ii_generate_world.pth \
--generate_path generate_output --generate_scene_name scene-0564 --generate_frame 12 --task_mode high-level-control

You can also visualize the generated world under fine-grained control (using different transformation matrices):

python tools/generate.py configs/world_model/ii_generate_world.py ckpts/ii_generate_world.pth \
--generate_path generate_output --generate_scene_name scene-0270 --generate_frame 12 --task_mode generate

To visualize the 3D occupancy map, add --save_npz to either command above; the generated 3D occupancy npz files will be saved in the generate_output folder.
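When generating several scenes, it can be convenient to assemble the command lines programmatically. The sketch below only builds the argument list from the flags shown in the two commands above; it does not call into the project. `build_generate_cmd` is a hypothetical helper, and treating `--save_npz` as a value-less switch is an assumption based on the wording above.

```python
import shlex


def build_generate_cmd(
    scene_name,
    task_mode,
    frames=12,
    out_dir="generate_output",
    save_npz=False,
    config="configs/world_model/ii_generate_world.py",
    checkpoint="ckpts/ii_generate_world.pth",
):
    """Build the tools/generate.py argument list for one scene."""
    cmd = [
        "python", "tools/generate.py", config, checkpoint,
        "--generate_path", out_dir,
        "--generate_scene_name", scene_name,
        "--generate_frame", str(frames),
        "--task_mode", task_mode,
    ]
    if save_npz:
        # Assumption: --save_npz is a boolean switch that takes no value.
        cmd.append("--save_npz")
    return cmd


# Print the two example invocations from this README, with npz saving enabled.
for scene, mode in [("scene-0564", "high-level-control"), ("scene-0270", "generate")]:
    print(shlex.join(build_generate_cmd(scene, mode, save_npz=True)))
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` from the project root to actually launch the generation.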

Use the following command to visualize the generated 3D occupancy map:

python tools/vis_occ_3d.py --vis-single-data PATH/TO/GENERATED/3D_OCCUPANCY.npz --vis-path demo_output

More visualization options can be found in the tools/vis_occ_3d.py file.
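Before rendering, it can help to peek inside a generated npz file. The array names stored by the generation script are not documented here, so the sketch below makes no assumption about them and simply lists whatever the archive contains. `summarize_npz` is a name invented for this example; it relies only on NumPy.

```python
import numpy as np


def summarize_npz(path):
    """Return {array_name: (shape, dtype)} for every array stored in an .npz file."""
    summary = {}
    with np.load(path) as archive:
        for name in archive.files:
            array = archive[name]
            summary[name] = (array.shape, str(array.dtype))
    return summary


# Example usage (replace the placeholder with a real file from generate_output):
# for name, (shape, dtype) in summarize_npz("PATH/TO/GENERATED/3D_OCCUPANCY.npz").items():
#     print(name, shape, dtype)
```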

Acknowledgement

Thanks to the following excellent projects: