Enhancing End-to-End Autonomous Driving with Latent World Model (ICLR 2025)

June 29, 2025

Yingyan Li, Lue Fan, Jiawei He, Yuqi Wang, Yuntao Chen, Zhaoxiang Zhang and Tieniu Tan

This paper presents the LAtent World model (LAW), a self-supervised framework that predicts future scene features from the current scene features and the ego trajectory.
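To make the idea concrete, here is a toy sketch of such a self-supervised objective in plain Python. It is not the architecture from the paper: the linear predictor, the 2-D feature size, the weights, and the names `predict_next_latent` and `latent_prediction_loss` are all hypothetical. The only point it illustrates is that the supervision target is the feature vector extracted from the next frame, so no extra labels are needed.

```python
from typing import List

Vector = List[float]


def predict_next_latent(latent: Vector, trajectory: Vector,
                        w_latent: List[Vector], w_traj: List[Vector]) -> Vector:
    """Hypothetical linear predictor: W_latent @ latent + W_traj @ trajectory."""
    predicted = []
    for row_l, row_t in zip(w_latent, w_traj):
        value = sum(w * x for w, x in zip(row_l, latent))
        value += sum(w * a for w, a in zip(row_t, trajectory))
        predicted.append(value)
    return predicted


def latent_prediction_loss(predicted: Vector, observed_future: Vector) -> float:
    """Mean squared error between predicted and observed future features."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed_future)) / len(predicted)


# Toy data: a 2-D scene feature and a single 2-D ego waypoint (dx, dy).
current_latent = [1.0, 0.0]
ego_trajectory = [0.5, 0.0]
w_latent = [[1.0, 0.0], [0.0, 1.0]]    # identity: scene features persist
w_traj = [[-1.0, 0.0], [0.0, -1.0]]    # ego motion shifts the features

predicted = predict_next_latent(current_latent, ego_trajectory, w_latent, w_traj)
future_latent = [0.5, 0.0]             # features extracted from the next frame
loss = latent_prediction_loss(predicted, future_latent)
print(predicted, loss)
```

In a real model the predictor would be a learned network and the loss would be minimized by gradient descent; here the weights are fixed by hand so the prediction matches the future feature exactly.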


🔧 Installation

1. Create a Conda Virtual Environment and Activate It

conda create -n law python=3.8 -y
conda activate law

2. Install PyTorch and torchvision

pip install -r requirements.txt
pip install torch==1.9.1+cu111 torchvision==0.10.1+cu111 torchaudio==0.9.1 -f https://download.pytorch.org/whl/torch_stable.html

3. Install MMCV-Full

pip install mmcv-full==1.4.0

4. Install MMDetection and MMSegmentation

pip install mmdet==2.14.0
pip install mmsegmentation==0.14.1
pip install timm

5. Install MMDetection3D

conda activate law
git clone https://github.com/open-mmlab/mmdetection3d.git
cd /path/to/mmdetection3d
git checkout -f v0.17.1
python setup.py develop

6. Install NuScenes DevKit

pip install nuscenes-devkit==1.1.9
pip install yapf==0.40.1
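Because the steps above pin exact versions, it can help to confirm what actually ended up in the `law` environment. The following is a minimal sketch using the standard-library `importlib.metadata` module (available from Python 3.8). The helper name `find_mismatches` is hypothetical, and the distribution names are assumed to match the names used in the `pip install` commands. MMDetection3D is left out because it is installed from source in step 5.

```python
from importlib import metadata  # standard library from Python 3.8

# Versions pinned in the installation steps above.
PINNED = {
    "torch": "1.9.1+cu111",
    "torchvision": "0.10.1+cu111",
    "torchaudio": "0.9.1",
    "mmcv-full": "1.4.0",
    "mmdet": "2.14.0",
    "mmsegmentation": "0.14.1",
    "nuscenes-devkit": "1.1.9",
    "yapf": "0.40.1",
}


def find_mismatches(pinned):
    """Return {package: (expected, installed_or_None)} for every deviation."""
    mismatches = {}
    for name, expected in pinned.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    for name, (expected, installed) in find_mismatches(PINNED).items():
        print(f"{name}: expected {expected}, found {installed or 'not installed'}")
```

An empty output means every pinned package matches. A reported version that differs only in its local tag (for example a different CUDA suffix) may still work, but that is not something this check can confirm.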

7. Download NuScenes Dataset and Pickle Files

For the pickle files, download the train and val files from VAD.

Organize your dataset as follows:

LAW
├── projects/
├── data/nuscenes
│   ├── can_bus/
│   ├── nuscenes/
│   │   ├── maps/
│   │   ├── samples/
│   │   ├── sweeps/
│   │   ├── v1.0-test/
│   │   ├── v1.0-trainval/
│   │   ├── vad_nuscenes_infos_temporal_train.pkl
│   │   ├── vad_nuscenes_infos_temporal_val.pkl
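A quick way to catch a misplaced folder before training is to check the layout programmatically. The sketch below reads the tree above literally, with `can_bus/` and `nuscenes/` both nested under `data/nuscenes`. That reading, and the helper name `missing_entries`, are assumptions, so adjust `EXPECTED` if your checkout differs.

```python
from pathlib import Path

# Relative paths taken literally from the directory tree above.
EXPECTED = [
    "projects",
    "data/nuscenes/can_bus",
    "data/nuscenes/nuscenes/maps",
    "data/nuscenes/nuscenes/samples",
    "data/nuscenes/nuscenes/sweeps",
    "data/nuscenes/nuscenes/v1.0-test",
    "data/nuscenes/nuscenes/v1.0-trainval",
    "data/nuscenes/nuscenes/vad_nuscenes_infos_temporal_train.pkl",
    "data/nuscenes/nuscenes/vad_nuscenes_infos_temporal_val.pkl",
]


def missing_entries(root, expected=EXPECTED):
    """Return the expected relative paths that do not exist under root."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    # Run from the repository root (the LAW directory).
    missing = missing_entries(".")
    if missing:
        print("Missing entries:")
        for rel in missing:
            print(f"  {rel}")
    else:
        print("Dataset layout looks complete.")
```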

๐Ÿ‹๏ธโ€โ™‚๏ธ Training

./tools/nusc_my_train.sh law/default 8

📊 Testing

./tools/dist_test $CONFIG $CKPT $NUM_GPU

📝 Results

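| Method | L2 (m) 1s | L2 (m) 2s | L2 (m) 3s | L2 (m) Avg. | Collision (%) 1s | Collision (%) 2s | Collision (%) 3s | Collision (%) Avg. | Log and Checkpoints |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| LAW (Perception-Free) | 0.28 | 0.58 | 0.99 | 0.62 | 0.10 | 0.15 | 0.38 | 0.21 | Google Drive |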
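The Avg. columns are numerically consistent with a plain arithmetic mean over the 1 s, 2 s, and 3 s horizons. The README does not state the averaging convention, so treat the snippet below as a consistency check on the reported numbers, not as a description of the evaluation code. The helper name `horizon_average` is hypothetical.

```python
def horizon_average(values):
    """Arithmetic mean over the per-horizon values, rounded to two decimals."""
    return round(sum(values) / len(values), 2)


l2_m = [0.28, 0.58, 0.99]            # L2 error at 1 s, 2 s, 3 s
collision_pct = [0.10, 0.15, 0.38]   # collision rate at 1 s, 2 s, 3 s

print(horizon_average(l2_m))           # 0.62
print(horizon_average(collision_pct))  # 0.21
```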

🚀 Citation

Please consider citing our work as follows if it is helpful.

@misc{li2024enhancing,
      title={Enhancing End-to-End Autonomous Driving with Latent World Model}, 
      author={Yingyan Li and Lue Fan and Jiawei He and Yuqi Wang and Yuntao Chen and Zhaoxiang Zhang and Tieniu Tan},
      year={2024},
      eprint={2406.08481},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

More from Us

If you're interested in world models for autonomous driving, or looking for a world model codebase on NAVSIM, feel free to check out our latest work:

  • WoTE (ICCV 2025): Using BEV world models for online trajectory evaluation in end-to-end autonomous driving.