Learning Multiple Probabilistic Decisions from Latent World Model in Autonomous Driving


This is the official repository of

Learning Multiple Probabilistic Decisions from Latent World Model in Autonomous Driving


🔥 Powered by Hydra, PyTorch Lightning, TensorBoard, and Waymax.

Authors: Lingyu Xiao, Jiang-Jiang Liu, Sen Yang, Xiaofan Li, Xiaoqing Ye, Wankou Yang and Jingdong Wang

📰 News

Jan. 2025

  • 🎉 Accepted by ICRA 2025!

Dec. 2024

  • Released code for training and testing

Oct. 2024

💡 Highlights

  • The first open-source planner verified on Waymax.

  • Integrates JAX-based closed-loop simulation with PyTorch-based model training, with support for running in parallel across GPUs.

  • A full pipeline for getting started with Waymax: data collection, model training, and simulation.

  • Expert-level performance under closed-loop evaluation.
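
To make the closed-loop idea concrete, here is a minimal sketch in plain Python. It is not the repository's code: the 1-D `step` simulator, the `policy` function, and all numeric values are hypothetical stand-ins for Waymax and a trained planner. The only point is that each action changes the state the policy observes next, which is what distinguishes closed-loop evaluation from replaying logged data.

```python
# Toy closed-loop rollout: the policy's own actions determine what it observes next.
# Dynamics, gains, and names are hypothetical stand-ins for Waymax and a trained planner.

def step(position, velocity, accel, dt=0.1):
    """Advance a 1-D point-mass 'vehicle' by one time step (semi-implicit Euler)."""
    velocity = velocity + accel * dt
    position = position + velocity * dt
    return position, velocity

def policy(position, velocity, goal=10.0):
    """A simple PD controller standing in for a learned planner."""
    return 2.0 * (goal - position) - 3.0 * velocity

def rollout(num_steps=200):
    """Run the policy inside the simulator and record the visited positions."""
    position, velocity = 0.0, 0.0
    trajectory = [position]
    for _ in range(num_steps):
        accel = policy(position, velocity)                     # decide from the current state
        position, velocity = step(position, velocity, accel)   # the state reacts to the decision
        trajectory.append(position)
    return trajectory

trajectory = rollout()
print(f"final position: {trajectory[-1]:.2f}")
```

Because the controller sees the consequences of its earlier actions, it can correct its own errors over the rollout, and a poor policy would instead see its errors compound.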

📝 Abstract

The autoregressive world model exhibits robust generalization capabilities in vectorized scene understanding but encounters difficulties in deriving actions due to insufficient uncertainty modeling and self-delusion. In this paper, we explore the feasibility of deriving decisions from an autoregressive world model by addressing these challenges through the formulation of multiple probabilistic hypotheses. We propose LatentDriver, a framework that models the environment's next states and the ego vehicle's possible actions as a mixture distribution, from which a deterministic control signal is then derived. By incorporating mixture modeling, the stochastic nature of decision-making is captured. Additionally, the self-delusion problem is mitigated by providing intermediate actions sampled from a distribution to the world model. Experimental results on the recently released closed-loop benchmark Waymax demonstrate that LatentDriver surpasses state-of-the-art reinforcement learning and imitation learning methods, achieving expert-level performance.
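
One simple way to turn a mixture distribution into a single deterministic control is to pick the component with the largest mixture weight and use its mean. The sketch below illustrates that heuristic and contrasts it with a weighted average of the means. This is an illustrative assumption, not necessarily the rule LatentDriver uses; the `GaussianComponent` type, the (acceleration, steering) action layout, and all numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class GaussianComponent:
    weight: float               # mixture weight (weights are expected to sum to 1)
    mean: Tuple[float, float]   # hypothetical action layout: (acceleration, steering)
    std: Tuple[float, float]    # per-dimension standard deviation (unused by both rules)

def most_likely_action(mixture: List[GaussianComponent]) -> Tuple[float, float]:
    """Mode-picking heuristic: return the mean of the highest-weight component."""
    best = max(mixture, key=lambda c: c.weight)
    return best.mean

def expected_action(mixture: List[GaussianComponent]) -> Tuple[float, float]:
    """Weighted average of the component means (can blend incompatible modes)."""
    total = sum(c.weight for c in mixture)
    return tuple(
        sum(c.weight * c.mean[i] for c in mixture) / total for i in range(2)
    )

# Two hypotheses: steer left (slightly more likely) or steer right.
mixture = [
    GaussianComponent(weight=0.55, mean=(0.5, -0.3), std=(0.1, 0.05)),
    GaussianComponent(weight=0.45, mean=(0.5, 0.3), std=(0.1, 0.05)),
]

print(most_likely_action(mixture))  # commits to one hypothesis
print(expected_action(mixture))     # averages across hypotheses
```

In this toy case the averaged action steers almost straight, which matches neither hypothesis. That is one motivation for keeping multiple hypotheses separate before committing to a control signal.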

🛠️ Quick Start

📊 Main results and weights

The weights can be found here

Performance under reactive agents

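| Model | mAR[95:75] | AR[95:75] | OR | CR | PR |
|---|---|---|---|---|---|
| PlanT | 75.86 | 87.01 | 2.29 | 3.08 | 95.38 |
| EasyChauffeur-PPO | 78.72 | 88.66 | 3.95 | 4.72 | 98.26 |
| LatentDriver (T=2, J=4) | 89.31 | 93.79 | 2.59 | 3.22 | 99.50 |
| LatentDriver (T=2, J=3) | 90.14 | 94.31 | 2.22 | 3.13 | 99.64 |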

Performance under non-reactive agents

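| Model | mAR[95:75] | AR[95:75] | OR | CR | PR |
|---|---|---|---|---|---|
| PlanT | 75.34 | 87.39 | 2.15 | 3.8 | 95.11 |
| EasyChauffeur-PPO | 78.33 | 88.21 | 3.54 | 4.82 | 97.77 |
| LatentDriver (T=2, J=4) | 89.63 | 94.82 | 2.58 | 2.31 | 99.55 |
| LatentDriver (T=2, J=3) | 90.38 | 95.54 | 2.2 | 2.03 | 99.68 |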
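
As a quick reading aid, the gap between the best LatentDriver configuration and the strongest baseline can be computed directly from the mAR[95:75] values reported in the two tables above. The dictionary layout and the helper name `gain_over_best_baseline` are just conveniences for this sketch.

```python
# mAR[95:75] values copied from the reactive and non-reactive tables above.
mar = {
    "reactive": {
        "PlanT": 75.86,
        "EasyChauffeur-PPO": 78.72,
        "LatentDriver (T=2, J=3)": 90.14,
    },
    "non-reactive": {
        "PlanT": 75.34,
        "EasyChauffeur-PPO": 78.33,
        "LatentDriver (T=2, J=3)": 90.38,
    },
}

def gain_over_best_baseline(scores, ours="LatentDriver (T=2, J=3)"):
    """Difference in points between `ours` and the best other entry."""
    best_baseline = max(value for name, value in scores.items() if name != ours)
    return round(scores[ours] - best_baseline, 2)

gains = {setting: gain_over_best_baseline(scores) for setting, scores in mar.items()}
print(gains)
```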

📈 Additional ablation studies

T and J are chosen empirically based on the experiment below.

🤔 Troubleshooting

Most problems are likely caused by the JAX environment. Here are some common problems and their solutions.

Q1: The process is unexpectedly killed when running preprocess_data.sh.
A1: Reduce the batch size in preprocess_data.sh to avoid running out of memory.

Q2: "Jax [WARNING] No GPU/TPU found" appears when running preprocess_data.sh.
A2: This is normal: data preprocessing does not need a GPU. You can run python tools/quick_check.py to check whether your simulation environment is set up correctly.
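
The repository's own check is tools/quick_check.py. As a complementary, generic check, the sketch below lists the devices JAX can see using `jax.devices()`. The helper name `list_jax_devices` is an assumption made for this example and is not part of the repository.

```python
def list_jax_devices():
    """Return labels such as 'cpu:0' or 'gpu:0' for every device JAX can see.

    Returns an empty list if JAX is not installed in this environment.
    """
    try:
        import jax
    except ImportError:
        return []
    return [f"{device.platform}:{device.id}" for device in jax.devices()]

devices = list_jax_devices()
print(devices or "JAX is not installed")
```

If only CPU devices are listed on a machine that has a GPU, the installed JAX build probably lacks GPU support. That is fine for preprocessing, but simulation is likely to benefit from a GPU-enabled build.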

Q3: During simulation, is it normal for memory usage to be high while power usage is low?
A3: Yes, this is normal. It still needs to be optimized on the JAX side.
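
One possible contributor to the high memory reading is that JAX preallocates a large share of GPU memory by default (75% according to the JAX documentation), regardless of how much the computation needs. The sketch below sets the documented `XLA_PYTHON_CLIENT_PREALLOCATE` and `XLA_PYTHON_CLIENT_MEM_FRACTION` environment variables. The helper name `configure_jax_memory` and the 0.5 fraction are assumptions, and whether this helps with this repository's simulation has not been verified here.

```python
import os

def configure_jax_memory(preallocate=False, mem_fraction=None):
    """Set JAX GPU allocator environment variables.

    Must be called before the first `import jax` in the process,
    because JAX reads these variables when it initializes its backend.
    """
    os.environ["XLA_PYTHON_CLIENT_PREALLOCATE"] = "true" if preallocate else "false"
    if mem_fraction is not None:
        os.environ["XLA_PYTHON_CLIENT_MEM_FRACTION"] = f"{mem_fraction:.2f}"
    keys = ("XLA_PYTHON_CLIENT_PREALLOCATE", "XLA_PYTHON_CLIENT_MEM_FRACTION")
    return {key: os.environ[key] for key in keys if key in os.environ}

# Hypothetical choice: disable preallocation and cap JAX at half of GPU memory,
# leaving room for a PyTorch model that shares the same device.
settings = configure_jax_memory(preallocate=False, mem_fraction=0.5)
print(settings)
```

Disabling preallocation can reduce the reported memory footprint at the cost of more allocator fragmentation, so it is worth measuring both settings.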

TODOs

  • Training code & data for EasyChauffeur-PPO.
  • Collecting and loading specific scenario for simulation.
  • Training code for LatentDriver and PlanT.
  • Weights for LatentDriver, PlanT and EasyChauffeur-PPO.
  • Data collecting code.
  • Code for identifying WOMD scene's types.

Citation

If you find our work useful, please consider citing it and giving us a 🌟:

@INPROCEEDINGS{11127996,
  author={Xiao, Lingyu and Liu, Jiang-Jiang and Yang, Sen and Li, Xiaofan and Ye, Xiaoqing and Yang, Wankou and Wang, Jingdong},
  booktitle={2025 IEEE International Conference on Robotics and Automation (ICRA)}, 
  title={Learning Multiple Probabilistic Decisions from Latent World Model in Autonomous Driving}, 
  year={2025},
  volume={},
  number={},
  pages={1279-1285},
  keywords={Uncertainty;Imitation learning;Refining;Decision making;Stochastic processes;Reinforcement learning;Benchmark testing;Predictive models;Probabilistic logic;Robotics and automation},
  doi={10.1109/ICRA55743.2025.11127996}}

@article{xiao2024learning,
  title={Learning Multiple Probabilistic Decisions from Latent World Model in Autonomous Driving},
  author={Xiao, Lingyu and Liu, Jiang-Jiang and Yang, Sen and Li, Xiaofan and Ye, Xiaoqing and Yang, Wankou and Wang, Jingdong},
  journal={arXiv preprint arXiv:2409.15730},
  year={2024}
}
@article{xiao2024easychauffeur,
  title={EasyChauffeur: A Baseline Advancing Simplicity and Efficiency on Waymax},
  author={Xiao, Lingyu and Liu, Jiang-Jiang and Ye, Xiaoqing and Yang, Wankou and Wang, Jingdong},
  journal={arXiv preprint arXiv:2408.16375},
  year={2024}
}

Acknowledgement