
April 17, 2026

RAD & RAD-2: Reinforcement Learning for Autonomous Driving


RAD-2: Scaling Reinforcement Learning in a Generator-Discriminator Framework

Hao Gao1, Shaoyu Chen2,†, Yifan Zhu2, Yuehao Song1, Wenyu Liu1, Qian Zhang2, Xinggang Wang1,📧

1 Huazhong University of Science and Technology,  2 Horizon Robotics
† Project lead  📧 Corresponding author




RAD: Training an End-to-End Driving Policy via Large-Scale 3DGS-based Reinforcement Learning

Hao Gao1, Shaoyu Chen1,2,†, Bo Jiang1, Bencheng Liao1, Yiang Shi1, Xiaoyang Guo2, Yuechuan Pu2, Haoran Yin2, Xiangyu Li2, Xinbang Zhang2, Ying Zhang2, Wenyu Liu1, Qian Zhang2, Xinggang Wang1,📧

1 Huazhong University of Science and Technology,  2 Horizon Robotics
† Project lead  📧 Corresponding author



📰 News

  • [2026.04.17] We have released the RAD-2 paper on arXiv. More details are available on our project homepage.

  • [2025.11.04] Good news! 🎉 The ReconDreamer-RL team has now open-sourced their reconstructed 3DGS environments based on nuScenes. You can find the release here: 👉 ReconDreamer-RL Environments

  • [2025.09.28] We have released core code for RL training.

  • [2025.09.18] RAD has been accepted by NeurIPS 2025! 🎉🎉🎉

  • [2025.02.18] We released our paper on arXiv. Code is coming soon. Please stay tuned! ☕️

📌 RAD Training Discussion & Reference

We have created a central discussion issue for RAD training details. You can view and join the discussion here: RAD Training Details Issue. We hope the experiences and tips shared there also prove useful for your own RL training, not just for RAD.

🎯 How to Use

  • Project Structure
.
├── data/                        # Action anchors for planning/control
├── compute_advantage.py         # Script for computing RL advantages and evaluation metrics
├── generate_action_anchor.py    # Script for generating action anchors for planning/control
├── planning_head.py             # Planning head module
├── utils.py                     # Utility functions for training and evaluation
└── README.md
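As a rough illustration of what an anchor-based planning head computes (a hypothetical sketch only; `PlanningHeadSketch` and its layer sizes are made up here, and the real planning_head.py may differ substantially): it maps a scene feature vector to a categorical distribution over a fixed set of discrete action anchors, from which the executed action is selected.

```python
import numpy as np

class PlanningHeadSketch:
    """Toy anchor-scoring head: scene features -> anchor probabilities."""

    def __init__(self, feat_dim, num_anchors, seed=0):
        rng = np.random.default_rng(seed)
        # A single linear layer stands in for the real network.
        self.w = rng.normal(0.0, 0.02, size=(feat_dim, num_anchors))
        self.b = np.zeros(num_anchors)

    def forward(self, feat):
        logits = feat @ self.w + self.b
        # Numerically stable softmax over the anchor set.
        z = logits - logits.max()
        probs = np.exp(z) / np.exp(z).sum()
        return probs

head = PlanningHeadSketch(feat_dim=8, num_anchors=4)
probs = head.forward(np.ones(8))
print(probs.sum())  # probabilities over anchors sum to 1
```

In an RL setting, these anchor probabilities define the policy distribution whose log-probabilities are reweighted by the computed advantages during training.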
  • Run Key Scripts
# You can quickly test the core functionality by running the provided scripts.
# Generate action anchors
python generate_action_anchor.py

# Run the planning head module
python planning_head.py

# Compute advantage metrics
python compute_advantage.py
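One common way to build such an anchor vocabulary (a hypothetical sketch, not the actual logic of generate_action_anchor.py): cluster expert trajectory endpoints with k-means so the cluster centers become a small, discrete set of candidate actions for the planning head.

```python
import numpy as np

def kmeans_anchors(points, k, iters=20, seed=0):
    """Plain k-means; returns k cluster centers to use as action anchors."""
    rng = np.random.default_rng(seed)
    # Initialize centers from randomly chosen data points.
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        # Assign each point to its nearest center.
        d = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(1)
        # Move each center to the mean of its assigned points.
        for j in range(k):
            if (labels == j).any():
                centers[j] = points[labels == j].mean(0)
    return centers

rng = np.random.default_rng(1)
# Toy "trajectory endpoints": two well-separated blobs in (x, y).
pts = np.concatenate([rng.normal(0.0, 0.1, (50, 2)),
                      rng.normal(5.0, 0.1, (50, 2))])
anchors = kmeans_anchors(pts, k=2)
print(anchors.shape)  # (2, 2): two anchors in 2-D
```

In practice the clustered quantity would be full trajectories or control commands rather than 2-D points, but the idea of discretizing the action space into representative anchors is the same.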
  • Using Your Own Data

To integrate this project into your pipeline and use your own data, follow these steps:

  1. Replace the Planning Head
    Use planning_head.py to replace the head of your end-to-end algorithm.

  2. Prepare the Closed-Loop Environment
    Set up your closed-loop environment and collect closed-loop data.

  3. Compute Advantages and Train the Model
    Use compute_advantage.py to calculate advantage values from the collected data, and then use them for model training.
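The advantage step above can be sketched as follows (a minimal, hypothetical example; compute_advantage.py may use a different estimator such as GAE or a learned critic): compute discounted rewards-to-go over a closed-loop rollout, then subtract the mean as a simple baseline so advantages are centered.

```python
import numpy as np

def discounted_returns(rewards, gamma=0.99):
    """Backward pass computing G_t = r_t + gamma * G_{t+1}."""
    ret = np.zeros(len(rewards))
    running = 0.0
    for t in range(len(rewards) - 1, -1, -1):
        running = rewards[t] + gamma * running
        ret[t] = running
    return ret

def advantages(rewards, gamma=0.99):
    g = discounted_returns(np.asarray(rewards, dtype=float), gamma)
    # Mean baseline; a value function could replace this for lower variance.
    return g - g.mean()

# Toy rollout: reward only at the final step (e.g. reaching the goal).
adv = advantages([0.0, 0.0, 1.0], gamma=0.9)
print(adv)  # centered advantages, increasing toward the rewarded step
```

The resulting per-step advantages would then weight the policy-gradient loss on the planning head's action distribution during training.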

📚 Citation

If you find RAD useful in your research or applications, please consider giving us a star 🌟 and citing it with the following BibTeX entries.

@article{RAD,
  title={RAD: Training an End-to-End Driving Policy via Large-Scale 3DGS-based Reinforcement Learning},
  author={Gao, Hao and Chen, Shaoyu and Jiang, Bo and Liao, Bencheng and Shi, Yiang and Guo, Xiaoyang and Pu, Yuechuan and Yin, Haoran and Li, Xiangyu and Zhang, Xinbang and Zhang, Ying and Liu, Wenyu and Zhang, Qian and Wang, Xinggang},
  journal={arXiv preprint arXiv:2502.13144},
  year={2025}
}

@article{gao2026rad2,
  title={RAD-2: Scaling Reinforcement Learning in a Generator-Discriminator Framework},
  author={Gao, Hao and Chen, Shaoyu and Zhu, Yifan and Song, Yuehao and Liu, Wenyu and Zhang, Qian and Wang, Xinggang},
  journal={arXiv preprint arXiv:2604.15308},
  year={2026}
}