
April 17, 2026

RAD & RAD-2: Reinforcement Learning for Autonomous Driving


RAD-2: Scaling Reinforcement Learning in a Generator-Discriminator Framework

Hao Gao1, Shaoyu Chen2,†, Yifan Zhu2, Yuehao Song1, Wenyu Liu1, Qian Zhang2, Xinggang Wang1,📧

1 Huazhong University of Science and Technology,  2 Horizon Robotics
† Project lead  📧 Corresponding author




RAD: Training an End-to-End Driving Policy via Large-Scale 3DGS-based Reinforcement Learning

Hao Gao1, Shaoyu Chen1,2,†, Bo Jiang1, Bencheng Liao1, Yiang Shi1, Xiaoyang Guo2, Yuechuan Pu2, Haoran Yin2, Xiangyu Li2, Xinbang Zhang2, Ying Zhang2, Wenyu Liu1, Qian Zhang2, Xinggang Wang1,📧

1 Huazhong University of Science and Technology,  2 Horizon Robotics
† Project lead  📧 Corresponding author



📰 News

  • [2026.04.17] We have released the RAD-2 paper on arXiv. More details are available on our project homepage.

  • [2025.11.04] Good news! 🎉 The ReconDreamer-RL team has now open-sourced their reconstructed 3DGS environments based on nuScenes. You can find the release here: 👉 ReconDreamer-RL Environments

  • [2025.09.28] We have released core code for RL training.

  • [2025.09.18] RAD has been accepted by NeurIPS 2025! 🎉🎉🎉

  • [2025.02.18] We released our paper on arXiv. Code is coming soon. Please stay tuned! ☕️

📌 RAD Training Discussion & Reference

We have created a central discussion issue for RAD training details. You can view and join the discussion here: RAD Training Details Issue. We hope the experiences and tips shared there also prove useful for your own RL training, not just for RAD.

🎯 How to Use

  • Project Structure
.
├── data/                        # Action anchors for planning/control
├── compute_advantage.py         # Script for computing RL advantages and evaluation metrics
├── generate_action_anchor.py    # Script for generating action anchors for planning/control
├── planning_head.py             # Planning head module
├── utils.py                     # Utility functions for training and evaluation
└── README.md
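As a rough illustration of what an anchor-based planning head computes (a hypothetical sketch only; `PlanningHeadSketch` and its layer sizes are made up here, and the real planning_head.py may differ substantially): it maps a scene feature vector to a categorical distribution over a fixed set of discrete action anchors, from which the executed action is selected.

```python
import numpy as np

class PlanningHeadSketch:
    """Toy anchor-scoring head: scene features -> anchor probabilities."""

    def __init__(self, feat_dim, num_anchors, seed=0):
        rng = np.random.default_rng(seed)
        # A single linear layer stands in for the real network.
        self.w = rng.normal(0.0, 0.02, size=(feat_dim, num_anchors))
        self.b = np.zeros(num_anchors)

    def forward(self, feat):
        logits = feat @ self.w + self.b
        # Numerically stable softmax over the anchor set.
        z = logits - logits.max()
        probs = np.exp(z) / np.exp(z).sum()
        return probs

head = PlanningHeadSketch(feat_dim=8, num_anchors=4)
probs = head.forward(np.ones(8))
print(probs.sum())  # probabilities over anchors sum to 1
```

In an RL setting, these anchor probabilities define the policy distribution whose log-probabilities are reweighted by the computed advantages during training.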
  • Run Key Scripts
# You can quickly test the core functionality by running the provided scripts.
# Generate action anchors
python generate_action_anchor.py

# Run the planning head module
python planning_head.py

# Compute advantage metrics
python compute_advantage.py
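One common way to build such an anchor vocabulary (a hypothetical sketch, not the actual logic of generate_action_anchor.py): cluster expert trajectory endpoints with k-means so the cluster centers become a small, discrete set of candidate actions for the planning head.

```python
import numpy as np

def kmeans_anchors(points, k, iters=20, seed=0):
    """Plain k-means; returns k cluster centers to use as action anchors."""
    rng = np.random.default_rng(seed)
    # Initialize centers from randomly chosen data points.
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        # Assign each point to its nearest center.
        d = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(1)
        # Move each center to the mean of its assigned points.
        for j in range(k):
            if (labels == j).any():
                centers[j] = points[labels == j].mean(0)
    return centers

rng = np.random.default_rng(1)
# Toy "trajectory endpoints": two well-separated blobs in (x, y).
pts = np.concatenate([rng.normal(0.0, 0.1, (50, 2)),
                      rng.normal(5.0, 0.1, (50, 2))])
anchors = kmeans_anchors(pts, k=2)
print(anchors.shape)  # (2, 2): two anchors in 2-D
```

In practice the clustered quantity would be full trajectories or control commands rather than 2-D points, but the idea of discretizing the action space into representative anchors is the same.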
  • Using Your Own Data

To integrate this project into your pipeline and use your own data, follow these steps:

  1. Replace the Planning Head
    Use planning_head.py to replace the head of your end-to-end algorithm.

  2. Prepare the Closed-Loop Environment
    Set up your closed-loop environment and collect closed-loop data.

  3. Compute Advantages and Train the Model
    Use compute_advantage.py to calculate advantage values from the collected data, and then use them for model training.
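The advantage step above can be sketched as follows (a minimal, hypothetical example; compute_advantage.py may use a different estimator such as GAE or a learned critic): compute discounted rewards-to-go over a closed-loop rollout, then subtract the mean as a simple baseline so advantages are centered.

```python
import numpy as np

def discounted_returns(rewards, gamma=0.99):
    """Backward pass computing G_t = r_t + gamma * G_{t+1}."""
    ret = np.zeros(len(rewards))
    running = 0.0
    for t in range(len(rewards) - 1, -1, -1):
        running = rewards[t] + gamma * running
        ret[t] = running
    return ret

def advantages(rewards, gamma=0.99):
    g = discounted_returns(np.asarray(rewards, dtype=float), gamma)
    # Mean baseline; a value function could replace this for lower variance.
    return g - g.mean()

# Toy rollout: reward only at the final step (e.g. reaching the goal).
adv = advantages([0.0, 0.0, 1.0], gamma=0.9)
print(adv)  # centered advantages, increasing toward the rewarded step
```

The resulting per-step advantages would then weight the policy-gradient loss on the planning head's action distribution during training.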

📚 Citation

If you find RAD useful in your research or applications, please consider giving us a star 🌟 and citing it with the following BibTeX entries.

@article{RAD,
  title={RAD: Training an End-to-End Driving Policy via Large-Scale 3DGS-based Reinforcement Learning},
  author={Gao, Hao and Chen, Shaoyu and Jiang, Bo and Liao, Bencheng and Shi, Yiang and Guo, Xiaoyang and Pu, Yuechuan and Yin, Haoran and Li, Xiangyu and Zhang, Xinbang and Zhang, Ying and Liu, Wenyu and Zhang, Qian and Wang, Xinggang},
  journal={arXiv preprint arXiv:2502.13144},
  year={2025}
}

@article{gao2026rad2,
  title={RAD-2: Scaling Reinforcement Learning in a Generator-Discriminator Framework},
  author={Gao, Hao and Chen, Shaoyu and Zhu, Yifan and Song, Yuehao and Liu, Wenyu and Zhang, Qian and Wang, Xinggang},
  journal={arXiv preprint arXiv:2604.15308},
  year={2026}
}