Awesome Autonomous Driving (2026 Refresh)

May 30, 2026

Maintainers - Daehyun Ji (Samsung Electronics), Vertical AI 2 Team, AI Center Members in Samsung Electronics

I am looking for a maintainer! Let me know (captainzone@gmail.com) if interested.

Contributing

Please feel free to open pull requests to add papers, codebases, datasets, benchmarks, and courses.

Papers
Datasets and Benchmarks
Courses
Books
Videos
Software
Conference and Workshop Channels
Maintenance Notes

Papers

Overall Surveys

Self-Driving Cars: A Survey [Paper]
- Claudine Badue, Rânik Guidolini, Raphael Vivacqua Carneiro, Pedro Azevedo, Vinicius Brito Cardoso, Avelino Forechi, Luan Ferreira Reis Jesus, Rodrigo Ferreira Berriel, Thiago Meireles Paixão, Filipe Mutz, Lucas de Paula Veronese, Thiago Oliveira-Santos, Alberto Ferreira De Souza
Planning and Decision-Making for Autonomous Vehicles [Paper]
- Wilko Schwarting, Javier Alonso-Mora, Daniela Rus
A Survey of Motion Planning and Control Techniques for Self-Driving Urban Vehicles [Paper]
- Brian Paden, Michal Čáp, Sze Zheng Yong, Dmitry Yershov, Emilio Frazzoli
A Survey for Foundation Models in Autonomous Driving [Paper]
- Haoxiang Gao, Zhongruo Wang, Yaqian Li, Kaiwen Long, Ming Yang, Yiqing Shen
A Survey of World Models for Autonomous Driving [Paper]
- Tuo Feng, Wenguan Wang, Yi Yang
Foundation Models in Autonomous Driving: A Survey on Scenario Generation and Scenario Analysis [Paper]
- Yuan Gao, Mattia Piccinini, Yuchen Zhang, et al.

Foundation Models, VLMs, LLMs, and World Models

DriveLM: Driving with Graph Visual Question Answering [Paper] [Code]
- Chonghao Sima, Katrin Renz, Kashyap Chitta, Li Chen, Hanxue Zhang, Chengen Xie, Jens Beißwenger, Ping Luo, Andreas Geiger, Hongyang Li
Planning-Oriented Autonomous Driving [Paper] [Code]
- Yihan Hu, Jiazhi Yang, Li Chen, et al.
TransFuser: Imitation with Transformer-Based Sensor Fusion for Autonomous Driving [Paper] [Code]
- Kashyap Chitta, Aditya Prakash, Bernhard Jaeger, Zehao Yu, Katrin Renz, Andreas Geiger
GAIA-1: A Generative World Model for Autonomous Driving [Paper]
- Wayve
DriveTransformer / DriveGPT-style driving-language works
- This area is moving quickly; keep an eye on VLM- and MLLM-based driving papers from CVPR, ICCV, ECCV, CoRL, and NeurIPS AD workshops.
World Models for Autonomous Driving: An Initial Survey [Paper]
- Yanchen Guan, et al.

Classification / Representation Learning

ImageNet Classification with Deep Convolutional Neural Networks [Paper]
- Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton
Very Deep Convolutional Networks for Large-Scale Image Recognition [Paper]
- Karen Simonyan, Andrew Zisserman
Going Deeper with Convolutions [Paper]
- Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, Andrew Rabinovich
Deep Residual Learning for Image Recognition [Paper]
- Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun
Densely Connected Convolutional Networks [Paper]
- Gao Huang, Zhuang Liu, Laurens van der Maaten, Kilian Q. Weinberger
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale [Paper]
- Alexey Dosovitskiy, et al.
A ConvNet for the 2020s [Paper]
- Zhuang Liu, Hanzi Mao, Chao-Yuan Wu, Christoph Feichtenhofer, Trevor Darrell, Saining Xie

2D Object Detection

Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation [Paper]
- Ross Girshick, Jeff Donahue, Trevor Darrell, Jitendra Malik
Fast R-CNN [Paper]
- Ross Girshick
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks [Paper]
- Shaoqing Ren, Kaiming He, Ross Girshick, Jian Sun
You Only Look Once: Unified, Real-Time Object Detection [Paper]
- Joseph Redmon, Santosh Divvala, Ross Girshick, Ali Farhadi
SSD: Single Shot MultiBox Detector [Paper]
- Wei Liu, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy, Scott Reed, Cheng-Yang Fu, Alexander C. Berg
End-to-End Object Detection with Transformers [Paper]
- Nicolas Carion, et al.
DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection [Paper]
- Hao Zhang, et al.

3D Object Detection and BEV Perception

VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection [Paper]
- Yin Zhou, Oncel Tuzel
PointPillars: Fast Encoders for Object Detection from Point Clouds [Paper]
- Alex H. Lang, Sourabh Vora, Holger Caesar, Lubing Zhou, Jiong Yang, Oscar Beijbom
SECOND: Sparsely Embedded Convolutional Detection [Paper]
- Yan Yan, Yuxing Mao, Bo Li
CenterPoint: Center-based 3D Object Detection and Tracking [Paper]
- Tianwei Yin, Xingyi Zhou, Philipp Krähenbühl
DETR3D: 3D Object Detection from Multi-View Images via 3D-to-2D Queries [Paper]
- Yue Wang, et al.
PETR: Position Embedding Transformation for Multi-View 3D Object Detection [Paper]
- Yingfei Liu, et al.
BEVFormer: Learning Bird's-Eye-View Representation from Multi-Camera Images via Spatiotemporal Transformers [Paper] [Code]
- Zhiqi Li, Wenhai Wang, Hongyang Li, Enze Xie, Chonghao Sima, Tong Lu, Yu Qiao, Jifeng Dai
BEVFusion: Multi-Task Multi-Sensor Fusion with Unified Bird's-Eye View Representation [Paper] [Code]
- Zhijian Liu, Haotian Tang, Alexander Amini, Xinyu Yang, Huizi Mao, Daniela Rus, Song Han
Occupancy and BEV methods from 2023-2025
- See also occupancy and end-to-end sections below, because the field increasingly merges 3D detection, map perception, forecasting, and planning.

Object Tracking

Simple Online and Realtime Tracking [Paper]
- Alex Bewley, et al.
Simple Online and Realtime Tracking with a Deep Association Metric [Paper]
- Nicolai Wojke, Alex Bewley, Dietrich Paulus
AB3DMOT: A Baseline for 3D Multi-Object Tracking and New Evaluation Metrics [Paper] [Code]
- Xinshuo Weng, Jianren Wang, David Held, Kris Kitani
CenterTrack: Tracking Objects as Points [Paper]
- Xingyi Zhou, Vladlen Koltun, Philipp Krähenbühl

Semantic Segmentation

Fully Convolutional Networks for Semantic Segmentation [Paper]
- Jonathan Long, Evan Shelhamer, Trevor Darrell
Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs [Paper]
- Liang-Chieh Chen, George Papandreou, Iasonas Kokkinos, Kevin Murphy, Alan L. Yuille
DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs [Paper]
- Liang-Chieh Chen, George Papandreou, Iasonas Kokkinos, Kevin Murphy, Alan L. Yuille
Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation [Paper]
- Liang-Chieh Chen, Yukun Zhu, George Papandreou, Florian Schroff, Hartwig Adam
Pyramid Scene Parsing Network [Paper]
- Hengshuang Zhao, Jianping Shi, Xiaojuan Qi, Xiaogang Wang, Jiaya Jia
SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers [Paper]
- Enze Xie, et al.
Masked-attention Mask Transformer for Universal Image Segmentation [Paper]
- Bowen Cheng, et al.

Depth Estimation

Unsupervised Monocular Depth Estimation with Left-Right Consistency [Paper] [Code]
- Clément Godard, Oisin Mac Aodha, Gabriel J. Brostow
Digging into Self-Supervised Monocular Depth Estimation [Paper]
- Clément Godard, Oisin Mac Aodha, Michael Firman, Gabriel J. Brostow
PackNet-SfM: 3D Packing for Self-Supervised Monocular Depth Estimation [Paper]
- Vitor Guizilini, et al.
DORN: Deep Ordinal Regression Network for Monocular Depth Estimation [Paper]
- Huan Fu, Mingming Gong, Chaohui Wang, Kayhan Batmanghelich, Dacheng Tao
Depth Anything [Paper] [Code]
- Lihe Yang, et al.
Depth Anything V2 [Paper] [Code]
- Lihe Yang, et al.

Occupancy Prediction and Scene Representation

SurroundOcc: Multi-Camera 3D Occupancy Prediction for Autonomous Driving [Paper] [Code]
- Yi Wei, Linqing Zhao, Wenzhao Zheng, Zheng Zhu, Jie Zhou, Jiwen Lu
Occ3D: A Large-Scale 3D Occupancy Prediction Benchmark for Autonomous Driving [Paper]
- Xiaoyu Tian, et al.
TPVFormer: Tri-Perspective View for Vision-Based 3D Semantic Occupancy Prediction [Paper]
- Yuanhui Huang, et al.
Occupancy Network / occupancy-based scene modeling papers (2023-2025)
- This is now a core subfield linking perception, forecasting, and world modeling.

Localization and Mapping

Visual SLAM Algorithms: A Survey from 2010 to 2016 [Paper]
- Takafumi Taketomi, Hideaki Uchiyama, Sei Ikeda
Past, Present, and Future of Simultaneous Localization and Mapping: Toward the Robust-Perception Age [Paper]
- César Cadena, Luca Carlone, Henry Carrillo, Yasir Latif, Davide Scaramuzza, José Neira, Ian Reid, John J. Leonard
LOAM: Lidar Odometry and Mapping in Real-time [Paper]
- Ji Zhang, Sanjiv Singh
LIO-SAM: Tightly-coupled Lidar Inertial Odometry via Smoothing and Mapping [Paper] [Code]
- Tixiao Shan, Brendan Englot, et al.
ORB-SLAM3: An Accurate Open-Source Library for Visual, Visual-Inertial, and Multi-Map SLAM [Paper] [Code]
- Carlos Campos, Richard Elvira, Juan J. Gómez Rodríguez, José M. M. Montiel, Juan D. Tardós
FAST-LIO2: Fast Direct LiDAR-Inertial Odometry [Paper] [Code]
- Wei Xu, et al.

Visual Odometry

Review of Visual Odometry: Types, Approaches, Challenges, and Applications [Paper]
- Mohammad O. A. Aqel, Mohammad H. Marhaban, M. Iqbal Saripan, Napsiah Bt. Ismail
ORB-SLAM: A Versatile and Accurate Monocular SLAM System [Paper]
- Raúl Mur-Artal, J. M. M. Montiel, Juan D. Tardós
DF-VO: What Should Be Learnt for Visual Odometry? [Paper]
- Huangying Zhan, et al.
DROID-SLAM: Deep Visual SLAM for Monocular, Stereo, and RGB-D Cameras [Paper] [Code]
- Zachary Teed, Jia Deng

Lane Detection and HD Map Learning

Towards End-to-End Lane Detection: An Instance Segmentation Approach [Paper]
- Davy Neven, Bert De Brabandere, Stamatios Georgoulis, Marc Proesmans, Luc Van Gool
Ultra Fast Structure-Aware Deep Lane Detection [Paper]
- Zequn Qin, et al.
Keep your Eyes on the Lane: Real-time Attention-guided Lane Detection (LaneATT) [Paper] [Code]
- Lucas Tabelini, Rodrigo Berriel, et al.
CLRNet: Cross Layer Refinement Network for Lane Detection [Paper] [Code]
- Tu Zheng, et al.
MapTR: Structured Modeling and Learning for Online Vectorized HD Map Construction [Paper] [Code]
- Bencheng Liao, et al.
StreamMapNet: Streaming Mapping Network for Vectorized Online HD Map Construction [Paper]
- Tianyuan Yuan, et al.

Motion Forecasting and Behavior Prediction

VectorNet: Encoding HD Maps and Agent Dynamics from Vectorized Representation [Paper]
- Jiyang Gao, et al.
LaneGCN: Learning Lane Graph Representations for Motion Forecasting [Paper]
- Ming Liang, Bin Yang, Rui Hu, Yun Chen, Renjie Liao, Song Feng, Raquel Urtasun
MTR: Motion Transformer with Global Intention Localization and Local Movement Refinement [Paper]
- Shaoshuai Shi, et al.
Wayformer: Motion Forecasting via Simple and Efficient Attention Networks [Paper]
- Nigamaa Nayakanti, et al.
Scene Transformer: A Unified Architecture for Predicting Multiple Agent Trajectories [Paper]
- Jiquan Ngiam, et al.

Decision Making

Planning and Decision-Making for Autonomous Vehicles [Paper]
- Wilko Schwarting, Javier Alonso-Mora, Daniela Rus
Perception, Planning, Control, and Coordination for Autonomous Vehicles [Paper]
- Scott Drew Pendleton, Hans Andersen, Xinxin Du, Xiaotong Shen, Malika Meghjani, You Hong Eng, Daniela Rus, Marcelo H. Ang Jr.
A Behavioral Planning Framework for Autonomous Driving [Paper]
- Junqing Wei, Jarrod M. Snider, Tianyu Gu, John M. Dolan, Bakhtiar Litkouhi
Towards a Functional System Architecture for Automated Vehicles [Paper]
- Simon Ulbrich, Andreas Reschka, Jens Rieken, Susanne Ernst, Gerrit Bagschik, Frank Dierkes, Marcus Nolte, Markus Maurer

Planning

Optimal Trajectory Generation for Dynamic Street Scenarios in a Frenet Frame [Paper]
- Moritz Werling, Julius Ziegler, Sören Kammel, Sebastian Thrun
Path Planning for Autonomous Vehicles in Unknown Semi-Structured Environments [Paper]
- Dmitri Dolgov, et al.
Trajectory Planning for Bertha — A Local, Continuous Method [Paper]
- Julius Ziegler, Philipp Bender, Thao Dang, Christoph Stiller
Real-Time Motion Planning Methods for Autonomous On-Road Driving: State-of-the-Art and Future Research Directions [Paper]
- Christos Katrakazas, Mohammed Quddus, Wen-Hua Chen, Lipika Deka
A Review of Motion Planning Techniques for Automated Vehicles [Paper]
- David González, Joshué Pérez, Vicente Milanés, Fawzi Nashashibi
Towards Learning-Based Planning: The nuPlan Benchmark for Real-World Autonomous Driving [Paper]
- Napat Karnchanachari, et al.

Control

Stanley: The Robot that Won the DARPA Grand Challenge [Paper]
- Sebastian Thrun, et al.
Automatic Steering Methods for Autonomous Automobile Path Tracking [Paper]
- Jarrod M. Snider
A Survey of Motion Planning and Control Techniques for Self-Driving Urban Vehicles [Paper]
- Brian Paden, Michal Čáp, Sze Zheng Yong, Dmitry Yershov, Emilio Frazzoli

End-to-End Driving

Learning by Cheating [Paper] [Code]
- Dian Chen, Brady Zhou, Vladlen Koltun, Philipp Krähenbühl
TransFuser: Imitation with Transformer-Based Sensor Fusion for Autonomous Driving [Paper] [Code]
- Kashyap Chitta, Aditya Prakash, Bernhard Jaeger, Zehao Yu, Katrin Renz, Andreas Geiger
NEAT: Neural Attention Fields for End-to-End Autonomous Driving [Paper] [Code]
- Kashyap Chitta, Aditya Prakash, Andreas Geiger
TCP: Trajectory-guided Control Prediction for End-to-End Autonomous Driving [Paper] [Code]
- Penghao Wu, Xiaosong Jia, Li Chen, Junchi Yan, Hongyang Li, Yu Qiao
Planning-Oriented Autonomous Driving [Paper] [Code]
- Yihan Hu, Jiazhi Yang, Li Chen, et al.
DriveLM: Driving with Graph Visual Question Answering [Paper] [Code]
- Chonghao Sima, Katrin Renz, Kashyap Chitta, Li Chen, Hanxue Zhang, Chengen Xie, Jens Beißwenger, Ping Luo, Andreas Geiger, Hongyang Li

Reinforcement Learning in Autonomous Driving

Playing for Data: Ground Truth from Computer Games [Paper]
- Stephan R. Richter, Vibhav Vineet, Stefan Roth, Vladlen Koltun
Deep Reinforcement Learning for Autonomous Driving: A Survey [Paper]
- B Ravi Kiran, Ibrahim Sobh, Victor Talpaert, Patrick Mannion, Ahmad A. Al Sallab, Senthil Yogamani, Patrick Pérez
Benchmarking Reinforcement Learning for Autonomous Driving in CARLA
- Search terms: RL + CARLA + CoRL / NeurIPS / ICRA / IV for the latest policy-learning papers.

Datasets and Benchmarks

KITTI Vision Benchmark Suite [Website]
- Classical benchmark for stereo, optical flow, visual odometry, 3D object detection, and tracking.
Cityscapes [Website]
- Urban scene understanding benchmark with fine semantic annotations.
Mapillary Vistas [Website]
- Large-scale, geographically diverse street-scene parsing dataset.
BDD100K: A Diverse Driving Dataset for Heterogeneous Multitask Learning [Paper] [Website]
- Large-scale multi-task driving dataset.
Waymo Open Dataset [Website] [About]
- Large-scale perception, motion, scenario generation, and end-to-end driving benchmark ecosystem.
nuScenes [Website]
- Multi-sensor dataset for detection, tracking, segmentation, prediction, and map-related tasks.
nuPlan [Website] [Paper]
- Closed-loop planning benchmark with simulation and scenario-based evaluation.
Argoverse 2: Next Generation Datasets for Self-Driving Perception and Forecasting [Website] [Paper]
- Comprises the Sensor, Lidar, Motion Forecasting, and Map Change datasets.
Waymo Open Motion Dataset / Waymax ecosystem [Waymax] [Docs]
- Useful for behavior prediction, sim agents, scenario generation, and closed-loop evaluation.
ApolloScape [Website]
- Includes scene parsing, car instance, lane segmentation, self-localization, and trajectory tasks.
SYNTHIA [Website]
- Synthetic dataset for semantic segmentation and related perception tasks.
Oxford RobotCar Dataset [Website]
- Long-term autonomy dataset across weather, season, and lighting changes.
Oxford Radar RobotCar Dataset [Website]
- Adds radar and odometry for robust localization and adverse-condition research.
KAIST Urban Dataset / MulRan-style Korean localization datasets
- Keep Korean-road and Korean-traffic-specific resources in this section where possible.

Courses

CS231n: Convolutional Neural Networks for Visual Recognition [Website]
Self-Driving Cars Specialization (University of Toronto / Coursera) [Website]
Introduction to Self-Driving Cars [Website]
Practical Deep Learning for Coders [Website]
Probabilistic Robotics and SLAM related graduate lectures
- Search with: SLAM / visual localization / motion planning / multi-agent forecasting lecture series.

Books

Deep Learning — Ian Goodfellow, Yoshua Bengio, Aaron Courville [Book]
Probabilistic Robotics — Sebastian Thrun, Wolfram Burgard, Dieter Fox [Book]
Planning Algorithms — Steven M. LaValle [Book]
Principles of Robot Motion: Theory, Algorithms, and Implementations — Howie Choset, et al. [Book]
Computer Vision: Algorithms and Applications — Richard Szeliski [Book]

Videos

Computer Vision Foundation (CVF) Open Access / YouTube [Channel]
ROSCon [Channel]
Autonomous Driving talks from Waymo Research / NVIDIA / Motional / CARLA Summit
Classical deep learning lectures
- Andrew Ng, Geoffrey Hinton, Yann LeCun, Yoshua Bengio

Software

ROS and Autonomous Driving Stacks

ROS 2 Documentation [Website]
ROS Home [Website]
Autoware [Website] [Docs] [GitHub]
- Open-source autonomous driving software stack built on ROS 2.

Frameworks and Toolboxes

PyTorch [Website]
TensorFlow [Website]
JAX [Website]
MMDetection3D [Docs] [GitHub]
OpenPCDet [GitHub]
Detectron2 [GitHub]
Nerfstudio [Website]
- Increasingly useful for 3D scene reconstruction and radiance-field-based research.

Simulation and Evaluation

CARLA Simulator [Website] [Docs] [GitHub]
Waymax [Website] [Docs] [GitHub]
nuPlan Devkit [GitHub]
CommonRoad [Website]
NVIDIA Isaac Sim [Website]

Conference and Workshop Channels

CVPR [Website]
ICCV [Website]
ECCV [Website]
NeurIPS [Website]
ICRA [Website]
IROS [Website]
IEEE Intelligent Vehicles Symposium (IV) [Website]
CoRL (Conference on Robot Learning) [Website]
Workshop keywords to watch
- autonomous driving, embodied AI, world models, behavior prediction, foundation models, simulation, safety validation

Maintenance Notes

Prefer full paper titles over abbreviations when adding new entries.
Prefer official project pages, official code repositories, and arXiv / OpenAccess links.
Mark deprecated toolchains clearly (for example, Torch7, Theano, Caffe2) instead of deleting history.
For 2026+, the most active update zones are:
- foundation models / VLM / LLM driving
- world models and scenario generation
- occupancy and BEV scene representation
- end-to-end driving
- motion forecasting and sim agents
- planning benchmarks and closed-loop evaluation
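The note about marking deprecated toolchains can be checked mechanically. The following is a minimal sketch, not part of the repository's tooling: the function name `find_unmarked_deprecated` and the convention that a marked entry contains the word "deprecated" on the same line are assumptions made for illustration.

```python
import re

# Toolchains named in the maintenance note; extend the tuple as needed.
DEPRECATED_TOOLCHAINS = ("Torch7", "Theano", "Caffe2")


def find_unmarked_deprecated(text, names=DEPRECATED_TOOLCHAINS, marker="deprecated"):
    """Return (line_number, line) pairs that name a deprecated toolchain
    without also containing the marker word (case-insensitive).

    Hypothetical helper: the same-line marker convention is an assumption.
    """
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, names)) + r")\b")
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if pattern.search(line) and marker not in line.lower():
            hits.append((number, line.strip()))
    return hits
```

Running it over the README text lists every entry that still needs a deprecation label, which keeps the history in place while making the status visible.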

Suggested Next Cleanup for This Repository

Add tags such as Classic, Recommended, 2024+, Code, Benchmark, and Survey.
Split the README into papers.md, datasets.md, software.md, and courses.md if it becomes too long.
Add a small section for Korean-road / Korean-traffic-light / Korean-map resources.
Add benchmark tables for 3D detection, forecasting, planning, and end-to-end driving.
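The suggested split into per-topic files could be scripted along these lines. This is a rough sketch under stated assumptions: it assumes the README marks top-level sections with `## ` headings, and the `SECTION_FILES` mapping, `split_sections`, and `write_sections` are hypothetical names introduced here, not existing repository tooling.

```python
from pathlib import Path

# Hypothetical mapping from top-level section titles to output files.
SECTION_FILES = {
    "Papers": "papers.md",
    "Datasets and Benchmarks": "datasets.md",
    "Software": "software.md",
    "Courses": "courses.md",
}


def split_sections(readme_text, section_files=SECTION_FILES, prefix="## "):
    """Group README lines by top-level heading and return {filename: text}.

    Assumes top-level headings start with `prefix`; deeper headings such as
    "### ..." stay inside their parent section. Unmapped sections are skipped.
    """
    outputs = {}
    current = None  # output file for the section being read, if any
    for line in readme_text.splitlines():
        if line.startswith(prefix):
            title = line[len(prefix):].strip()
            current = section_files.get(title)
            if current is not None:
                outputs[current] = []
        if current is not None:
            outputs[current].append(line)
    return {name: "\n".join(lines).rstrip() + "\n" for name, lines in outputs.items()}


def write_sections(readme_path, out_dir="."):
    """Read the README and write one file per mapped section."""
    text = Path(readme_path).read_text(encoding="utf-8")
    for name, body in split_sections(text).items():
        Path(out_dir, name).write_text(body, encoding="utf-8")
```

Sections without a mapping (for example Books or Videos) are left in the README, so the split can be adopted one file at a time.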