Awesome Autonomous Driving (2026 Refresh)
May 30, 2026
Maintainers - Daehyun Ji (Samsung Electronics), Vertical AI 2 Team, AI Center Members in Samsung Electronics
I am looking for a maintainer! Let me know (captainzone@gmail.com) if interested.
Contributing
Please feel free to open pull requests to add papers, codebases, datasets, benchmarks, and courses.
Table of Contents
- Papers
- Overall Surveys
- Foundation Models, VLMs, LLMs, and World Models
- Classification / Representation Learning
- 2D Object Detection
- 3D Object Detection and BEV Perception
- Object Tracking
- Semantic Segmentation
- Depth Estimation
- Occupancy Prediction and Scene Representation
- Localization and Mapping
- Visual Odometry
- Lane Detection and HD Map Learning
- Motion Forecasting and Behavior Prediction
- Decision Making
- Planning
- Control
- End-to-End Driving
- Reinforcement Learning in Autonomous Driving
- Datasets and Benchmarks
- Courses
- Books
- Videos
- Software
- Conference and Workshop Channels
- Maintenance Notes
Papers
Overall Surveys
- Self-Driving Cars: A Survey [Paper]
- Claudine Badue, Rânik Guidolini, Raphael Vivacqua Carneiro, Pedro Azevedo, Vinicius Brito Cardoso, Avelino Forechi, Luan Ferreira Reis Jesus, Rodrigo Ferreira Berriel, Thiago Meireles Paixão, Filipe Mutz, Lucas de Paula Veronese, Thiago Oliveira-Santos, Alberto Ferreira De Souza
- Planning and Decision-Making for Autonomous Vehicles [Paper]
- Wilko Schwarting, Javier Alonso-Mora, Daniela Rus
- A Survey of Motion Planning and Control Techniques for Self-Driving Urban Vehicles [Paper]
- Brian Paden, Michal Čáp, Sze Zheng Yong, Dmitry Yershov, Emilio Frazzoli
- A Survey for Foundation Models in Autonomous Driving [Paper]
- Haoxiang Gao, Zhongruo Wang, Yaqian Li, Kaiwen Long, Ming Yang, Yiqing Shen
- A Survey of World Models for Autonomous Driving [Paper]
- Tuo Feng, Wenguan Wang, Yi Yang
- Foundation Models in Autonomous Driving: A Survey on Scenario Generation and Scenario Analysis [Paper]
- Yuan Gao, Mattia Piccinini, Yuchen Zhang, et al.
Foundation Models, VLMs, LLMs, and World Models
- DriveLM: Driving with Graph Visual Question Answering [Paper] [Code]
- Chonghao Sima, Katrin Renz, Kashyap Chitta, Li Chen, Hanxue Zhang, Chengen Xie, Jens Beißwenger, Ping Luo, Andreas Geiger, Hongyang Li
- Planning-Oriented Autonomous Driving [Paper] [Code]
- Yihan Hu, Jiazhi Yang, Li Chen, et al.
- TransFuser: Imitation with Transformer-Based Sensor Fusion for Autonomous Driving [Paper] [Code]
- Kashyap Chitta, Aditya Prakash, Bernhard Jaeger, Zehao Yu, Katrin Renz, Andreas Geiger
- GAIA-1: A Generative World Model for Autonomous Driving [Paper]
- Wayve
- DriveTransformer / DriveGPT-style driving-language works
- This area is moving quickly; keep an eye on VLM- and MLLM-based driving papers from CVPR, ICCV, ECCV, CoRL, and NeurIPS AD workshops.
- World Models for Autonomous Driving: An Initial Survey [Paper]
- Yanchen Guan, Haicheng Liao, Zhenning Li, et al.
Classification / Representation Learning
- ImageNet Classification with Deep Convolutional Neural Networks [Paper]
- Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton
- Very Deep Convolutional Networks for Large-Scale Image Recognition [Paper]
- Karen Simonyan, Andrew Zisserman
- Going Deeper with Convolutions [Paper]
- Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, Andrew Rabinovich
- Deep Residual Learning for Image Recognition [Paper]
- Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun
- Densely Connected Convolutional Networks [Paper]
- Gao Huang, Zhuang Liu, Laurens van der Maaten, Kilian Q. Weinberger
- An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale [Paper]
- Alexey Dosovitskiy, et al.
- A ConvNet for the 2020s [Paper]
- Zhuang Liu, Hanzi Mao, Chao-Yuan Wu, Christoph Feichtenhofer, Trevor Darrell, Saining Xie
2D Object Detection
- Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation [Paper]
- Ross Girshick, Jeff Donahue, Trevor Darrell, Jitendra Malik
- Fast R-CNN [Paper]
- Ross Girshick
- Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks [Paper]
- Shaoqing Ren, Kaiming He, Ross Girshick, Jian Sun
- You Only Look Once: Unified, Real-Time Object Detection [Paper]
- Joseph Redmon, Santosh Divvala, Ross Girshick, Ali Farhadi
- SSD: Single Shot MultiBox Detector [Paper]
- Wei Liu, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy, Scott Reed, Cheng-Yang Fu, Alexander C. Berg
- End-to-End Object Detection with Transformers [Paper]
- Nicolas Carion, et al.
- DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection [Paper]
- Hao Zhang, et al.
3D Object Detection and BEV Perception
- VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection [Paper]
- Yin Zhou, Oncel Tuzel
- PointPillars: Fast Encoders for Object Detection from Point Clouds [Paper]
- Alex H. Lang, Sourabh Vora, Holger Caesar, Lubing Zhou, Jiong Yang, Oscar Beijbom
- SECOND: Sparsely Embedded Convolutional Detection [Paper]
- Yan Yan, Yuxing Mao, Bo Li
- CenterPoint: Center-based 3D Object Detection and Tracking [Paper]
- Tianwei Yin, Xingyi Zhou, Philipp Krähenbühl
- DETR3D: 3D Object Detection from Multi-View Images via 3D-to-2D Queries [Paper]
- Yue Wang, et al.
- PETR: Position Embedding Transformation for Multi-View 3D Object Detection [Paper]
- Yingfei Liu, Tiancai Wang, Xiangyu Zhang, Jian Sun
- BEVFormer: Learning Bird's-Eye-View Representation from Multi-Camera Images via Spatiotemporal Transformers [Paper] [Code]
- Zhiqi Li, Wenhai Wang, Hongyang Li, Enze Xie, Chonghao Sima, Tong Lu, Yu Qiao, Jifeng Dai
- BEVFusion: Multi-Task Multi-Sensor Fusion with Unified Bird's-Eye View Representation [Paper] [Code]
- Zhijian Liu, Haotian Tang, Alexander Amini, Xinyu Yang, Huizi Mao, Daniela Rus, Song Han
- Occupancy and BEV methods from 2023-2025
- See also occupancy and end-to-end sections below, because the field increasingly merges 3D detection, map perception, forecasting, and planning.
Object Tracking
- Simple Online and Realtime Tracking [Paper]
- Alex Bewley, et al.
- Simple Online and Realtime Tracking with a Deep Association Metric [Paper]
- Nicolai Wojke, Alex Bewley, Dietrich Paulus
- AB3DMOT: A Baseline for 3D Multi-Object Tracking and New Evaluation Metrics [Paper] [Code]
- Xinshuo Weng, Jianren Wang, David Held, Kris Kitani
- CenterTrack: Tracking Objects as Points [Paper]
- Xingyi Zhou, Vladlen Koltun, Philipp Krähenbühl
Semantic Segmentation
- Fully Convolutional Networks for Semantic Segmentation [Paper]
- Jonathan Long, Evan Shelhamer, Trevor Darrell
- Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs [Paper]
- Liang-Chieh Chen, George Papandreou, Iasonas Kokkinos, Kevin Murphy, Alan L. Yuille
- DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs [Paper]
- Liang-Chieh Chen, George Papandreou, Iasonas Kokkinos, Kevin Murphy, Alan L. Yuille
- Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation [Paper]
- Liang-Chieh Chen, Yukun Zhu, George Papandreou, Florian Schroff, Hartwig Adam
- Pyramid Scene Parsing Network [Paper]
- Hengshuang Zhao, Jianping Shi, Xiaojuan Qi, Xiaogang Wang, Jiaya Jia
- SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers [Paper]
- Enze Xie, et al.
- Masked-attention Mask Transformer for Universal Image Segmentation [Paper]
- Bowen Cheng, et al.
Depth Estimation
- Unsupervised Monocular Depth Estimation with Left-Right Consistency [Paper] [Code]
- Clément Godard, Oisin Mac Aodha, Gabriel J. Brostow
- Digging into Self-Supervised Monocular Depth Estimation [Paper]
- Clément Godard, Oisin Mac Aodha, Michael Firman, Gabriel J. Brostow
- PackNet-SfM: 3D Packing for Self-Supervised Monocular Depth Estimation [Paper]
- Vitor Guizilini, et al.
- DORN: Deep Ordinal Regression Network for Monocular Depth Estimation [Paper]
- Huan Fu, Mingming Gong, Chaohui Wang, Kayhan Batmanghelich, Dacheng Tao
- Depth Anything [Paper] [Code]
- Lihe Yang, et al.
- Depth Anything V2 [Paper] [Code]
- Lihe Yang, et al.
Occupancy Prediction and Scene Representation
- SurroundOcc: Multi-Camera 3D Occupancy Prediction for Autonomous Driving [Paper] [Code]
- Yi Wei, Linqing Zhao, Wenzhao Zheng, Zheng Zhu, Jie Zhou, Jiwen Lu
- Occ3D: A Large-Scale 3D Occupancy Prediction Benchmark for Autonomous Driving [Paper]
- Xiaoyu Tian, et al.
- TPVFormer: Tri-Perspective View for Vision-Based 3D Semantic Occupancy Prediction [Paper]
- Yuanhui Huang, et al.
- Occupancy Network / occupancy-based scene modeling papers (2023-2025)
- This is now a core subfield linking perception, forecasting, and world modeling.
Localization and Mapping
- Visual SLAM Algorithms: A Survey from 2010 to 2016 [Paper]
- Takafumi Taketomi, Hideaki Uchiyama, Sei Ikeda
- Past, Present, and Future of Simultaneous Localization and Mapping: Toward the Robust-Perception Age [Paper]
- César Cadena, Luca Carlone, Henry Carrillo, Yasir Latif, Davide Scaramuzza, Jose Neira, Ian Reid, John J. Leonard
- LOAM: Lidar Odometry and Mapping in Real-time [Paper]
- Ji Zhang, Sanjiv Singh
- LIO-SAM: Tightly-coupled Lidar Inertial Odometry via Smoothing and Mapping [Paper] [Code]
- Tixiao Shan, Brendan Englot, et al.
- ORB-SLAM3: An Accurate Open-Source Library for Visual, Visual-Inertial, and Multi-Map SLAM [Paper] [Code]
- Carlos Campos, Richard Elvira, Juan J. Gómez Rodríguez, José M. M. Montiel, Juan D. Tardós
- FAST-LIO2: Fast Direct LiDAR-Inertial Odometry [Paper] [Code]
- Wei Xu, et al.
Visual Odometry
- Review of Visual Odometry: Types, Approaches, Challenges, and Applications [Paper]
- Mohammad O. A. Aqel, Mohammad H. Marhaban, M. Iqbal Saripan, Napsiah Bt. Ismail
- ORB-SLAM: A Versatile and Accurate Monocular SLAM System [Paper]
- Raúl Mur-Artal, J. M. M. Montiel, Juan D. Tardós
- DF-VO: What Should Be Learnt for Visual Odometry? [Paper]
- Huangying Zhan, et al.
- DROID-SLAM: Deep Visual SLAM for Monocular, Stereo, and RGB-D Cameras [Paper] [Code]
- Zachary Teed, Jia Deng
Lane Detection and HD Map Learning
- Towards End-to-End Lane Detection: An Instance Segmentation Approach [Paper]
- Davy Neven, Bert De Brabandere, Stamatios Georgoulis, Marc Proesmans, Luc Van Gool
- Ultra Fast Structure-Aware Deep Lane Detection [Paper]
- Zequn Qin, et al.
- LaneATT: Keep Your Eyes on the Lane: Real-Time Attention-Guided Lane Detection [Paper] [Code]
- Lucas Tabelini, Rodrigo Berriel, et al.
- CLRNet: Cross Layer Refinement Network for Lane Detection [Paper] [Code]
- Tu Zheng, et al.
- MapTR: Structured Modeling and Learning for Online Vectorized HD Map Construction [Paper] [Code]
- Bencheng Liao, et al.
- StreamMapNet: Streaming Mapping Network for Vectorized Online HD Map Construction [Paper]
- Tianyuan Yuan, et al.
Motion Forecasting and Behavior Prediction
- VectorNet: Encoding HD Maps and Agent Dynamics from Vectorized Representation [Paper]
- Jiyang Gao, et al.
- LaneGCN: Learning Lane Graph Representations for Motion Forecasting [Paper]
- Ming Liang, Bin Yang, Rui Hu, Yun Chen, Renjie Liao, Song Feng, Raquel Urtasun
- MTR: Motion Transformer with Global Intention Localization and Local Movement Refinement [Paper]
- Shaoshuai Shi, et al.
- Wayformer: Motion Forecasting via Simple and Efficient Attention Networks [Paper]
- Nigamaa Nayakanti, et al.
- Scene Transformer: A Unified Architecture for Predicting Multiple Agent Trajectories [Paper]
- Jiquan Ngiam, et al.
Decision Making
- Planning and Decision-Making for Autonomous Vehicles [Paper]
- Wilko Schwarting, Javier Alonso-Mora, Daniela Rus
- Perception, Planning, Control, and Coordination for Autonomous Vehicles [Paper]
- Scott Drew Pendleton, Hans Andersen, Xinxin Du, Xiaotong Shen, Malika Meghjani, You Hong Eng, Daniela Rus, Marcelo H. Ang Jr.
- A Behavioral Planning Framework for Autonomous Driving [Paper]
- Junqing Wei, Jarrod M. Snider, Tianyu Gu, John M. Dolan, Bakhtiar Litkouhi
- Towards a Functional System Architecture for Automated Vehicles [Paper]
- Simon Ulbrich, Andreas Reschka, Jens Rieken, Susanne Ernst, Gerrit Bagschik, Frank Dierkes, Marcus Nolte, Markus Maurer
Planning
- Optimal Trajectory Generation for Dynamic Street Scenarios in a Frenet Frame [Paper]
- Moritz Werling, Julius Ziegler, Sören Kammel, Sebastian Thrun
- Path Planning for Autonomous Vehicles in Unknown Semi-Structured Environments [Paper]
- Dmitri Dolgov, et al.
- Trajectory Planning for Bertha — A Local, Continuous Method [Paper]
- Julius Ziegler, Philipp Bender, Thao Dang, Christoph Stiller
- Real-Time Motion Planning Methods for Autonomous On-Road Driving: State-of-the-Art and Future Research Directions [Paper]
- Christos Katrakazas, Mohammed Quddus, Wen-Hua Chen, Lipika Deka
- A Review of Motion Planning Techniques for Automated Vehicles [Paper]
- David González, Joshué Pérez, Vicente Milanés, Fawzi Nashashibi
- Towards Learning-Based Planning: The nuPlan Benchmark for Real-World Autonomous Driving [Paper]
- Napat Karnchanachari, et al.
Control
- Stanley: The Robot that Won the DARPA Grand Challenge [Paper]
- Sebastian Thrun, et al.
- Automatic Steering Methods for Autonomous Automobile Path Tracking [Paper]
- Jarrod M. Snider
- A Survey of Motion Planning and Control Techniques for Self-Driving Urban Vehicles [Paper]
- Brian Paden, Michal Čáp, Sze Zheng Yong, Dmitry Yershov, Emilio Frazzoli
End-to-End Driving
- Learning by Cheating [Paper] [Code]
- Dian Chen, Brady Zhou, Vladlen Koltun, Philipp Krähenbühl
- TransFuser: Imitation with Transformer-Based Sensor Fusion for Autonomous Driving [Paper] [Code]
- Kashyap Chitta, Aditya Prakash, Bernhard Jaeger, Zehao Yu, Katrin Renz, Andreas Geiger
- NEAT: Neural Attention Fields for End-to-End Autonomous Driving [Paper] [Code]
- Kashyap Chitta, Aditya Prakash, Andreas Geiger
- TCP: Trajectory-guided Control Prediction for End-to-End Autonomous Driving [Paper] [Code]
- Penghao Wu, Xiaosong Jia, Li Chen, Junchi Yan, Hongyang Li, Yu Qiao
- Planning-Oriented Autonomous Driving [Paper] [Code]
- Yihan Hu, Jiazhi Yang, Li Chen, et al.
- DriveLM: Driving with Graph Visual Question Answering [Paper] [Code]
- Chonghao Sima, Katrin Renz, Kashyap Chitta, Li Chen, Hanxue Zhang, Chengen Xie, Jens Beißwenger, Ping Luo, Andreas Geiger, Hongyang Li
Reinforcement Learning in Autonomous Driving
- Playing for Data: Ground Truth from Computer Games [Paper]
- Stephan R. Richter, Vibhav Vineet, Stefan Roth, Vladlen Koltun
- Deep Reinforcement Learning for Autonomous Driving: A Survey [Paper]
- B Ravi Kiran, Ibrahim Sobh, Victor Talpaert, Patrick Mannion, Ahmad A. Al Sallab, Senthil Yogamani, Patrick Pérez
- Benchmarking Reinforcement Learning for Autonomous Driving in CARLA
- Search terms: RL + CARLA + CoRL / NeurIPS / ICRA / IV for the latest policy-learning papers.
Datasets and Benchmarks
- KITTI Vision Benchmark Suite [Website]
- Classical benchmark for stereo, optical flow, visual odometry, 3D object detection, and tracking.
- Cityscapes [Website]
- Urban scene understanding benchmark with fine semantic annotations.
- Mapillary Vistas [Website]
- Large-scale, geographically diverse street-scene parsing dataset.
- BDD100K: A Diverse Driving Dataset for Heterogeneous Multitask Learning [Paper] [Website]
- Large-scale multi-task driving dataset.
- Waymo Open Dataset [Website] [About]
- Large-scale perception, motion, scenario generation, and end-to-end driving benchmark ecosystem.
- nuScenes [Website]
- Multi-sensor dataset for detection, tracking, segmentation, prediction, and map-related tasks.
- nuPlan [Website] [Paper]
- Closed-loop planning benchmark with simulation and scenario-based evaluation.
- Argoverse 2: Next Generation Datasets for Self-Driving Perception and Forecasting [Website] [Paper]
- Covers sensor, lidar, motion forecasting, map-change detection, and related tasks.
- Waymo Open Motion Dataset / Waymax ecosystem [Waymax] [Docs]
- Useful for behavior prediction, sim agents, scenario generation, and closed-loop evaluation.
- ApolloScape [Website]
- Includes scene parsing, car instance, lane segmentation, self-localization, and trajectory tasks.
- SYNTHIA [Website]
- Synthetic dataset for semantic segmentation and related perception tasks.
- Oxford RobotCar Dataset [Website]
- Long-term autonomy dataset across weather, season, and lighting changes.
- Oxford Radar RobotCar Dataset [Website]
- Adds radar and odometry for robust localization and adverse-condition research.
- KAIST Urban Dataset / MulRan-style Korean localization datasets
- Keep Korean-road and Korean-traffic-specific resources in this section where possible.
Courses
- CS231n: Convolutional Neural Networks for Visual Recognition [Website]
- Self-Driving Cars Specialization (University of Toronto / Coursera) [Website]
- Introduction to Self-Driving Cars [Website]
- Practical Deep Learning for Coders [Website]
- Probabilistic Robotics and SLAM related graduate lectures
- Search with: SLAM / visual localization / motion planning / multi-agent forecasting lecture series.
Books
- Deep Learning — Ian Goodfellow, Yoshua Bengio, Aaron Courville [Book]
- Probabilistic Robotics — Sebastian Thrun, Wolfram Burgard, Dieter Fox [Book]
- Planning Algorithms — Steven M. LaValle [Book]
- Principles of Robot Motion: Theory, Algorithms, and Implementations — Howie Choset, et al. [Book]
- Computer Vision: Algorithms and Applications — Richard Szeliski [Book]
Videos
- Computer Vision Foundation (CVF) Open Access / YouTube [Channel]
- ROSCon [Channel]
- Autonomous Driving talks from Waymo Research / NVIDIA / Motional / CARLA Summit
- Classical deep learning lectures
- Andrew Ng, Geoffrey Hinton, Yann LeCun, Yoshua Bengio
Software
ROS and Autonomous Driving Stacks
- ROS 2 Documentation [Website]
- ROS Home [Website]
- Autoware [Website] [Docs] [GitHub]
- Open-source autonomous driving software stack built on ROS 2.
Frameworks and Toolboxes
- PyTorch [Website]
- TensorFlow [Website]
- JAX [Website]
- MMDetection3D [Docs] [GitHub]
- OpenPCDet [GitHub]
- Detectron2 [GitHub]
- Nerfstudio [Website]
- Increasingly useful for 3D scene reconstruction and radiance-field-based research.
Simulation and Evaluation
- CARLA Simulator [Website] [Docs] [GitHub]
- Waymax [Website] [Docs] [GitHub]
- nuPlan Devkit [GitHub]
- CommonRoad [Website]
- NVIDIA Isaac Sim [Website]
Conference and Workshop Channels
- CVPR [Website]
- ICCV [Website]
- ECCV [Website]
- NeurIPS [Website]
- ICRA [Website]
- IROS [Website]
- IEEE Intelligent Vehicles Symposium (IV) [Website]
- CoRL (Conference on Robot Learning) [Website]
- Workshop keywords to watch
- autonomous driving, embodied AI, world models, behavior prediction, foundation models, simulation, safety validation
Maintenance Notes
- Prefer full paper titles over abbreviations when adding new entries.
- Prefer official project pages, official code repositories, and arXiv / OpenAccess links.
- Mark deprecated toolchains clearly (for example, Torch7, Theano, Caffe2) instead of deleting history.
- For 2026+, the most active update zones are:
- foundation models / VLM / LLM driving
- world models and scenario generation
- occupancy and BEV scene representation
- end-to-end driving
- motion forecasting and sim agents
- planning benchmarks and closed-loop evaluation
Suggested Next Cleanup for This Repository
- Add tags such as Classic, Recommended, 2024+, Code, Benchmark, and Survey.
- Split the README into papers.md, datasets.md, software.md, and courses.md if it becomes too long.
- Add a small section for Korean-road / Korean-traffic-light / Korean-map resources.
- Add benchmark tables for 3D detection, forecasting, planning, and end-to-end driving.