
March 11, 2026

🏎️ Prune2Drive: A Plug-and-Play Framework for Accelerating Vision-Language Models in Autonomous Driving

Minhao Xiong1*, Zichen Wen1,2*, Zhuangcheng Gu3, Xuyang Liu4,
Rui Zhang2, Hengrui Kang1,2, Jiabing Yang5, Junyuan Zhang6,2,
Weijia Li7,2, Conghui He2, Yafei Wang1, Linfeng Zhang1†,

1Shanghai Jiao Tong University, 2Shanghai AI Laboratory, 3Carnegie Mellon University, 4Sichuan University,
5University of Chinese Academy of Sciences, 6The University of Hong Kong, 7Sun Yat-sen University

*Equal contribution, †Corresponding author


πŸ”₯ News

  • [2026.02] πŸŽ‰ Prune2Drive has been accepted by CVPR 2026!

πŸ‘€ Overview

  • 🌟 We present T-FPS, a lightweight token pruning method inspired by Farthest Point Sampling, designed to preserve semantic and spatial diversity by selecting the most representative visual tokens across multi-view inputs.
  • 🌟 We develop a view-adaptive pruning ratio optimization framework that automatically assigns distinct pruning ratios to each camera view, enabling fine-grained control over the trade-off between perception completeness and computational efficiency.
  • 🌟 We conduct comprehensive experiments on two real-world autonomous driving benchmarks, DriveLM and DriveLMM-o1. With only 10% retained tokens, our method achieves minimal performance degradation while significantly reducing computational cost.
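To illustrate the idea behind T-FPS, here is a minimal farthest-point-sampling sketch over visual-token features. The function name, shapes, and distance metric are illustrative assumptions for exposition, not the paper's actual implementation:

```python
# Hypothetical sketch of farthest-point-sampling-style token selection,
# illustrating the diversity-preserving idea behind T-FPS.
import numpy as np

def fps_select(tokens: np.ndarray, k: int) -> list:
    """Greedily pick k token indices that maximize feature-space coverage."""
    selected = [0]  # start from an arbitrary token
    # distance of every token to its nearest already-selected token
    dist = np.linalg.norm(tokens - tokens[0], axis=1)
    for _ in range(k - 1):
        idx = int(np.argmax(dist))          # farthest token from the current set
        selected.append(idx)
        new_dist = np.linalg.norm(tokens - tokens[idx], axis=1)
        dist = np.minimum(dist, new_dist)   # refresh nearest-selected distances
    return selected

tokens = np.random.rand(196, 64)  # e.g. 196 visual tokens with 64-dim features
kept = fps_select(tokens, 20)     # retain roughly 10% of the tokens
```

Unlike attention-score-based pruning, this greedy selection keeps tokens that are mutually far apart in feature space, which is why it tends to preserve spatial and semantic diversity across views.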

*(Figure: framework overview)*

πŸ“Š Performance

  • 🎯 Prune2Drive incurs performance drops of only 3% and 6% on the DriveLM and DriveLMM-o1 benchmarks, respectively, while using only 13.4% and 20.3% of the original FLOPs.
  • 🎯 The proposed view-adaptive pruning strategy effectively balances perception accuracy and efficiency, outperforming uniform pruning baselines and highlighting the importance of adaptive token selection in multi-view scenarios.

*(Figures: results on DriveLM and DriveLMM-o1)*

πŸ›  Getting Started

  1. Clone this repository.

     git clone https://github.com/qqqqiiuuss/Prune2Drive
     cd Prune2Drive

  2. Download models and datasets.

     cd ckpt && huggingface-cli download DriveMM/DriveMM --local-dir ./DriveMM --resume-download

Download the nuScenes dataset and place it in the `scripts/data` folder (around 70 GB).

  3. Set up the environment.

     conda create -n prune2drive python=3.10 -y
     conda activate prune2drive
     pip install --upgrade pip  # enable PEP 660 support
     pip install -e ".[train]"
     cd transformers
     pip install -e .

Inference

  cd scripts
  # Run DriveLM inference pipeline and generate submission.json 
  bash run_inference.sh [CUDA_DEVICES] [OUTPUT_SAVE_PATH]

Pruning Ratio Optimization

  cd scripts
  # Run the NNI-based search over per-view pruning ratios
  python run_drive_nni.py

  # View pruning ratio optimization outputs
  nnictl view YOUR_ID --experiment_dir YOUR_OUTPUT_DIR --port YOUR_PORT_ID
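As a rough illustration of what such a search might look like, below is a hypothetical NNI search space assigning each of the six nuScenes camera views its own retention-ratio range. The view names follow the nuScenes convention, but the actual space, bounds, and metric used by `run_drive_nni.py` may differ:

```python
# Hypothetical NNI search space: one retention-ratio choice per camera view.
# The bounds [0.05, 0.3] are illustrative, not the values used in the paper.
views = ["CAM_FRONT", "CAM_FRONT_LEFT", "CAM_FRONT_RIGHT",
         "CAM_BACK", "CAM_BACK_LEFT", "CAM_BACK_RIGHT"]
search_space = {view: {"_type": "uniform", "_value": [0.05, 0.3]}
                for view in views}

# Inside each trial, the sampled ratios would typically be consumed via:
#   import nni
#   ratios = nni.get_next_parameter()        # dict: view name -> ratio
#   score = run_pruned_inference(ratios)     # hypothetical evaluation step
#   nni.report_final_result(score)
```

Searching ratios per view rather than globally lets the optimizer keep more tokens for information-dense views (e.g. the front camera) while pruning others aggressively.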

πŸ“ˆ Evaluation Guidance

The DriveLM competition server is hosted on a Hugging Face Space.

  1. Prepare your result

    Open prepare_submission.py in OUTPUT_SAVE_PATH and fill in the following information, starting at line 4:

    method = ""  # <str> -- name of the method
    team = ""  # <str> -- name of the team, !!!identical to the Google Form!!!
    authors = [""]  # <list> -- list of str, authors
    email = ""  # <str> -- e-mail address
    institution = ""  # <str> -- institution or company
    country = ""  # <str> -- country or region
    

    While the other fields may change between submissions, always use the team name you submitted on the Google registration form for the team field, NOT the anonymous team name shown on the leaderboard. Then run this file:

    # make sure you are under ./challenge
    python prepare_submission.py
    

    This will generate submission.json with your information and result.

  2. Upload your result as a Hugging Face model

    Click your profile picture on the top right of the Hugging Face website, and select + New Model. Create a new model repository, and upload the submission.json file.

    Note that private models are also accepted by the competition space.

  3. Submit your result and evaluate on test server

    Go to the competition space and click New Submission on the left panel. Paste the link to the Hugging Face model you created under Hub model, then click Submit.

    Note: you can make up to 3 submissions per day.

You can check the status of your submissions in the My submissions tab of the competition space.

Please refer to these slides for an explanation of each score.

You can select a submission and click Update Selected Submissions at the bottom to push its evaluation status to the private leaderboard. Note that the public and private scores are identical in our case, so please ignore the descriptions in the My Submissions tab.

πŸ“Œ TODO

  • Release Inference and Evaluation Code
  • Release Multi-view Pruning Ratio Optimization Code

πŸ”‘ License

This project is released under the Apache 2.0 license.

πŸ“ Citation

Please consider citing our paper in your publications if our work helps your research.

@article{xiong2025prune2driveplugandplayframeworkaccelerating,
      title={Prune2Drive: A Plug-and-Play Framework for Accelerating Vision-Language Models in Autonomous Driving}, 
      author={Minhao Xiong and Zichen Wen and Zhuangcheng Gu and Xuyang Liu and Rui Zhang and Hengrui Kang and Jiabing Yang and Junyuan Zhang and Weijia Li and Conghui He and Yafei Wang and Linfeng Zhang},
      year={2025},
      eprint={2508.13305},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2508.13305}, 
}

πŸ‘ Acknowledgments

We would like to express our sincere gratitude to the teams behind DriveLM and DriveMM for their open-source contributions.