ScatterFormer: Efficient Voxel Transformer with Scattered Linear Attention

May 20, 2025

This repo is the official implementation of the paper "ScatterFormer: Efficient Voxel Transformer with Scattered Linear Attention". It achieves 73.8 mAPH (L2) on the Waymo Open Dataset validation set and 72.4 NDS on the NuScenes validation set. ScatterFormer runs in real time at 23 FPS.
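To make the name concrete, here is a minimal pure-Python sketch of linear attention applied separately to the voxels of each window, which is the general idea behind scattered linear attention as we understand it. It is not the code used in this repo: the function names, the ReLU-based feature map, and the flat `window_ids` layout are all assumptions made for illustration.

```python
from collections import defaultdict


def feature_map(x, eps=1e-6):
    """Positive feature map (ReLU plus a small offset); a hypothetical choice."""
    return [max(a, 0.0) + eps for a in x]


def scattered_linear_attention(q, k, v, window_ids):
    """Linear attention restricted to voxels that share a window id.

    q, k: lists of N vectors of length C; v: list of N vectors of length D;
    window_ids: list of N hashable ids, one per voxel.
    """
    c, d = len(k[0]), len(v[0])
    # Per-window summaries: sum_j phi(k_j) v_j^T (C x D) and sum_j phi(k_j) (C).
    kv = defaultdict(lambda: [[0.0] * d for _ in range(c)])
    k_sum = defaultdict(lambda: [0.0] * c)
    for kj, vj, w in zip(k, v, window_ids):
        pk = feature_map(kj)
        for a in range(c):
            k_sum[w][a] += pk[a]
            for b in range(d):
                kv[w][a][b] += pk[a] * vj[b]

    # Each voxel only reads the summary of its own window.
    out = []
    for qi, w in zip(q, window_ids):
        pq = feature_map(qi)
        den = sum(pq[a] * k_sum[w][a] for a in range(c))
        out.append([sum(pq[a] * kv[w][a][b] for a in range(c)) / den
                    for b in range(d)])
    return out


# Four voxels scattered over two windows of different sizes.
q = [[1.0, -2.0], [0.5, 0.3], [2.0, 1.0], [0.1, 0.9]]
k = [[0.2, 1.0], [1.5, -0.5], [0.7, 0.7], [1.0, 2.0]]
v = [[1.0, 0.0], [0.0, 1.0], [3.0, 3.0], [-1.0, 2.0]]
result = scattered_linear_attention(q, k, v, window_ids=[0, 1, 0, 0])
```

Because each window's summary is built once and then reused by every query in that window, the cost grows linearly with the number of voxels, and windows with different voxel counts need no padding.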

News

  • [24-06-21] ScatterFormer is accepted by ECCV 2024!
  • [24-07-18] Training code released
  • [25-05-20] Added CUDA implementation for group-wise sparse convolution

Main results

Waymo Open Dataset validation

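| Model | #Sweeps | mAP/H_L1 | mAP/H_L2 | Veh_L1 | Veh_L2 | Ped_L1 | Ped_L2 | Cyc_L1 | Cyc_L2 | Log |
|---|---|---|---|---|---|---|---|---|---|---|
| ScatterFormer (100%) | 1 | 81.8/79.7 | 75.7/73.8 | 81.0/80.5 | 73.1/72.7 | 84.5/79.9 | 77.0/72.6 | 79.9/78.9 | 77.1/76.1 | Log |
| ScatterFormer (20%) | 1 | 80.3/78.0 | 74.1/72.0 | 79.6/79.1 | 71.6/71.2 | 83.5/78.3 | 75.9/71.0 | 77.7/76.6 | 74.8/73.7 | Log |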
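As a quick consistency check, the overall mAP/mAPH columns in the table above appear to be the mean of the three per-class values (vehicle, pedestrian, cyclist). The sketch below recomputes the L2 numbers for the 100% row. The helper name is hypothetical, and differences of about 0.1 can occur elsewhere in the table because the per-class values are already rounded.

```python
def class_mean(values):
    """Average a metric over the vehicle, pedestrian, and cyclist classes."""
    return sum(values) / len(values)


# Level 2 numbers for ScatterFormer (100%), copied from the table: Veh, Ped, Cyc.
map_l2 = class_mean([73.1, 77.0, 77.1])
maph_l2 = class_mean([72.7, 72.6, 76.1])

print(f"mAP L2 = {map_l2:.1f}, mAPH L2 = {maph_l2:.1f}")
```

Rounded to one decimal, these match the reported 75.7/73.8 entry.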

NuScenes validation

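| Model | mAP | NDS | mATE | mASE | mAOE | mAVE | mAAE | ckpt | Log |
|---|---|---|---|---|---|---|---|---|---|
| ScatterFormer | 68.3 | 72.4 | 26.5 | 24.5 | 24.7 | 23.3 | 18.8 | ckpt | Log |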
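For readers unfamiliar with NDS, the sketch below recomputes it from the other columns using the standard nuScenes detection score definition: a weighted sum of mAP (weight 5) and the five true-positive scores, divided by 10. It assumes the table reports every value multiplied by 100; the function name is hypothetical.

```python
def nuscenes_detection_score(m_ap, tp_errors):
    """NDS = (5 * mAP + sum(1 - min(1, err))) / 10, with all inputs in [0, 1]."""
    tp_scores = [1.0 - min(1.0, err) for err in tp_errors]
    return (5.0 * m_ap + sum(tp_scores)) / 10.0


# Values from the table, rescaled from percent: mATE, mASE, mAOE, mAVE, mAAE.
score = nuscenes_detection_score(0.683, [0.265, 0.245, 0.247, 0.233, 0.188])
print(round(100 * score, 1))  # 72.4
```

The result agrees with the NDS column, which suggests the table values are internally consistent.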

Usage

Installation

Please refer to INSTALL.md for installation.

Dataset Preparation

Please follow the instructions from OpenPCDet. We adopt the same data generation process.

Sparse Group-wise Convolution

ScatterFormer relies on a group-wise sparse convolution, which is available in this hacked version of spconv: https://github.com/skyhehe123/spconv

We have also added a CUDA implementation of group-wise sparse convolution in pcdet/ops/dw_spconv/. You need to compile the CUDA code:

cd pcdet/ops/dw_spconv
python setup.py build_ext --inplace

This compiles the CUDA implementation, so you no longer need to compile the hacked version of spconv.
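For intuition about what this operator computes, here is a minimal pure-Python reference of a sparse convolution in its depth-wise form (one channel per group, an assumption suggested by the directory name dw_spconv). It is only a sketch on a 2D grid with outputs restricted to active voxels; the function name and the dictionary-based data layout are hypothetical and unrelated to the actual CUDA kernels.

```python
def depthwise_sparse_conv(features, kernel):
    """Depth-wise convolution evaluated only at active (non-empty) voxels.

    features: dict mapping (x, y) -> list of C channel values.
    kernel:   dict mapping (dx, dy) -> list of C per-channel weights.
    """
    out = {}
    for (x, y), feat in features.items():
        acc = [0.0] * len(feat)
        for (dx, dy), weights in kernel.items():
            neighbor = features.get((x + dx, y + dy))
            if neighbor is None:
                continue  # empty voxels contribute nothing
            # Each channel uses its own weight and never mixes with the others.
            for c, w in enumerate(weights):
                acc[c] += w * neighbor[c]
        out[(x, y)] = acc
    return out


# Three active voxels with two channels each; the third one is isolated.
features = {(0, 0): [1.0, 2.0], (1, 0): [3.0, 4.0], (5, 5): [1.0, 1.0]}
kernel = {(0, 0): [1.0, 1.0], (1, 0): [0.5, 2.0], (-1, 0): [0.0, 1.0]}
output = depthwise_sparse_conv(features, kernel)
```

Since channels are never mixed, the number of weights per kernel offset is C rather than C x C, which is what makes long kernels affordable.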

Training

# multi-gpu training (the first argument, 8, is the number of GPUs)
cd tools
bash scripts/dist_train.sh 8 --cfg_file <CONFIG_FILE> [other optional arguments]

Testing

# multi-gpu testing (the first argument, 8, is the number of GPUs)
cd tools
bash scripts/dist_test.sh 8 --cfg_file <CONFIG_FILE> --ckpt <CHECKPOINT_FILE>