ScatterFormer: Efficient Voxel Transformer with Scattered Linear Attention

May 20, 2025

This repo is the official implementation of the paper ScatterFormer: Efficient Voxel Transformer with Scattered Linear Attention. It achieves 73.8 mAPH (L2) on the Waymo Open Dataset val set and 72.4 NDS on the nuScenes val set. ScatterFormer runs in real time at 23 FPS.

News

Model	#Sweeps	mAP/H_L1	mAP/H_L2	Veh_L1	Veh_L2	Ped_L1	Ped_L2	Cyc_L1	Cyc_L2	Log
ScatterFormer (100%)	1	81.8/79.7	75.7/73.8	81.0/80.5	73.1/72.7	84.5/79.9	77.0/72.6	79.9/78.9	77.1/76.1	Log
ScatterFormer (20%)	1	80.3/78.0	74.1/72.0	79.6/79.1	71.6/71.2	83.5/78.3	75.9/71.0	77.7/76.6	74.8/73.7	Log

Model	mAP	NDS	mATE	mASE	mAOE	mAVE	mAAE	ckpt	Log
ScatterFormer	68.3	72.4	26.5	24.5	24.7	23.3	18.8	ckpt	Log
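
As a sanity check on the nuScenes row above, the NDS value can be recomputed from the other columns. The sketch below uses the standard nuScenes detection score definition, NDS = (5 * mAP + sum(1 - min(1, mTP))) / 10, and assumes every metric in the table is reported multiplied by 100. The function and variable names are illustrative and are not part of this repo.

```python
def nds(map_score, tp_errors):
    """nuScenes detection score from mAP and the five true-positive errors.

    All inputs are fractions in [0, 1] (errors above 1 are clipped to 1).
    """
    tp_scores = [1.0 - min(1.0, err) for err in tp_errors]
    return (5.0 * map_score + sum(tp_scores)) / 10.0


# Values copied from the nuScenes table; assumed to be scaled by 100.
row = {"mAP": 68.3, "mATE": 26.5, "mASE": 24.5,
       "mAOE": 24.7, "mAVE": 23.3, "mAAE": 18.8}

tp_keys = ("mATE", "mASE", "mAOE", "mAVE", "mAAE")
score = nds(row["mAP"] / 100.0, [row[k] / 100.0 for k in tp_keys])
print(f"NDS = {100.0 * score:.1f}")
```

Under these assumptions the recomputed score agrees with the 72.4 NDS reported in the table.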

Please refer to INSTALL.md for installation.

For dataset preparation, please follow the instructions from OpenPCDet; we adopt the same data generation process.

~~ScatterFormer relies on a group-wise sparse convolution; please find this hacked version of [spconv](https://github.com/skyhehe123/spconv).~~

We have updated the CUDA implementation of group-wise sparse convolution in pcdet/ops/dw_spconv/. You need to compile the CUDA code:

cd pcdet/ops/dw_spconv
python setup.py build_ext --inplace

This compiles the CUDA implementation, so you no longer need to build the hacked version of spconv.
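
To confirm that the build step produced something, one option is to look for compiled extension files under the ops directory. This is only a minimal sketch: it assumes the in-place build leaves a shared library (`.so` on Linux, `.pyd` on Windows) somewhere below pcdet/ops/dw_spconv, and the helper name `find_built_extensions` is hypothetical, not something this repo provides.

```python
from pathlib import Path


def find_built_extensions(ops_dir):
    """Return compiled extension files (.so / .pyd) found below ops_dir."""
    root = Path(ops_dir)
    if not root.is_dir():
        return []
    return sorted(p for p in root.rglob("*") if p.suffix in {".so", ".pyd"})


# Run from the repository root after the build command above.
built = find_built_extensions("pcdet/ops/dw_spconv")
if built:
    for path in built:
        print("found compiled extension:", path)
else:
    print("no compiled extension found; re-run the build and check for errors")
```

An empty result only means no shared library was found at that location; it does not say why the build failed.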

# multi-GPU training (the first argument, 8, is the number of GPUs)
cd tools
bash scripts/dist_train.sh 8 --cfg_file <CONFIG_FILE> [other optional arguments]

# multi-GPU testing (the first argument, 8, is the number of GPUs)
cd tools
bash scripts/dist_test.sh 8 --cfg_file <CONFIG_FILE> --ckpt <CHECKPOINT_FILE>