MODEL ZOO

April 30, 2022

Common settings and notes

The experiments are run with PyTorch 1.7.0, CUDA 10.1 and cuDNN 7.6.
The training is conducted on 8 Tesla V100 GPUs.
For the fade strategy proposed by PointAugmenting (disabling the copy-and-paste augmentation for the last 5 epochs), we currently implement it by manually stopping training at epoch 15 and resuming the training without copy-and-paste augmentation. If you find a more elegant way to implement this strategy, please let us know; we would really appreciate it. The fade strategy removes many false positives and improves mAP remarkably, especially for TransFusion-L, while having less influence on TransFusion.
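The fade schedule described above can be expressed as a per-epoch switch instead of a manual stop and resume. The following is only a minimal sketch under stated assumptions: `use_copy_paste` is a hypothetical helper that is not part of this repository, epochs are assumed to be 0-indexed, and the defaults (20 total epochs, 5 fade epochs) are taken from the LiDAR-only setting on this page.

```python
def use_copy_paste(epoch: int, total_epochs: int = 20, fade_epochs: int = 5) -> bool:
    """Return True if copy-and-paste augmentation should be active.

    Hypothetical helper (not from the repository). `epoch` is 0-indexed.
    The augmentation is switched off for the last `fade_epochs` epochs.
    """
    if not 0 <= epoch < total_epochs:
        raise ValueError(f"epoch must be in [0, {total_epochs}), got {epoch}")
    return epoch < total_epochs - fade_epochs


# One flag per epoch: the first 15 epochs use the augmentation, the last 5 do not.
schedule = [use_copy_paste(e) for e in range(20)]
for e, flag in enumerate(schedule):
    print(f"epoch {e:2d}: copy-and-paste {'on' if flag else 'off'}")
```

A training loop could consult such a flag at the start of each epoch to choose the data pipeline. How to wire it into the actual training framework is left open here.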

Pretrained 2D Backbones

DLA34: Following PointAugmenting, we directly reuse the checkpoints pretrained on the monocular 3D detection task provided by CenterNet.
ResNet50 on instance segmentation: We use the model pretrained on nuImages provided by MMDetection3D.
ResNet50 on 2D detection: We train a model using the instance segmentation config with the mask head removed.

All the LiDAR-only models are trained for 20 epochs; the fusion-based models are then trained for a further 6 epochs starting from the pretrained LiDAR backbone. We freeze the weights of the LiDAR backbone to save GPU memory.

Model	Backbone	mAP	NDS
TransFusion-L	PointPillars	54.51	62.66
TransFusion	PointPillars	60.21	65.50
TransFusion-L	VoxelNet	65.06	70.10
TransFusion	VoxelNet	67.49	71.28

nuScenes 3D Tracking

We perform tracking-by-detection using the same tracking algorithm proposed by CenterPoint.
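For readers unfamiliar with this style of tracking-by-detection, the sketch below illustrates the general idea of greedy center-distance association: each detection is projected back to the previous frame using its predicted velocity and then matched to the nearest existing track. It is a simplified illustration and not the CenterPoint implementation. The function name `greedy_match`, the tuple layouts, and the `max_dist` threshold are all assumptions made for this example.

```python
import math


def greedy_match(tracks, detections, dt, max_dist):
    """Greedily associate detections with tracks by center distance.

    Illustrative only; names and data layouts are assumptions.
    tracks:     list of (x, y) centers from the previous frame
    detections: list of (x, y, vx, vy) in the current frame
    dt:         time between the two frames, in seconds
    max_dist:   pairs farther apart than this are never matched
    Returns a sorted list of (detection_index, track_index) pairs.
    """
    candidates = []
    for d, (x, y, vx, vy) in enumerate(detections):
        # Project the detection back to where it should have been one frame ago.
        px, py = x - vx * dt, y - vy * dt
        for t, (tx, ty) in enumerate(tracks):
            dist = math.hypot(px - tx, py - ty)
            if dist <= max_dist:
                candidates.append((dist, d, t))

    # Take the closest pairs first; each detection and track is used at most once.
    candidates.sort()
    used_dets, used_tracks, matches = set(), set(), []
    for _, d, t in candidates:
        if d in used_dets or t in used_tracks:
            continue
        used_dets.add(d)
        used_tracks.add(t)
        matches.append((d, t))
    return sorted(matches)


tracks = [(0.0, 0.0), (10.0, 0.0)]
detections = [(10.5, 0.0, 1.0, 0.0), (0.5, 0.0, 1.0, 0.0), (50.0, 50.0, 0.0, 0.0)]
matches = greedy_match(tracks, detections, dt=0.5, max_dist=2.0)
print(matches)
```

Detections left unmatched (the third one in this example) would typically start new tracks, and tracks left unmatched for several frames would be dropped.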

Model	Backbone	AMOTA	AMOTP
TransFusion-L	VoxelNet	0.703	0.553
TransFusion	VoxelNet	0.725	0.561

nuScenes Leaderboard

Detection

We use 300 object queries during inference for the online submission, which gives slightly better performance. We do not use any test-time augmentation or model ensembling.

Model	Backbone	Test mAP	Test NDS	Link
TransFusion-L	VoxelNet	65.52	70.23	Detection
TransFusion	VoxelNet	68.90	71.68	Detection

Tracking

Model	Backbone	Test AMOTA	Test AMOTP	Link
TransFusion-L	VoxelNet	0.686	0.529	Detection / Tracking
TransFusion	VoxelNet	0.718	0.551	Detection / Tracking