DAOcc



DAOcc: 3D Object Detection Assisted Multi-Sensor Fusion for 3D Occupancy Prediction [arxiv]
Zhen Yang, Yanpeng Dong, Jiayu Wang
Beijing Mechanical Equipment Institute, Beijing, China

This is the official implementation of DAOcc. DAOcc is a novel multi-modal occupancy prediction framework that leverages 3D object detection to assist in achieving superior performance while using a deployment-friendly image encoder and practical input image resolution.

News

  • 2025-09-09: DAOcc is accepted to TCSVT, cue the confetti! 🎉
  • 2025-07-20: We have open-sourced the TensorRT inference code for DAOcc, achieving 54.25 mIoU at 104.9 FPS. Check it out here.
  • 2025-07-11: DAOcc achieved 54.33 mIoU on Occ3D-nuScenes without EMA.
  • 2025-04-24: Following SparseBEV, we optimized the 2D-to-3D image feature transformation process, achieving substantial reductions in GPU memory consumption while slightly reducing training time. Check the config file.
  • 2025-01-31: Released the model weights and the first version of the code.
  • 2024-10-01: Our preprint is available on arXiv.

Experimental results

3D Semantic Occupancy Prediction on Occ3D-nuScenes

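| Method | Camera Mask | Image Backbone | Image Resolution | mIoU | Config | Model | Log |
|---|---|---|---|---|---|---|---|
| DAOcc | √ | R50 | 256×704 | 54.33 | config | model | log |

| Method | Camera Mask | Image Backbone | Image Resolution | RayIoU | Config | Model | Log |
|---|---|---|---|---|---|---|---|
| DAOcc | × | R50 | 256×704 | 48.4 | config | model | log |

Deprecated results (archived)

| Method | Camera Mask | Image Backbone | Image Resolution | mIoU | Config | Model | Log |
|---|---|---|---|---|---|---|---|
| DAOcc | √ | R50 | 256×704 | 53.82 | config | model | log |
| DAOcc* | √ | R50 | 256×704 | 54.19 | - | model | - |

| Method | Camera Mask | Image Backbone | Image Resolution | RayIoU | Config | Model | Log |
|---|---|---|---|---|---|---|---|
| DAOcc | × | R50 | 256×704 | 48.2 | config | model | log |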
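For readers unfamiliar with the metric, the mIoU values and the Camera Mask column can be illustrated with a small sketch. It is not the evaluation code of this repository: the function name `mean_iou`, the toy labels, and the three-class setup are all hypothetical, and it assumes the usual convention that a camera mask restricts scoring to voxels visible from the cameras.

```python
def mean_iou(pred, gt, num_classes, mask=None, ignore=()):
    """Mean IoU over flat lists of per-voxel class ids (illustrative only).

    mask:   optional list of booleans; voxels marked False are skipped.
    ignore: class ids left out of the average (e.g. a "free" class).
    """
    inter = [0] * num_classes
    union = [0] * num_classes
    for i, (p, g) in enumerate(zip(pred, gt)):
        if mask is not None and not mask[i]:
            continue  # voxel is outside the evaluated region
        if p == g:
            inter[p] += 1
            union[p] += 1
        else:
            # A mismatch enlarges the union of both classes involved.
            union[p] += 1
            union[g] += 1
    ious = [inter[c] / union[c] for c in range(num_classes)
            if c not in ignore and union[c] > 0]
    return sum(ious) / len(ious) if ious else float("nan")


# Toy data: six voxels, three classes, last voxel hidden by the mask.
gt = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
visible = [True, True, True, True, True, False]

miou_all = mean_iou(pred, gt, num_classes=3)
miou_masked = mean_iou(pred, gt, num_classes=3, mask=visible)
print(f"mIoU without mask: {miou_all:.4f}")
print(f"mIoU with mask:    {miou_masked:.4f}")
```

On this toy input the masked score is higher because the only masked-out voxel is a misprediction; on real data the mask can move the score in either direction.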

3D Semantic Occupancy Prediction on SurroundOcc

| Method | Image Backbone | Image Resolution | IoU | mIoU | Config | Model | Log |
|---|---|---|---|---|---|---|---|
| DAOcc | R50 | 256×704 | 45.0 | 30.5 | config | model | log |

3D Semantic Occupancy Prediction on OpenOccupancy

| Method | Image Backbone | Image Resolution | IoU | mIoU | Config | Model | Log |
|---|---|---|---|---|---|---|---|
| DAOcc | R18 | 256×704 | 32.2 | 24.1 | config | model | log |

3D Semantic Occupancy Prediction on Occ3D-Waymo

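| Method | Camera Mask | Infov Mask | Image Backbone | Image Resolution | mIoU | Config | Model | Log |
|---|---|---|---|---|---|---|---|---|
| DAOcc | √ | √ | R50 | 256×704 | 44.69 | config | - | log |
| DAOcc* | √ | √ | R50 | 256×704 | 45.13 | - | - | - |

  • In all tables, * indicates that the exponential moving average (EMA) hook was used.
  • For Occ3D-Waymo, we use only 20% of the training data.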
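
As a reminder of what an EMA hook does, the sketch below keeps a smoothed "shadow" copy of the weights that is blended with the live weights after each training step and is used for evaluation. This is a minimal illustration, not the hook used in this repository: the function name `ema_update`, the single scalar weight, and the decay value 0.9 are assumptions chosen to keep the arithmetic easy to follow.

```python
def ema_update(shadow, weights, decay=0.999):
    """Blend the current weights into the shadow copy, in place."""
    for name, value in weights.items():
        shadow[name] = decay * shadow[name] + (1.0 - decay) * value
    return shadow


# Toy training loop with one scalar "parameter".
weights = {"w": 1.0}
shadow = dict(weights)  # the shadow copy starts as a copy of the weights

for step in range(3):
    weights["w"] += 1.0  # stand-in for an optimizer step
    ema_update(shadow, weights, decay=0.9)

print(weights["w"], round(shadow["w"], 3))
```

The shadow value lags behind the live weight, which is the point: it averages out step-to-step noise late in training.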

Getting Started

TensorRT Deployment

We provide deployment details of DAOcc, including converting the Torch model to ONNX format and building a TensorRT (TRT) engine from the ONNX model. For specific details, please refer to CUDA_DAOcc.

| Model | Precision | Hardware | mIoU | FPS |
|---|---|---|---|---|
| DAOcc | FP16+INT8 | AGX Orin (64GB) | 53.70 | 20.0 |

Citation

@article{yang2025daocc,
  title={Daocc: 3d object detection assisted multi-sensor fusion for 3d occupancy prediction},
  author={Yang, Zhen and Dong, Yanpeng and Wang, Jiayu and Wang, Heng and Ma, Lichao and Cui, Zijian and Liu, Qi and Pei, Haoran and Zhang, Kexin and Zhang, Chao},
  journal={IEEE Transactions on Circuits and Systems for Video Technology},
  year={2025},
  publisher={IEEE}
}

Acknowledgements

Many thanks to these excellent open-source projects: