FlashOcc: Fast and Memory-Efficient Occupancy Prediction via Channel-to-Height Plugin

May 22, 2026


* Note: the FPS here is measured on an RTX 3090 with TensorRT FP16.

Panoptic-FlashOcc: An Efficient Baseline to Marry Semantic Occupancy with Panoptic via Instance Center


* Note: the FPS here is measured on an A100 GPU with the PyTorch FP32 backend.

News


This repository is the official implementation of FlashOCC and Panoptic-FlashOCC.
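
The channel-to-height idea named in the FlashOCC title can be illustrated on a toy input. The sketch below assumes a BEV feature map laid out as `[channel][y][x]` and splits the channel axis into (feature, height) pairs. The function name, argument names, and axis ordering are illustrative assumptions, not the repository's code. With PyTorch tensors the same operation would be a single reshape.

```python
def channel_to_height(bev, num_z):
    """Split the channel axis of a BEV feature map into (feature, height).

    bev   : nested list indexed as bev[c][y][x], with C = feat * num_z channels
    num_z : number of height bins (hypothetical parameter name)
    returns a nested list indexed as vol[f][z][y][x]
    """
    num_channels = len(bev)
    if num_channels % num_z != 0:
        raise ValueError("channel count must be divisible by num_z")
    feat = num_channels // num_z
    # Channel c = f * num_z + z, i.e. a row-major reshape of the channel axis.
    return [[bev[f * num_z + z] for z in range(num_z)] for f in range(feat)]


# Toy input: 4 channels on a 1x2 BEV grid, channel c filled with the value c.
bev = [[[float(c), float(c)]] for c in range(4)]
vol = channel_to_height(bev, num_z=2)
shape = (len(vol), len(vol[0]), len(vol[0][0]), len(vol[0][0][0]))
print(shape)
```

No values are copied or recomputed: the 2D feature planes are only regrouped, which is why the step adds essentially no cost compared with 3D convolutions over a voxel grid.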


Main Results

1. FlashOCC

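| Config | Backbone | Input Size | mIoU | FPS (Hz) | Flops (G) | Params (M) | Model | Log |
|---|---|---|---|---|---|---|---|---|
| BEVDetOCC (1f) | R50 | 256x704 | 31.60 | 92.1 | 241.76 | 29.02 | gdrive | log |
| M0: FlashOCC (1f) | R50 | 256x704 | 31.95 | 197.6 | 154.1 | 39.94 | gdrive | log |
| M1: FlashOCC (1f) | R50 | 256x704 | 32.08 | 152.7 | 248.57 | 44.74 | gdrive | log |
| BEVDetOCC-4D-Stereo (2f) | R50 | 256x704 | 36.1 | - | - | - | baidu | log |
| M2: FlashOCC-4D-Stereo (2f) | R50 | 256x704 | 37.84 | - | - | - | gdrive | log |
| BEVDetOCC-4D-Stereo (2f) | Swin-B | 512x1408 | 42.0 | - | - | - | baidu | log |
| M3: FlashOCC-4D-Stereo (2f) | Swin-B | 512x1408 | 43.52 | - | 1490.77 | 144.99 | gdrive | log |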

FPS is measured with TensorRT on an RTX 3090 at FP16 precision. Please refer to Tab. 2 in the paper for the detailed model settings behind each M-number.
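
To make the speed and accuracy trade-off in the table above concrete, the following sketch compares a configuration against the BEVDetOCC (1f) baseline. The numbers are copied from the table; the `RESULTS` dictionary and the `compare` helper are hypothetical names introduced only for this illustration.

```python
# (mIoU, FPS in Hz, FLOPs in G) copied from the FlashOCC results table.
RESULTS = {
    "BEVDetOCC (1f)": (31.60, 92.1, 241.76),
    "M0: FlashOCC (1f)": (31.95, 197.6, 154.1),
    "M1: FlashOCC (1f)": (32.08, 152.7, 248.57),
}


def compare(name, baseline="BEVDetOCC (1f)"):
    """Return mIoU gain, FPS speedup, and FLOPs ratio relative to the baseline."""
    miou, fps, flops = RESULTS[name]
    base_miou, base_fps, base_flops = RESULTS[baseline]
    return {
        "miou_gain": miou - base_miou,
        "speedup": fps / base_fps,
        "flops_ratio": flops / base_flops,
    }


for name in ("M0: FlashOCC (1f)", "M1: FlashOCC (1f)"):
    r = compare(name)
    print(f"{name}: +{r['miou_gain']:.2f} mIoU, "
          f"{r['speedup']:.2f}x FPS, {r['flops_ratio']:.2f}x FLOPs")
```

On these figures, M0 runs at a bit more than twice the baseline FPS while using fewer FLOPs and reaching a slightly higher mIoU.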

2. Panoptic-FlashOCC

In Panoptic-FlashOCC, we have made the following 4 adjustments to FlashOCC:

  • Training without the camera mask. Using the mask significantly improves prediction performance in the visible region, but at the expense of prediction in the invisible region.
  • Using category balancing.
  • Using stronger loss settings.
  • Introducing instance centers for panoptic occupancy.
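
The first two adjustments can be sketched as a per-voxel cross-entropy with an optional visibility mask and optional class weights. This is a minimal illustration in plain Python, not the repository's loss: the function names, the inverse-frequency weighting scheme, and the toy data are all assumptions.

```python
import math


def inverse_frequency_weights(labels, num_classes):
    """Hypothetical balancing scheme: weight each class by 1 / frequency."""
    counts = [0] * num_classes
    for y in labels:
        counts[y] += 1
    total = len(labels)
    return [total / (num_classes * c) if c else 0.0 for c in counts]


def occupancy_ce(probs, labels, camera_mask=None, class_weights=None):
    """Weighted mean cross-entropy over voxels.

    probs         : per-voxel lists of class probabilities
    labels        : per-voxel class ids
    camera_mask   : optional per-voxel bools; voxels marked False are skipped
    class_weights : optional per-class weights
    """
    total, norm = 0.0, 0.0
    for i, (p, y) in enumerate(zip(probs, labels)):
        if camera_mask is not None and not camera_mask[i]:
            continue  # invisible voxel: contributes no training signal
        w = 1.0 if class_weights is None else class_weights[y]
        total += -w * math.log(max(p[y], 1e-12))
        norm += w
    return total / norm if norm else 0.0


# Toy scene: three voxels, two classes; the last voxel is not visible.
probs = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]
labels = [0, 1, 0]
camera_mask = [True, True, False]

loss_visible = occupancy_ce(probs, labels, camera_mask=camera_mask)
loss_all = occupancy_ce(probs, labels)  # no mask: every voxel is supervised
weights = inverse_frequency_weights(labels, num_classes=2)
loss_balanced = occupancy_ce(probs, labels, class_weights=weights)
```

With the mask, the poorly predicted invisible voxel never enters the loss, so nothing pushes the model to improve it. Dropping the mask supervises every voxel, which matches the motivation given in the first bullet.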

More results for different configurations will be released soon.

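| Config | Backbone | Input Size | RayIoU | RayPQ | mIoU | FPS (Hz) | Flops (G) | Params (M) | Model | Log |
|---|---|---|---|---|---|---|---|---|---|---|
| M1: FlashOCC (1f) | R50 | 256x704 | - | - | 15.41 | - | 248.57 | 44.74 | gdrive | log |
| Panoptic-FlashOCC-Depth-tiny (1f) | R50 | 256x704 | 34.57 | - | 28.83 | 43.9 | 175.00 | 45.32 | gdrive | log |
| Panoptic-FlashOCC-Depth-tiny-Pano (1f) | R50 | 256x704 | 34.81 | 12.9 | 29.14 | 39.8 | 175.00 | 45.32 | gdrive | log |
| Panoptic-FlashOCC-Depth (1f) | R50 | 256x704 | 34.93 | - | 28.91 | 38.7 | 269.47 | 50.12 | gdrive | log |
| Panoptic-FlashOCC-Depth-Pano (1f) | R50 | 256x704 | 35.22 | 13.2 | 29.39 | 35.2 | 269.47 | 50.12 | gdrive | log |
| Panoptic-FlashOCC-4D-Depth (2f) | R50 | 256x704 | 35.99 | - | 29.57 | 35.9 | - | - | gdrive | log |
| Panoptic-FlashOCC-4D-Depth-Pano (2f) | R50 | 256x704 | 36.76 | 14.5 | 30.31 | 30.4 | - | - | gdrive | log |
| Panoptic-FlashOCC-4DLongterm-Depth (8f) | R50 | 256x704 | 38.51 | - | 31.49 | 35.6 | - | - | gdrive | log |
| Panoptic-FlashOCC-4DLongterm-Depth-Pano (8f) | R50 | 256x704 | 38.50 | 16.0 | 31.57 | 30.2 | - | - | gdrive | log |
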
  • Note: the FPS here is measured on an A100 GPU with the PyTorch FP32 backend.
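
For readers who want to reproduce an FPS figure on their own hardware, a generic timing loop might look like the sketch below. It is not the script used for the numbers above. `measure_fps`, `step`, and `sync` are hypothetical names, and on a GPU the `sync` argument would need to be a device synchronization call such as `torch.cuda.synchronize`, because CUDA kernels launch asynchronously.

```python
import time


def measure_fps(step, warmup=10, iters=50, sync=None):
    """Time `iters` calls of `step` after `warmup` untimed calls.

    step : zero-argument callable that runs one forward pass
    sync : optional zero-argument callable that blocks until the device is idle
    """
    for _ in range(warmup):
        step()  # warm-up runs absorb one-time costs such as lazy initialization
    if sync is not None:
        sync()
    start = time.perf_counter()
    for _ in range(iters):
        step()
    if sync is not None:
        sync()  # make sure all queued work has finished before stopping the clock
    elapsed = time.perf_counter() - start
    return iters / elapsed if elapsed > 0 else float("inf")


# CPU-only stand-in for a model forward pass.
fps = measure_fps(lambda: sum(range(1000)), warmup=2, iters=20)
print(f"{fps:.1f} iterations per second")
```

Reported FPS depends heavily on the GPU, backend, and precision, so figures are only comparable when those settings match.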

Get Started

  1. Environment Setup
  2. Model Training
  3. Quick Test Via TensorRT In MMDeploy
| Backend | mIoU | FPS (Hz) |
|---|---|---|
| PyTorch-FP32 | 31.95 | - |
| TRT-FP32 | 30.78 | 96.2 |
| TRT-FP16 | 30.78 | 197.6 |
| TRT-FP16+INT8(PTQ) | 29.60 | 383.7 |
| TRT-INT8(PTQ) | 29.59 | 397.0 |
  4. Visualization
  • [flashocc] : A detailed video can be found at baidu

  • [panoptic-flashocc] : the first row is our prediction and the second row is the ground truth.


  5. TensorRT Implementation Written in C++ with CUDA Acceleration
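
The INT8 (PTQ) rows in the backend table trade a small mIoU drop for higher FPS. The sketch below shows the general idea of post-training quantization with a simple symmetric, per-tensor, max-abs scale. This is an assumed textbook scheme for illustration only; TensorRT's own calibration is more elaborate, and the function names here are hypothetical.

```python
def quantize_int8(values):
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale


def dequantize(q, scale):
    """Recover approximate floats from the quantized integers."""
    return [x * scale for x in q]


# Toy weights standing in for one layer of a trained network.
weights = [0.0, 0.25, -1.0, 1.0, 0.1234]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Rounding error per value is bounded by half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

Each value is stored in 8 bits instead of 16 or 32, which is where the speedup comes from, while the rounding error accumulated across layers is one plausible source of the accuracy gap between the FP16 and INT8 rows.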

Acknowledgement

Many thanks to the authors of BEVDet, FB-BEV, RenderOcc, and SparseBEV.

Bibtex

If this work is helpful for your research, please consider citing the following BibTeX entry.

@article{yu2024ultimatedo,
  title={UltimateDO: An Efficient Framework to Marry Occupancy Prediction with 3D Object Detection via Channel2height},
  author={Yu, Zichen and Shu, Changyong},
  journal={arXiv preprint arXiv:2409.11160},
  year={2024}
}

@article{yu2024panoptic,
  title={Panoptic-FlashOcc: An Efficient Baseline to Marry Semantic Occupancy with Panoptic via Instance Center},
  author={Yu, Zichen and Shu, Changyong and Sun, Qianpu and Linghu, Junjie and Wei, Xiaobao and Yu, Jiangyong and Liu, Zongdai and Yang, Dawei and Li, Hui and Chen, Yan},
  journal={arXiv preprint arXiv:2406.10527},
  year={2024}
}

@article{yu2023flashocc,
  title={FlashOcc: Fast and Memory-Efficient Occupancy Prediction via Channel-to-Height Plugin},
  author={Zichen Yu and Changyong Shu and Jiajun Deng and Kangjie Lu and Zongdai Liu and Jiangyong Yu and Dawei Yang and Hui Li and Yan Chen},
  year={2023},
  eprint={2311.12058},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}