ProtoOcc: Accurate, Efficient 3D Occupancy Prediction Using Dual Branch Encoder-Prototype Query Decoder
July 21, 2025
Jungho Kim*, Changwon Kang*, Dongyoung Lee*, Sehwan Choi, Jun Won Choi†
*: Equal Contribution, †: Corresponding Author
South Korea
AAAI 2025
News
- [2025/07]: We released the full code & checkpoints of ProtoOcc, including nuScenes (Single & Multi frame) and SemanticKITTI.
- [2024/12]: ProtoOcc is accepted at AAAI 2025.
- [2024/08]: ProtoOcc achieves state-of-the-art results on Occ3D-nuScenes: 45.02% mIoU (Multi-frame), and 39.56% mIoU at 12.83 FPS (Single-frame)!
Method
Overall structure of ProtoOcc. (a) The Dual Branch Encoder captures fine-grained 3D structures in the voxel domain and models large receptive fields in the BEV domain. (b) The Prototype Query Decoder generates Scene-Aware Queries from prototypes and achieves fast inference without iterative query decoding. (c) The ProtoOcc framework integrates the Dual Branch Encoder and the Prototype Query Decoder for 3D occupancy prediction.
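To make the idea in (b) more concrete, here is a toy, pure-Python sketch of prototype-derived queries decoded in a single pass. It is not the ProtoOcc implementation: the function names (`class_prototypes`, `decode_once`), the use of coarse labels to group features, the mean-pooled prototypes, and the dot-product scoring are all illustrative assumptions.

```python
def class_prototypes(features, coarse_labels):
    """Average the feature vectors of each class to get one prototype per class.

    Assumption: a coarse label per voxel is available to group the features.
    """
    sums, counts = {}, {}
    for feat, label in zip(features, coarse_labels):
        if label not in sums:
            sums[label] = [0.0] * len(feat)
            counts[label] = 0
        sums[label] = [s + f for s, f in zip(sums[label], feat)]
        counts[label] += 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}


def decode_once(features, queries):
    """Label each voxel with the best-matching query in one pass (no iterative refinement)."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    return [max(queries, key=lambda c: dot(feat, queries[c])) for feat in features]


# Four toy voxels with 2-D features and hypothetical coarse labels.
voxel_features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
coarse_labels = ["road", "road", "car", "car"]

queries = class_prototypes(voxel_features, coarse_labels)  # prototypes act as queries
prediction = decode_once(voxel_features, queries)
print(prediction)
```

The point of the sketch is only the control flow: queries are computed directly from the scene's own features and used once, rather than being refined over several decoder iterations.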
Main Results
nuScenes Result
| Config | Temporal | Backbone | Input Size | Pooling Method | mIoU (%) | Hugging Face | Google Drive |
|---|---|---|---|---|---|---|---|
| ProtoOcc_1key | 1 Frame | R50 | 256x704 | BEVDepth | 39.56 | link | link |
| ProtoOcc_longterm | 8 Frames | R50 | 256x704 | BEVStereo | 45.02 | link | link |
SemanticKITTI Result
| Config | Temporal | Backbone | Input Size | Pooling Method | mIoU (%) | Hugging Face | Google Drive |
|---|---|---|---|---|---|---|---|
| ProtoOcc_semanticKITTI | 1 Frame | R50 | 384x1280 | BEVDepth | 13.89 | link | link |
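For readers unfamiliar with the mIoU column, the following is a minimal sketch of how mean Intersection-over-Union can be computed from per-voxel class predictions. The function name `mean_iou` and the toy labels are hypothetical, and benchmark-specific details (such as visibility masks or ignored classes) are not modeled here; use the repository's evaluation code to reproduce the reported numbers.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over classes that appear in the prediction or the ground truth.

    Assumption of this sketch: classes absent from both are skipped.
    """
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: six voxels, three classes.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]

# Per-class IoU is 1/3, 2/3 and 1/2, so the mean is 0.5 (reported as 50% mIoU).
print(mean_iou(pred, target, num_classes=3))
```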
Training & Evaluation
Training
We trained all models using four RTX 3090 (24GB) GPUs.
CONFIG=ProtoOcc_1key # (ProtoOcc_1key / ProtoOcc_longterm / ProtoOcc_semanticKITTI)
./tools/dist_train.sh projects/configs/ProtoOcc/${CONFIG}.py 4 --work-dir ./work_dirs/${CONFIG}
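If you want to launch several configs from a script, a small helper like the one below can assemble the same command. This is only a convenience sketch: `build_train_command` is a hypothetical name, not part of the repository, and it simply mirrors the paths and arguments shown above.

```python
import shlex

# Config names listed in this README.
CONFIGS = ("ProtoOcc_1key", "ProtoOcc_longterm", "ProtoOcc_semanticKITTI")


def build_train_command(config, num_gpus=4):
    """Return the training command from this README as an argument list."""
    if config not in CONFIGS:
        raise ValueError(f"unknown config: {config!r}")
    return [
        "./tools/dist_train.sh",
        f"projects/configs/ProtoOcc/{config}.py",
        str(num_gpus),
        "--work-dir",
        f"./work_dirs/{config}",
    ]


# Print the commands; nothing is executed here.
for name in CONFIGS:
    print(shlex.join(build_train_command(name)))
```

To actually start training, the returned list could be passed to `subprocess.run(cmd, check=True)` from the repository root.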
Evaluation
Pretrained weights can be downloaded from Google Drive or Hugging Face.
To measure inference speed, uncomment # fp16 = dict(loss_scale='dynamic') in the config file.
CONFIG=ProtoOcc_1key # (ProtoOcc_1key / ProtoOcc_longterm / ProtoOcc_semanticKITTI)
bash tools/dist_test.sh ./projects/configs/ProtoOcc/${CONFIG}.py ./work_dirs/${CONFIG}/${CONFIG}.pth 1 --eval bbox
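For the inference-speed note above, the sketch below shows the config line after uncommenting, plus a generic way to time a callable and convert between FPS and per-frame latency. How this repository measures FPS internally is not stated in this README, so `measure_fps` and `latency_ms` are hypothetical helpers for illustration only.

```python
import time

# The config line from this README after removing the leading "#".
fp16 = dict(loss_scale='dynamic')


def measure_fps(run_once, num_iters=50, warmup=5):
    """Time `run_once` over `num_iters` calls (after warm-up) and return frames per second."""
    for _ in range(warmup):
        run_once()
    start = time.perf_counter()
    for _ in range(num_iters):
        run_once()
    elapsed = time.perf_counter() - start
    return num_iters / elapsed


def latency_ms(fps):
    """Convert a throughput in FPS to the average per-frame latency in milliseconds."""
    return 1000.0 / fps


# The reported 12.83 FPS corresponds to roughly 77.94 ms per frame.
print(round(latency_ms(12.83), 2))
```

When timing a GPU model this way, the callable should wait for the device to finish (for example with `torch.cuda.synchronize()`) before returning; otherwise the measured time may only cover kernel launches.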
Acknowledgement
This project builds upon several outstanding open-source projects, and we gratefully acknowledge their contributions.
BibTeX
If you find this work useful for your research or projects, please consider citing the following BibTeX entry.
@inproceedings{kim2025protoocc,
title={ProtoOcc: Accurate, Efficient 3D Occupancy Prediction Using Dual Branch Encoder-Prototype Query Decoder},
author={Kim, Jungho and Kang, Changwon and Lee, Dongyoung and Choi, Sehwan and Choi, Jun Won},
booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
volume={39},
number={4},
pages={4284--4292},
year={2025}
}