VOIC: Visible–Occluded Integrated Guidance for 3D Semantic Scene Completion
April 17, 2026
Zaidao Han, Risa Higashita, Jiang Liu
Paper
VOIC: Visible–Occluded Integrated Guidance for 3D Semantic Scene Completion https://arxiv.org/abs/2512.18954
🎬 Visualization
VOIC Result
https://github.com/user-attachments/assets/c99150af-eb98-44d1-9538-255268b58cea
Comparison Result
https://github.com/user-attachments/assets/b10e4376-74c1-4575-bf0a-b1e9039ea2eb
📢 News / Updates
- Coming Soon 🚀: The code and pre-trained models will be released immediately upon publication. Please star ⭐ this repository to receive notifications!
🏠 Abstract
Camera-based 3D Semantic Scene Completion (SSC) is a critical task for autonomous driving and robotic scene understanding, aiming to infer a complete 3D volumetric representation of both semantics and geometry from a single image.
Existing methods typically focus on end-to-end 2D-to-3D feature lifting and voxel completion, yet they often overlook the interference between high-confidence visible-region perception and low-confidence occluded-region reasoning.
To address this interference, we introduce VOIC (Visible-Occluded Interactive Completion Network). Our contributions are:
- VRLE (Visible Region Label Extraction): An offline strategy that explicitly separates and extracts voxel-level supervision for visible regions from dense 3D ground truth.
- Dual-Decoder Framework: Explicitly decouples SSC into visible-region semantic perception (Visible Decoder) and occluded-region scene completion (Occlusion Decoder).
- SOTA Performance: VOIC outperforms existing monocular SSC methods on the SemanticKITTI and SSCBench-KITTI360 benchmarks.
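The official VRLE scripts are not released yet, so the following is only a minimal sketch of what "extracting visible-region labels from dense ground truth" could look like, not the authors' implementation. It assumes a dense voxel grid stored as nested lists, a camera position given in voxel coordinates, label `0` for free space, and label `255` for voxels excluded from supervision. The function names, the fixed-step ray marching, and the choice to keep only occupied voxels are all illustrative assumptions.

```python
import math

EMPTY = 0     # assumed class id for free space
IGNORE = 255  # assumed class id for voxels excluded from supervision


def is_visible(labels, cam, target, step=0.5):
    """Approximate line-of-sight test by marching from cam to the target voxel.

    labels: dense grid indexed as labels[x][y][z] (hypothetical layout).
    cam:    camera position in voxel coordinates (hypothetical convention).
    Fixed-step marching can miss thin corners; an exact voxel traversal
    would be needed for a faithful visibility test.
    """
    nx, ny, nz = len(labels), len(labels[0]), len(labels[0][0])
    center = [t + 0.5 for t in target]
    delta = [c - o for c, o in zip(center, cam)]
    dist = math.sqrt(sum(d * d for d in delta))
    if dist == 0:
        return True
    for s in range(1, int(dist / step) + 1):
        t = s * step / dist
        idx = tuple(math.floor(o + t * d) for o, d in zip(cam, delta))
        if idx == tuple(target):
            return True  # reached the target before hitting any occupied voxel
        x, y, z = idx
        inside = 0 <= x < nx and 0 <= y < ny and 0 <= z < nz
        if inside and labels[x][y][z] != EMPTY:
            return False  # an occupied voxel blocks the ray
    return True


def extract_visible_labels(labels, cam):
    """Keep labels of occupied voxels seen from cam; mark everything else IGNORE."""
    nx, ny, nz = len(labels), len(labels[0]), len(labels[0][0])
    visible = [[[IGNORE] * nz for _ in range(ny)] for _ in range(nx)]
    for x in range(nx):
        for y in range(ny):
            for z in range(nz):
                if labels[x][y][z] != EMPTY and is_visible(labels, cam, (x, y, z)):
                    visible[x][y][z] = labels[x][y][z]
    return visible
```

In a setup like this, the resulting grid could supervise a visible-region decoder while the full dense ground truth supervises the completion branch. How VOIC actually combines the two is described in the paper.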
🏆 Results
VOIC achieves state-of-the-art performance on the SemanticKITTI hidden test set. For a detailed comparison with other methods, please refer to our full paper.
| Method | IoU (%) | mIoU (%) |
|---|---|---|
| VOIC (Ours) | 45.22 | 18.01 |
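For readers unfamiliar with the two columns, the sketch below shows how these metrics are commonly computed in SSC evaluation: IoU measures class-agnostic occupancy (occupied vs. empty), and mIoU averages per-class IoU over the semantic classes, excluding the empty class. It is an illustrative standard-library version, not the official benchmark script. The function name, the flat-list input, the `empty=0` and `ignore=255` conventions, and scoring absent classes as 0 are assumptions.

```python
def ssc_metrics(pred, gt, num_classes, empty=0, ignore=255):
    """Return (IoU, mIoU) in percent for flat per-voxel class-id sequences.

    Predictions are assumed to lie in [0, num_classes); voxels whose
    ground-truth label equals `ignore` are skipped.
    """
    tp_occ = fp_occ = fn_occ = 0
    tp = [0] * num_classes
    fp = [0] * num_classes
    fn = [0] * num_classes

    for p, g in zip(pred, gt):
        if g == ignore:
            continue
        # Geometry: occupied vs. empty, regardless of semantic class.
        p_occ, g_occ = p != empty, g != empty
        if p_occ and g_occ:
            tp_occ += 1
        elif p_occ:
            fp_occ += 1
        elif g_occ:
            fn_occ += 1
        # Semantics: per-class confusion counts.
        if p == g:
            tp[g] += 1
        else:
            fp[p] += 1
            fn[g] += 1

    iou = 100.0 * tp_occ / max(tp_occ + fp_occ + fn_occ, 1)

    per_class = []
    for c in range(num_classes):
        if c == empty:
            continue
        denom = tp[c] + fp[c] + fn[c]
        # Assumption: a class absent from both pred and gt scores 0.
        per_class.append(tp[c] / denom if denom else 0.0)
    miou = 100.0 * sum(per_class) / max(len(per_class), 1)
    return iou, miou
```

Benchmark implementations accumulate the counts over the whole split before dividing, rather than averaging per-frame scores. The same function covers that case if it is called once on the concatenated voxels.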
📅 TODO
- Release training and inference code.
- Release pre-trained models (SemanticKITTI & SSCBench-KITTI360).
- Provide VRLE preprocessing scripts.