VOIC: Visible–Occluded Integrated Guidance for 3D Semantic Scene Completion

April 17, 2026

Zaidao Han, Risa Higashita, Jiang Liu

Paper

VOIC: Visible–Occluded Integrated Guidance for 3D Semantic Scene Completion https://arxiv.org/abs/2512.18954


🎬 Visualization

VOIC Result

https://github.com/user-attachments/assets/c99150af-eb98-44d1-9538-255268b58cea

Comparison Result

https://github.com/user-attachments/assets/b10e4376-74c1-4575-bf0a-b1e9039ea2eb


📢 News / Updates

  • Coming Soon 🚀: The code and pre-trained models will be released immediately upon publication. Please star ⭐ this repository to receive notifications!

🏠 Abstract

Camera-based 3D Semantic Scene Completion (SSC) is a critical task for autonomous driving and robotic scene understanding, aiming to infer a complete 3D volumetric representation of both semantics and geometry from a single image.

Existing methods typically focus on end-to-end 2D-to-3D feature lifting and voxel completion, yet they often overlook the interference between high-confidence visible-region perception and low-confidence occluded-region reasoning.

To address this interference, we introduce VOIC (Visible-Occluded Interactive Completion Network). Our contributions are:

  • VRLE (Visible Region Label Extraction): An offline strategy that explicitly separates and extracts voxel-level supervision for visible regions from dense 3D ground truth.
  • Dual-Decoder Framework: Explicitly decouples SSC into visible-region semantic perception (Visible Decoder) and occluded-region scene completion (Occlusion Decoder).
  • SOTA Performance: VOIC outperforms existing monocular SSC methods on the SemanticKITTI and SSCBench-KITTI360 benchmarks.
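
To make the idea behind VRLE more concrete, here is a toy sketch of separating visible-region supervision from a dense label grid. It is not the authors' procedure, which is described in the paper. It assumes an orthographic camera looking along +x (a real pipeline would cast perspective rays using the camera calibration), label 0 for empty space, and label 255 for unlabeled voxels. The names `extract_visible_labels`, `EMPTY`, `IGNORE`, and `Grid` are hypothetical.

```python
from typing import List

EMPTY = 0     # assumed label for free space
IGNORE = 255  # assumed label for voxels excluded from supervision

Grid = List[List[List[int]]]  # indexed as grid[x][y][z]


def extract_visible_labels(labels: Grid) -> Grid:
    """Keep labels up to and including the first occupied voxel on each ray.

    Toy assumption: rays run along +x, one ray per (y, z) cell.
    Voxels behind the first occupied voxel are treated as occluded
    and set to IGNORE, so they contribute no visible-region supervision.
    """
    nx, ny, nz = len(labels), len(labels[0]), len(labels[0][0])
    visible = [[[IGNORE] * nz for _ in range(ny)] for _ in range(nx)]
    for y in range(ny):
        for z in range(nz):
            for x in range(nx):  # march away from the camera
                label = labels[x][y][z]
                visible[x][y][z] = label
                if label not in (EMPTY, IGNORE):
                    break  # first surface hit; the rest of the ray is occluded
    return visible
```

In this sketch, a single ray with labels `[0, 3, 0, 5]` keeps the free voxel and the first surface (class 3) and masks the two voxels behind it. The masked voxels are the ones an occlusion-side branch would have to reason about.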

🏆 Results

VOIC achieves state-of-the-art performance on the SemanticKITTI hidden test set. For a detailed comparison with other methods, please refer to our full paper.

| Method      | IoU (%) | mIoU (%) |
|-------------|---------|----------|
| VOIC (Ours) | 45.22   | 18.01    |
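
For readers new to SSC evaluation, the sketch below shows how the two reported metrics are commonly computed: IoU scores class-agnostic occupancy (geometry), and mIoU averages per-class IoU over the semantic classes. It is a minimal illustration, not the benchmark's official evaluation code. It assumes flattened label grids, label 0 for empty space, label 255 for ground-truth voxels excluded from scoring, and predictions in `[0, num_classes)`. The name `ssc_scores` is hypothetical.

```python
from typing import Sequence, Tuple

EMPTY = 0     # assumed label for free space
IGNORE = 255  # assumed ground-truth label excluded from scoring


def ssc_scores(pred: Sequence[int], gt: Sequence[int],
               num_classes: int) -> Tuple[float, float]:
    """Return (IoU, mIoU) in percent for flattened voxel label grids."""
    if len(pred) != len(gt):
        raise ValueError("pred and gt must have the same number of voxels")
    occ_inter = occ_union = 0
    tp = [0] * num_classes
    fp = [0] * num_classes
    fn = [0] * num_classes
    for p, g in zip(pred, gt):
        if g == IGNORE:
            continue  # unlabeled voxels do not count
        p_occ, g_occ = p != EMPTY, g != EMPTY
        occ_inter += int(p_occ and g_occ)
        occ_union += int(p_occ or g_occ)
        if p == g:
            tp[g] += 1
        else:
            fp[p] += 1
            fn[g] += 1
    iou = 100.0 * occ_inter / occ_union if occ_union else 0.0
    # Average over semantic classes only (class 0, empty, is excluded).
    # Simplification: classes absent from both pred and gt are skipped.
    per_class = [
        tp[c] / (tp[c] + fp[c] + fn[c])
        for c in range(1, num_classes)
        if tp[c] + fp[c] + fn[c] > 0
    ]
    miou = 100.0 * sum(per_class) / len(per_class) if per_class else 0.0
    return iou, miou
```

Because IoU ignores class identity, a voxel predicted as the wrong semantic class still counts as correct geometry. This is why IoU is typically much higher than mIoU.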

📅 TODO

  • Release training and inference code.
  • Release pre-trained models (SemanticKITTI & SSCBench-KITTI360).
  • Provide VRLE preprocessing scripts.