VoxSAMNet

March 8, 2026 · View on GitHub

Official repository for our CVPR paper on monocular 3D semantic scene completion.

Status: Code and pretrained models are currently being organized and will be released soon.

News

[2026-02] VoxSAMNet is accepted by CVPR 2026.
[2026-03] Repository initialized.
[Coming Soon] Training code and pretrained checkpoints.

Overview

VoxSAMNet is a framework for monocular 3D semantic scene completion.
It aims to reconstruct a complete voxel-wise semantic scene from a single RGB image.

Method Overview

Our method introduces key modules to improve 3D semantic scene completion performance under monocular input settings.

Qualitative Results

VoxSAMNet produces more complete and semantically consistent scene predictions compared with previous methods.

Release Plan

Repository initialization
Training code
Pretrained checkpoints
Data preprocessing instructions

Citation

Coming soon.

Contact

If you have any questions, please feel free to open an issue or contact the authors.