June 7, 2026
LegoOcc: Monocular Open Vocabulary Occupancy Prediction for Indoor Scenes
CVPR 2026 Oral
Changqing Zhou¹, Yueru Luo², Han Zhang¹, Zeyu Jiang¹, Changhao Chen¹ ✉
¹ The Hong Kong University of Science and Technology (Guangzhou)
² The Chinese University of Hong Kong, Shenzhen
✉ Corresponding author.
LegoOcc tackles monocular open-vocabulary 3D semantic occupancy prediction in large-scale indoor scenes under geometry-only supervision. It represents scenes as Language-Embedded Gaussians, introduces an opacity-aware Poisson Gaussian-to-Occupancy operator for stable volumetric aggregation, and adopts Progressive Temperature Decay to strengthen Gaussian-language alignment during training.
The framework is designed for open-vocabulary indoor occupancy reasoning from monocular observations, bridging sparse Gaussian scene modeling and language-aware 3D occupancy prediction.
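To make the two training-side ideas above more concrete, here is a minimal, dependency-free sketch. It is not the LegoOcc implementation: the function names, the isotropic Gaussian falloff, the `1 - exp(-rate)` Poisson-style occupancy, and the exponential temperature schedule are all illustrative assumptions about what an opacity-aware Gaussian-to-occupancy operator and a decaying softmax temperature could look like. Refer to the released code for the actual formulation.

```python
import math


def gaussian_density(point, mean, scale):
    """Unnormalized isotropic Gaussian falloff in (0, 1] (assumed form)."""
    sq_dist = sum((p - m) ** 2 for p, m in zip(point, mean))
    return math.exp(-0.5 * sq_dist / (scale ** 2))


def poisson_occupancy(point, gaussians):
    """Occupancy probability at `point` from opacity-weighted Gaussians.

    Assumption: the summed, opacity-weighted density acts as a Poisson rate,
    so the probability of at least one "hit" is 1 - exp(-rate).
    """
    rate = sum(
        g["opacity"] * gaussian_density(point, g["mean"], g["scale"])
        for g in gaussians
    )
    return 1.0 - math.exp(-rate)


def decayed_temperature(step, total_steps, t_start=1.0, t_end=0.05):
    """Hypothetical schedule: interpolate geometrically from t_start to t_end."""
    frac = min(max(step / total_steps, 0.0), 1.0)
    return t_start * (t_end / t_start) ** frac


def softmax_with_temperature(similarities, temperature):
    """Softmax over similarity scores; a lower temperature gives sharper weights."""
    scaled = [s / temperature for s in similarities]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Toy scene: one fairly opaque Gaussian at the origin, one faint one nearby.
gaussians = [
    {"mean": (0.0, 0.0, 0.0), "scale": 0.5, "opacity": 0.9},
    {"mean": (2.0, 0.0, 0.0), "scale": 0.5, "opacity": 0.2},
]
occ_near = poisson_occupancy((0.0, 0.0, 0.0), gaussians)
occ_far = poisson_occupancy((10.0, 10.0, 10.0), gaussians)

# Gaussian-to-text similarity scores sharpen as the temperature decays.
sims = [0.8, 0.6, 0.1]
early = softmax_with_temperature(sims, decayed_temperature(0, 100))
late = softmax_with_temperature(sims, decayed_temperature(100, 100))
```

In this toy setup the occupancy stays in [0, 1) however many Gaussians overlap a voxel, and the class weights move from a soft distribution early in training toward a near one-hot assignment late in training.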
News
- [2026.05] :rocket: Code is released.
- [2026.04] :microphone: LegoOcc was accepted to CVPR 2026 (Oral).
Documentation
For setup and usage details, please refer to the documents under docs/:
- docs/install.md: environment setup, dependency installation, and pretrained component preparation.
- docs/data.md: dataset preparation for OccScanNet, folder structure, and symbolic link setup.
- docs/train_eval.md: training workflow, configuration notes, and runtime environment variables.
Getting Started
- Follow docs/install.md to prepare the environment.
- Follow docs/data.md to organize the dataset.
- Follow docs/train_eval.md to launch training.
Citation
If you find this work useful, please consider citing:
@inproceedings{zhou2026monocular,
title={Monocular open vocabulary occupancy prediction for indoor scenes},
author={Zhou, Changqing and Luo, Yueru and Zhang, Han and Jiang, Zeyu and Chen, Changhao},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={21627--21637},
year={2026}
}
Related Projects
We recommend checking out the following related projects: