
June 7, 2026

GPOcc: Generalizing Visual Geometry Priors to Sparse Gaussian Occupancy Prediction

CVPR 2026 arXiv

Changqing Zhou1, Yueru Luo2, Changhao Chen1 ✉

1The Hong Kong University of Science and Technology (Guangzhou)
2The Chinese University of Hong Kong, Shenzhen

✉ Corresponding author.

Project Page | Paper

GPOcc leverages generalizable visual geometry priors, such as VGGT, and represents volumetric evidence as sparse 3D Gaussians for efficient monocular 3D occupancy prediction. It further supports streaming embodied perception with an incremental fusion strategy for online scene understanding.

News

  • [2026.05] Code is released.
  • [2026.02] :tada: GPOcc was accepted to CVPR 2026.

Overview

GPOcc generalizes visual geometry priors to sparse Gaussian occupancy prediction. The core idea is to lift monocular observations into sparse 3D Gaussian scene elements and aggregate them into an occupancy-aware scene representation for downstream prediction.
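To make the aggregation step concrete, the sketch below splats a handful of 3D Gaussians onto a voxel grid and thresholds the accumulated density into binary occupancy. It is an illustration only, not the GPOcc implementation: it assumes isotropic Gaussians, a dense NumPy loop, and a fixed threshold, and the function name `gaussians_to_occupancy` and all of its parameters are hypothetical.

```python
import numpy as np


def gaussians_to_occupancy(means, scales, opacities, grid_min, voxel_size,
                           grid_shape, threshold=0.5):
    """Accumulate isotropic 3D Gaussians on a voxel grid and threshold.

    Hypothetical helper for illustration; not part of the GPOcc codebase.
    """
    # Integer voxel indices, shape (X, Y, Z, 3).
    idx = np.stack(
        np.meshgrid(*[np.arange(n) for n in grid_shape], indexing="ij"),
        axis=-1,
    )
    # World-space coordinates of each voxel centre.
    centers = np.asarray(grid_min, dtype=float) + (idx + 0.5) * voxel_size

    density = np.zeros(grid_shape, dtype=float)
    for mu, sigma, alpha in zip(means, scales, opacities):
        # Squared distance from every voxel centre to this Gaussian's mean.
        sq_dist = np.sum((centers - np.asarray(mu, dtype=float)) ** 2, axis=-1)
        # Opacity-weighted, unnormalized Gaussian contribution.
        density += alpha * np.exp(-0.5 * sq_dist / sigma ** 2)

    # A voxel counts as occupied when the summed density reaches the threshold.
    return density >= threshold


# One Gaussian in the middle of a 1 m cube split into 4 x 4 x 4 voxels.
occ = gaussians_to_occupancy(
    means=np.array([[0.5, 0.5, 0.5]]),
    scales=[0.2],
    opacities=[1.0],
    grid_min=(0.0, 0.0, 0.0),
    voxel_size=0.25,
    grid_shape=(4, 4, 4),
)
print(occ.shape, int(occ.sum()))
```

Because only a small number of Gaussians carry the scene content, the cost of this step scales with the number of Gaussians rather than with a dense feature volume; a practical version would restrict each Gaussian to its neighbouring voxels instead of visiting the whole grid.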

Getting Started

  1. Follow docs/install.md to prepare the environment.
  2. Follow docs/data.md to organize datasets.
  3. Follow docs/train_eval.md to launch training and evaluation.

Demos

Citation

If you find this work useful, please consider citing:

@inproceedings{zhou2026generalizing,
  title={Generalizing Visual Geometry Priors to Sparse Gaussian Occupancy Prediction},
  author={Zhou, Changqing and Luo, Yueru and Chen, Changhao},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={28578--28587},
  year={2026}
}

We recommend checking out the following related projects: