TT-Occ: Test-Time Compute for Self-Supervised Occupancy

June 1, 2026

Under Construction

Why train a dedicated occupancy network in the era of foundation VLMs? We show that a test-time occupancy framework integrated with a combination of VLMs can achieve SOTA performance without any network training or fine-tuning.

TT-Occ easily integrates advanced VLMs through a simple adapter interface. Contributions are warmly welcomed!

📥 Data Preparation

Download the nuScenes dataset from nuScenes.org.

Extract nuScenes data into a readable format:

python extract_nuscenes.py  # Ensure you set your dataset path inside this script.

Download ground-truth occupancy labels:
- Occ3D-nuScenes GT: Google Drive
- nuCraft GT: nuCraft GitHub
Update data_root of all scripts in ``.

🌱 Environment Setup

For the external repositories used in this project, we provide minimal versions of their codebases under the submodules directory (with their original licenses retained; please respect and comply with their terms). These have been packaged under a unified conda environment, so you do not need to clone each dependency separately.

To reproduce our environment reliably, use:

conda env create -f environment.yaml
conda activate ttocc
bash submodules/install_and_download.sh

install_and_download.sh (run from the repo root, with ttocc activated) includes all install and checkpoint download steps required by TT-Occ (OpenSeeD / Rex-Omni / VGGT / MapAnything / RAFT / 3DGS CUDA extensions).

✅ Currently Supported Online Models

TT-Occ currently supports the following online test-time providers:

Semantic models: openseed, rexomni
Depth models: vggt, mapanything
Dynamic mask model: raft

Selection is controlled by environment variables in run_main.sh:

SEMANTIC_PREFIX (openseed by default)
DEPTH_PREFIX (vggt by default)
DYNAMIC_MASK_PREFIX (raft by default)
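
As an illustration of how these selectors might be overridden from a wrapper script, here is a minimal Python sketch. It assumes run_main.sh reads the three variables from the calling environment rather than overwriting them; if the script hard-codes them, edit the script directly instead. The helper name build_env is hypothetical and not part of TT-Occ.

```python
import os

# Provider names copied from the lists above.
SUPPORTED = {
    "SEMANTIC_PREFIX": ("openseed", "rexomni"),
    "DEPTH_PREFIX": ("vggt", "mapanything"),
    "DYNAMIC_MASK_PREFIX": ("raft",),
}


def build_env(overrides, base=None):
    """Return a copy of the environment with validated provider overrides.

    Hypothetical helper: it only checks names against the lists above.
    """
    env = dict(os.environ if base is None else base)
    for name, value in overrides.items():
        if name not in SUPPORTED:
            raise KeyError(f"unknown selector: {name}")
        if value not in SUPPORTED[name]:
            raise ValueError(f"{name}={value!r} is not one of {SUPPORTED[name]}")
        env[name] = value
    return env


# Select Rex-Omni for semantics and MapAnything for depth; RAFT stays the default.
env = build_env(
    {"SEMANTIC_PREFIX": "rexomni", "DEPTH_PREFIX": "mapanything"}, base={}
)
print(env)

# To launch from the repo root with ttocc activated (assumption, see above):
#   import subprocess
#   subprocess.run(["bash", "run_main.sh"], env=build_env({...}), check=True)
```

Validating names before launching avoids starting a long run with a misspelled provider.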

🚀 Running TT-Occ

Evaluate on the complete 150-scene test split:

conda activate ttocc
bash run_main.sh

By default run_main.sh enables mIoU (EVAL_OCC=1) and writes per-scene result.json under out-main-Occ3D/<variant>/<scene>/.
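
For a quick look at the per-scene outputs, the following sketch gathers every result.json under one variant directory. The directory layout comes from the description above; the schema of result.json is not documented here, so the field name passed to mean_of is purely hypothetical. Note also that averaging per-scene scores is generally not the same as a dataset-level mIoU; summarymiou.py remains the reference aggregation.

```python
import json
from pathlib import Path


def collect_results(root, variant):
    """Map scene name -> parsed result.json for one variant directory."""
    results = {}
    for path in sorted(Path(root, variant).glob("*/result.json")):
        with path.open() as f:
            results[path.parent.name] = json.load(f)
    return results


def mean_of(results, key):
    """Average a numeric field over the scenes that report it.

    `key` is a hypothetical field name; inspect a real result.json first.
    Returns None when no scene contains the field.
    """
    values = [r[key] for r in results.values() if key in r]
    return sum(values) / len(values) if values else None
```

A call such as collect_results("out-main-Occ3D", "<variant>") returns a dictionary keyed by scene directory name, where <variant> is whichever variant folder your run produced.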

Manual flags for train.py:

--use_fusion: enable E-style semantic/radiometric fusion (off by default, which is D-equivalent)
--eval_occ --occ3d_path ... --nucraft_path ...: evaluate mIoU against the saved Occ/*.pth files
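
For readers unfamiliar with the metric behind --eval_occ, here is a generic sketch of mean intersection-over-union on flattened voxel labels. It is not the evaluation code used by train.py: the Occ3D and nuCraft protocols define their own class sets and visibility masks, and the ignore_label parameter here is an illustrative assumption.

```python
def mean_iou(pred, gt, num_classes, ignore_label=None):
    """Mean IoU over the classes that appear in either pred or gt.

    pred, gt: equal-length sequences of integer class labels, one per voxel.
    Voxels whose ground-truth label equals ignore_label are skipped.
    """
    inter = [0] * num_classes
    union = [0] * num_classes
    for p, g in zip(pred, gt):
        if g == ignore_label:
            continue
        if p == g:
            # Correct voxel: counts once in both intersection and union.
            inter[g] += 1
            union[g] += 1
        else:
            # Wrong voxel: enlarges the union of both classes involved.
            union[p] += 1
            union[g] += 1
    ious = [i / u for i, u in zip(inter, union) if u > 0]
    return sum(ious) / len(ious) if ious else float("nan")


pred = [0, 0, 1, 1, 2, 2]
gt = [0, 1, 1, 1, 2, 0]
# Per-class IoU: class 0 = 1/3, class 1 = 2/3, class 2 = 1/2.
print(mean_iou(pred, gt, num_classes=3))
```

In this toy example the three per-class scores average to 0.5.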

Offline aggregation (same logic as train): python summarymiou.py, python summary.py.

🎨 Visualization

We provide a simple occupancy visualizer based on Open3D. To visualize occupancy predictions, run:

python vis.py  # Make sure the dataset path is correctly set.

Example visualization outputs:

TT-OccLiDAR:
TT-OccCamera:

For advanced visualization commands, refer to custom_utils/VoxelGridVisualizer.

📌 Acknowledgements

This project builds upon the excellent codebase of 3DGS and powerful VLMs including OpenSeeD, Rex-Omni, VGGT, MapAnything, RAFT, and TT-Occ. We deeply appreciate their creators' efforts and your interest in TT-Occ!

📖 Citation

If you find this work helpful, please star our repo and cite the paper:

@InProceedings{ttocc,
    author    = {Zhang, Fengyi and Sun, Xiangyu and Yang, Huitong and Zhang, Zheng and Huang, Zi and Luo, Yadan},
    title     = {Test-Time 3D Occupancy Prediction},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2026},
    pages     = {35691-35701}
}