TT-Occ: Test-Time Compute for Self-Supervised Occupancy

June 1, 2026

Under Construction

Why train a dedicated occupancy network in the era of foundation VLMs? We show that a test-time occupancy framework integrated with a combination of VLMs can achieve state-of-the-art performance without any network training or fine-tuning.

TT-Occ easily integrates advanced VLMs through a simple adapter interface. Contributions are warmly welcomed!

📥 Data Preparation

  1. Download the nuScenes dataset from nuScenes.org.
  2. Extract nuScenes data into a readable format:
    python extract_nuscenes.py  # Ensure you set your dataset path inside this script.
    
  3. Download ground-truth occupancy labels:
  4. Update data_root of all scripts in ``.
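Before launching the pipeline, it can help to confirm that the path you plan to use as data_root actually looks like a nuScenes download. The sketch below is a minimal check, assuming the standard nuScenes layout (samples/, sweeps/, maps/, and a v1.0-* metadata folder). The helper name check_nuscenes_root is hypothetical and not part of TT-Occ, and the output of extract_nuscenes.py may use a different layout.

```python
from pathlib import Path

# Top-level folders found in a standard nuScenes download.
EXPECTED_DIRS = ("samples", "sweeps", "maps")


def check_nuscenes_root(data_root):
    """Return a list of problems found under data_root (empty if it looks fine).

    Hypothetical helper for illustration only; TT-Occ's own scripts may
    expect a different (extracted) layout.
    """
    root = Path(data_root)
    if not root.is_dir():
        return [f"{root} is not a directory"]
    problems = [
        f"missing folder: {name}"
        for name in EXPECTED_DIRS
        if not (root / name).is_dir()
    ]
    # Metadata lives in a version folder such as v1.0-trainval or v1.0-mini.
    if not any(p.is_dir() for p in root.glob("v1.0-*")):
        problems.append("missing metadata folder: v1.0-*")
    return problems
```

An empty list means the basic layout is present; anything else names what is missing so the path can be fixed before a long run starts.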

🌱 Environment Setup

For the external repositories used in this project, we provide minimal versions of their codebases under the submodules directory (with their original licenses retained; please respect and comply with their terms). These have been packaged under a unified conda environment, so you do not need to clone each dependency separately.

To reproduce our environment reliably, use:

conda env create -f environment.yaml
conda activate ttocc
bash submodules/install_and_download.sh

install_and_download.sh (run from the repo root, with ttocc activated) includes all install and checkpoint download steps required by TT-Occ (OpenSeeD / Rex-Omni / VGGT / MapAnything / RAFT / 3DGS CUDA extensions).

✅ Currently Supported Online Models

TT-Occ currently supports the following online test-time providers:

  • Semantic models: openseed, rexomni
  • Depth models: vggt, mapanything
  • Dynamic mask model: raft

Selection is controlled by environment variables in run_main.sh:

  • SEMANTIC_PREFIX (openseed by default)
  • DEPTH_PREFIX (vggt by default)
  • DYNAMIC_MASK_PREFIX (raft by default)
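The selection logic above can be pictured as a small lookup with defaults and validation. The following is a sketch only: the variable names, provider names, and defaults come from the lists above, while resolve_providers and the two tables are hypothetical and do not describe how run_main.sh or train.py actually parse these values.

```python
import os

# Providers listed in this README, keyed by the variable that selects them.
SUPPORTED = {
    "SEMANTIC_PREFIX": ("openseed", "rexomni"),
    "DEPTH_PREFIX": ("vggt", "mapanything"),
    "DYNAMIC_MASK_PREFIX": ("raft",),
}
# Defaults as stated for run_main.sh.
DEFAULTS = {
    "SEMANTIC_PREFIX": "openseed",
    "DEPTH_PREFIX": "vggt",
    "DYNAMIC_MASK_PREFIX": "raft",
}


def resolve_providers(env=None):
    """Map each selection variable to a provider name, validating the choice.

    Hypothetical helper: `env` is any mapping; os.environ is used if omitted.
    """
    env = os.environ if env is None else env
    chosen = {}
    for var, options in SUPPORTED.items():
        value = env.get(var) or DEFAULTS[var]  # unset or empty falls back
        if value not in options:
            raise ValueError(f"{var}={value!r} is not one of {options}")
        chosen[var] = value
    return chosen
```

Failing early on an unknown provider name is a deliberate choice here: a typo is cheaper to catch before any checkpoint is loaded than partway through a 150-scene run.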

🚀 Running TT-Occ

Evaluate on the complete 150-scene test split:

conda activate ttocc
bash run_main.sh

By default, run_main.sh enables mIoU evaluation (EVAL_OCC=1) and writes a per-scene result.json under out-main-Occ3D/<variant>/<scene>/.
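To inspect those per-scene files after a run, a short loader along the following lines may be convenient. It is a sketch that assumes only the directory pattern stated above; collect_results is a hypothetical name, and the contents of result.json are returned as-is because the schema is not documented here.

```python
import json
from pathlib import Path


def collect_results(out_root="out-main-Occ3D"):
    """Load every <variant>/<scene>/result.json under out_root.

    Returns {variant: {scene: parsed_json}}. The JSON payload is not
    interpreted, since its keys are not specified in this README.
    """
    results = {}
    for path in sorted(Path(out_root).glob("*/*/result.json")):
        scene_dir = path.parent
        variant = scene_dir.parent.name
        with path.open() as f:
            results.setdefault(variant, {})[scene_dir.name] = json.load(f)
    return results
```

Counting the scenes found per variant is a quick way to spot a run that stopped early.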

Manual flags for train.py:

  • --use_fusion: enable E-style semantic/radiometric fusion (off by default, which is D-equivalent)
  • --eval_occ --occ3d_path ... --nucraft_path ...: evaluate mIoU against the saved Occ/*.pth
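For readers unfamiliar with the metric, mIoU is the per-class intersection-over-union averaged over classes. The sketch below computes it for a single pair of label sequences in plain Python. It assumes voxel labels have already been flattened to integer class ids; mean_iou and its ignore parameter are hypothetical, and TT-Occ's evaluation may additionally apply visibility masks or accumulate counts across scenes before dividing.

```python
def mean_iou(pred, gt, num_classes, ignore=()):
    """Mean IoU over classes from two equal-length sequences of voxel labels.

    Classes listed in `ignore`, and classes absent from both pred and gt,
    are left out of the mean. Returns NaN if no class remains.
    """
    if len(pred) != len(gt):
        raise ValueError("pred and gt must have the same number of voxels")
    inter = [0] * num_classes
    union = [0] * num_classes
    for p, g in zip(pred, gt):
        if p == g:
            # Correct voxel: counts once in both intersection and union.
            inter[p] += 1
            union[p] += 1
        else:
            # Wrong voxel: enlarges the union of both classes involved.
            union[p] += 1
            union[g] += 1
    ious = [
        inter[c] / union[c]
        for c in range(num_classes)
        if c not in ignore and union[c] > 0
    ]
    return sum(ious) / len(ious) if ious else float("nan")
```

The ignore argument lets a caller exclude a class such as free space, if the benchmark convention in use calls for that.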

Offline aggregation (same logic as in train.py): python summarymiou.py, python summary.py.

🎨 Visualization

We provide a simple occupancy visualizer based on Open3D. To visualize occupancy predictions, run:

python vis.py  # Make sure the dataset path is correctly set.

Example visualization outputs:

  • TT-OccLiDAR

  • TT-OccCamera

For advanced visualization commands, refer to custom_utils/VoxelGridVisualizer.

📌 Acknowledgements

This project builds upon the excellent codebase of 3DGS and powerful VLMs including OpenSeeD, Rex-Omni, VGGT, MapAnything, RAFT, and TT-Occ. We deeply appreciate their creators' efforts and your interest in TT-Occ!

📖 Citation

If you find this work helpful, please star our repo and cite the paper:

@InProceedings{ttocc,
    author    = {Zhang, Fengyi and Sun, Xiangyu and Yang, Huitong and Zhang, Zheng and Huang, Zi and Luo, Yadan},
    title     = {Test-Time 3D Occupancy Prediction},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2026},
    pages     = {35691-35701}
}