OVGGT: O(1) Constant-Cost Streaming Visual Geometry Transformer
March 9, 2026
Si-Yu Lu1, Po-Ting Chen2, Hui-Che Hsu2, Sin-Ye Jhong2, Wen-Huang Cheng1, Yung-Yao Chen2
1National Taiwan University 2National Taiwan University of Science and Technology
TL;DR: OVGGT is a training-free framework enabling streaming 3D reconstruction from arbitrarily long video with constant memory and compute — achieving O(1) per-frame cost while surpassing full-cache baselines in accuracy.
Left: Quantitative comparison on 7-Scenes across 200 frames. Right: Qualitative 3D reconstructions demonstrating OVGGT's stability over long sequences (50–500 frames).
Overview
OVGGT is a training-free framework that enables streaming 3D reconstruction from arbitrarily long video with constant memory and compute. It combines Self-Selective Caching (SSC) for zero-overhead KV cache compression via FFN residual magnitudes, and Dynamic Anchor Protection (DAP) to shield geometrically critical tokens from eviction, suppressing coordinate drift over long sequences. OVGGT is fully compatible with FlashAttention and processes videos within a fixed VRAM envelope while surpassing full-cache baselines in accuracy.
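To make the idea of a fixed cache budget concrete, here is a minimal toy sketch in plain Python. It is not the OVGGT implementation. It assumes that each cached token carries a single FFN residual vector, that a larger residual magnitude means the token is more worth keeping, and that anchors are simply flagged entries that are never evicted. The names `evict`, `stream`, and `residual_magnitude`, the random residuals, and the toy anchor rule are all hypothetical; the real method operates on transformer KV caches.

```python
import math
import random


def residual_magnitude(residual):
    """L2 norm of a token's (hypothetical) FFN residual vector."""
    return math.sqrt(sum(x * x for x in residual))


def evict(cache, budget):
    """Shrink the cache to at most `budget` entries.

    Entries flagged as anchors always survive (a stand-in for DAP).
    The remaining slots go to the non-anchor entries with the largest
    residual magnitude (a stand-in for SSC; the ranking direction is
    an assumption).
    """
    anchors = [e for e in cache if e["anchor"]]
    others = [e for e in cache if not e["anchor"]]
    if len(anchors) > budget:
        raise ValueError("budget is smaller than the number of anchors")
    others.sort(key=lambda e: residual_magnitude(e["residual"]), reverse=True)
    kept = anchors + others[: budget - len(anchors)]
    kept.sort(key=lambda e: e["pos"])  # restore temporal order
    return kept


def stream(num_frames, tokens_per_frame, budget, seed=0):
    """Feed synthetic frames through the cache and record its size per frame."""
    rng = random.Random(seed)
    cache, sizes, pos = [], [], 0
    for frame in range(num_frames):
        for tok in range(tokens_per_frame):
            cache.append({
                "pos": pos,
                "residual": [rng.gauss(0.0, 1.0) for _ in range(4)],
                # Toy rule: protect the first token of the first frame.
                "anchor": frame == 0 and tok == 0,
            })
            pos += 1
        cache = evict(cache, budget)
        sizes.append(len(cache))
    return cache, sizes
```

Calling `stream(50, 8, 16)` keeps the cache at no more than 16 entries however many frames arrive, which is the sense in which per-frame memory stays constant, and the flagged anchor remains in the cache throughout.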
⚙️ Installation
- Clone OVGGT
git clone https://github.com/<your-username>/OVGGT.git
cd OVGGT
- Create conda environment
conda create -n OVGGT python=3.11 cmake=3.14.0
conda activate OVGGT
- Install requirements
pip install -r requirements.txt
conda install 'llvm-openmp<16'
Download Checkpoints
Please download the StreamVGGT checkpoint from Hugging Face or Tsinghua Cloud.
Evaluation
The evaluation code follows MonST3R, CUT3R, TTT3R, StreamVGGT and InfiniteVGGT.
cd src/
Multi-view Reconstruction
bash eval/mv_recon/run.sh
Results will be saved in eval_results/mv_recon/${model_name}_${ckpt_name}/logs_all.txt.
Video Depth
bash eval/video_depth/run.sh
Results will be saved in eval_results/video_depth/${data}_${model_name}/result_scale.json.
Pose Evaluation
bash eval/pose_evaluation/run.sh
Results will be saved in eval_results/pose_evaluation/${data}_${model_name}/_error_log.txt.
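As a convenience, the three result locations above can be assembled with a small helper. This is only a sketch: `result_paths` is a hypothetical function that is not part of the repository, and the example values passed to it are placeholders. The paths are relative to `src/`, since the evaluation scripts are run from there.

```python
from pathlib import Path


def result_paths(model_name, ckpt_name, data, root="eval_results"):
    """Build the result file paths described in the Evaluation section."""
    root = Path(root)
    return {
        "mv_recon": root / "mv_recon" / f"{model_name}_{ckpt_name}" / "logs_all.txt",
        "video_depth": root / "video_depth" / f"{data}_{model_name}" / "result_scale.json",
        "pose_evaluation": root / "pose_evaluation" / f"{data}_{model_name}" / "_error_log.txt",
    }


# Placeholder names for illustration only.
for task, path in result_paths("my_model", "my_ckpt", "my_dataset").items():
    print(f"{task}: {path} (exists: {path.exists()})")
```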
🚀 Quick Start
Viser Demo (Interactive 3D Visualization)
We provide an interactive Viser demo for OVGGT, based on the demo code from InfiniteVGGT. Launch it with:
python demo_viser.py \
--seq_path path/to/nrgbd/image_sequence \
--frame_interval 10 \
--gt_path path/to/nrgbd/gt_camera  # optional
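To preview which frames a given interval would select, the sketch below assumes that `--frame_interval 10` means "use every 10th image of the sequence". That reading is an assumption, not something taken from `demo_viser.py`. The helper `subsample_frames` and the set of accepted image suffixes are hypothetical, and sorting by file name only gives temporal order if the names are zero-padded.

```python
from pathlib import Path

# Assumed image extensions; adjust to match the dataset.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def subsample_frames(seq_path, frame_interval):
    """Return every `frame_interval`-th image in `seq_path`, sorted by name."""
    if frame_interval < 1:
        raise ValueError("frame_interval must be >= 1")
    frames = sorted(
        p for p in Path(seq_path).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
    return frames[::frame_interval]
```

For a folder of 25 images and an interval of 10, this selects the 1st, 11th, and 21st images in sorted order.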
Gradio Demo (Web UI)
We also provide a Gradio web demo for OVGGT, based on the demo code from VGGT. Launch it with:
pip install -r requirements_demo.txt
python demo_gradio.py
🙏 Acknowledgements
Our code is based on the following brilliant repositories:
DUSt3R, MonST3R, Spann3R, CUT3R, VGGT, Point3R, StreamVGGT, TTT3R, Evict3R, InfiniteVGGT
Many thanks to these authors!
📝 Citation
@article{lu2026ovggt,
title={OVGGT: O(1) Constant-Cost Streaming Visual Geometry Transformer},
author={Si-Yu Lu and Po-Ting Chen and Hui-Che Hsu and Sin-Ye Jhong and Wen-Huang Cheng and Yung-Yao Chen},
journal={arXiv preprint arXiv:2603.05959},
year={2026}
}