ViPE: Video Pose Engine for Geometric 3D Perception

June 9, 2026


TL;DR: ViPE is a useful open-source spatial AI tool for annotating camera poses and dense depth maps from raw videos!

ViPE estimates camera intrinsics, camera motion, and dense near-metric depth maps from unconstrained raw videos, including pinhole, wide-angle, and 360-degree panorama footage.
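To illustrate how these three outputs fit together, here is a minimal sketch that lifts a depth map into world-space 3D points using pinhole intrinsics and a camera pose. It is not ViPE's API: the function name `unproject_depth` and its argument layout are hypothetical, and it assumes the depth is measured along the camera z-axis and the pose is a 4x4 camera-to-world transform. It covers the pinhole case only; wide-angle and panorama footage would need a different camera model.

```python
import numpy as np


def unproject_depth(depth, K, cam_to_world):
    """Lift a depth map to world-space points with a pinhole model.

    Hypothetical helper for illustration, not part of ViPE.
    depth:        (H, W) depth along the camera z-axis (assumed convention)
    K:            (3, 3) pinhole intrinsics matrix
    cam_to_world: (4, 4) rigid camera-to-world transform (assumed convention)
    Returns an (H * W, 3) array of points in row-major pixel order.
    """
    h, w = depth.shape
    # Pixel grid: u indexes columns, v indexes rows; both have shape (H, W).
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
    # Back-project each pixel along its ray, scaled by its depth.
    x = (u - cx) / fx * depth
    y = (v - cy) / fy * depth
    pts_cam = np.stack([x, y, depth], axis=-1).reshape(-1, 3)
    # Apply the rigid transform: rotate, then translate.
    R, t = cam_to_world[:3, :3], cam_to_world[:3, 3]
    return pts_cam @ R.T + t
```

Calling this once per frame with that frame's depth map and pose, and concatenating the results, would give a point cloud in a shared world frame, provided the depth scale is consistent across frames.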

News

2026/06: 🚀🚀🚀 Released ViPE 1.2.0: 2.7x speed-up with no loss of accuracy, enabled by CUDA fused kernels, model and pipeline caching, prefetching, and other optimizations.
2026/05: Merged the panorama estimation pipeline and bumped the release version to 1.0.0.
2026/01: Integrated Depth-Anything 3 for depth estimation (use the dav3 pipeline).
2025/10: Added support for wide-angle videos.
2025/09: Added support for running the Lyra pipeline.
2025/08: Initial release of ViPE.

This project will download and install additional third-party models and software. Note that these models and software are not distributed by NVIDIA. Review their license terms before use. This source code, except for the Unik3D part (which is under the CC BY-NC-SA 4.0 license), is released under the Apache 2.0 License.