ViewSRD-3D-Visual-Grounding-via-Structured-Multi-View-Decomposition
July 27, 2025
ViewSRD: 3D Visual Grounding via Structured Multi-View Decomposition
Ever tried telling a robot "the bookshelf left of the sofa but behind the lamp"?
Existing models might panic. We fix that with ViewSRD, a framework that decodes complex spatial descriptions in 3D scenes like magic ✨