3D LLM

March 23, 2026

2026
Title & Authors & Links | Areas | Tags
Preprint GitHub
OVGGT: O(1) Constant-Cost Streaming Visual Geometry Transformer
Si-Yu Lu, Po-Ting Chen, Hui-Che Hsu, Sin-Ye Jhong, Wen-Huang Cheng, Yung-Yao Chen
Area
Note
Cost Level
Op Mech
Target
PDF Preprint GitHub
HCC-3D: Hierarchical Compensatory Compression for 98% 3D Token Reduction in Vision-Language Models
Liheng Zhang, Jin Wang, Hui Li, Bingfeng Zhang, Weifeng Liu
Area
Cost Level
Op Target
2025
Title & Authors & Links | Areas | Tags
PDF Preprint GitHub
Prune2Drive: A Plug-and-Play Framework for Accelerating Vision-Language Models in Autonomous Driving
Minhao Xiong, Zichen Wen, Zhuangcheng Gu, Xuyang Liu, Rui Zhang, Hengrui Kang, Jiabing Yang, Junyuan Zhang, Weijia Li, Conghui He, Yafei Wang, Linfeng Zhang
Area
Cost Level
Op Mech
Target
PDF Preprint GitHub
Fast3D: Accelerating 3D Multi-modal Large Language Models for Efficient 3D Scene Understanding
Wencan Huang, Daizong Liu, Wei Hu
Area
Cost Level
Op Mech
Target
PDF
Zero-shot 3D Question Answering via Voxel-based Dynamic Token Compression
Hsiang-Wei Huang, Fu-Chen Chen, Wenhao Chai, Che-Chun Su, Lu Xia, Sanghun Jung, Cheng-Yen Yang, Jenq-Neng Hwang, Min Sun, Cheng-Hao Kuo
Area
Cost Level
Op Mech
Target
PDF Preprint GitHub
AdaToken-3D: Dynamic Spatial Gating for Efficient 3D Large Multimodal-Models Reasoning
Kai Zhang, Xingyu Chen, Xiaofeng Zhang
Area
Cost Level
Op Mech
Target
PDF Preprint GitHub
3UR-LLM: An End-to-End Multimodal Large Language Model for 3D Scene Understanding
Haomiao Xiong, Yunzhi Zhuge, Jiawen Zhu, Lu Zhang, Huchuan Lu
Area
Cost Level
Op Target