October 20, 2025

ShapeLLM-Omni: A Native Multimodal LLM for 3D Generation and Understanding

Junliang Ye¹,²*, Zhengyi Wang¹,²*, Ruowen Zhao¹*, Shenghao Xie³, Jun Zhu¹,²†
*Equal contribution.
†Corresponding author.
¹Tsinghua University, ²ShengShu, ³Peking University

NeurIPS 2025 Spotlight 🔥

https://github.com/user-attachments/assets/f77bb981-15ef-4546-ae1a-9baf05dc8002

Release

[6/03] 🔥🔥 We released the pretrained weights for both ShapeLLM-Omni (7B) and 3DVQVAE.
[6/03] 🔥🔥 We released 50k high-quality 3D editing data pairs.
[6/07] 🔥🔥 We built a demo for everyone to try out.

Installation

Set up the Python environment by following the TRELLIS and Qwen2.5-VL installation instructions, or install the dependencies directly:

pip install -r requirements.txt
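After installing, it can help to confirm that the packages listed in requirements.txt are actually present in the active environment. The following is a minimal sketch, not part of this repository: the helper names `parse_requirement_names` and `missing_packages` are hypothetical, and the parser assumes a simple requirements file (one package per line, optional version specifiers and comments). It uses only the Python standard library.

```python
from importlib import metadata
from pathlib import Path
import re


def parse_requirement_names(text):
    """Return the distribution names found in requirements-style text.

    Assumption: one requirement per line. Comments, blank lines,
    pip options (lines starting with "-") and bare URLs are skipped.
    """
    names = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("-"):
            continue
        if "://" in line.split()[0]:  # bare VCS/URL requirement, no name to check
            continue
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.append(match.group(0))
    return names


def missing_packages(names):
    """Return the subset of names that are not installed in this environment."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    req_file = Path("requirements.txt")
    if req_file.exists():
        absent = missing_packages(parse_requirement_names(req_file.read_text()))
        print("Missing packages:", ", ".join(absent) if absent else "none")
    else:
        print("requirements.txt not found in the current directory")
```

Run it from the repository root. This only checks that each package is installed, not that its version satisfies the pinned specifier.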

Inference

We suggest using the Gradio UI to run and visualize inference:

python app.py

https://github.com/user-attachments/assets/edb2b828-b65c-40f6-88da-9f5094c40b2e

For the prompt templates used for each task, see templates.txt.

Qualitative results

https://github.com/user-attachments/assets/79a33188-3ef0-4702-9892-15b864710f2d

https://github.com/user-attachments/assets/43b7bc78-1bef-4b79-bbdb-edfc4ad2b8e1

Important Notes

Please refer to our project page for more examples.

Todo

Release of the entire 3D-Alpaca dataset.
Release of training code.
Release of model weights featuring multi-turn dialogue and 3D editing capabilities.

Acknowledgement

Our code is based on these wonderful repos:

We also invite you to explore our latest work, Nano3D, a training-free 3D editing algorithm without mask constraints. Based on this algorithm, we will soon open-source a higher-quality 3D editing dataset, 3D-Alpaca-Editing-v2 (Nano3D-Edit-100k).

✍️ Citation

@article{ye2025shapellm,
  title={ShapeLLM-Omni: A Native Multimodal LLM for 3D Generation and Understanding},
  author={Ye, Junliang and Wang, Zhengyi and Zhao, Ruowen and Xie, Shenghao and Zhu, Jun},
  journal={arXiv preprint arXiv:2506.01853},
  year={2025}
}