X-Scene: Large-Scale Driving Scene Generation with High Fidelity and Flexible Controllability
March 12, 2026
Yu Yang1,2, Alan Liang2, Jianbiao Mei1, Yukai Ma1, Yong Liu1, Gim Hee Lee2
1 Zhejiang University 2 National University of Singapore
News
- [2025-09-19] X-Scene has been accepted to NeurIPS 2025!
- [2025-06-18] We released our project website.
- [2025-06-16] The paper is available on arXiv.
Abstract
Overview of X-Scene, a unified world generator that supports multi-granular controllability through high-level text-to-layout generation and low-level BEV layout conditioning. It performs joint occupancy, image, and video generation for 3D scene synthesis and reconstruction with high fidelity.
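As a rough illustration of what a low-level BEV layout condition can look like, the sketch below rasterizes a few axis-aligned boxes into a semantic grid with NumPy. It is not taken from the X-Scene code base: the function `rasterize_bev`, the class ids, the 40 m extent, and the 0.5 m resolution are all assumptions chosen for the example.

```python
import numpy as np

# Hypothetical class ids for illustration; not taken from the X-Scene code base.
CLASS_IDS = {"free": 0, "road": 1, "vehicle": 2}


def rasterize_bev(boxes, extent_m=40.0, resolution_m=0.5):
    """Rasterize axis-aligned boxes into a square BEV semantic grid.

    boxes: iterable of (class_name, x_min, y_min, x_max, y_max) in metres,
           given in an ego-centred frame. Later boxes overwrite earlier ones.
    """
    size = int(round(2 * extent_m / resolution_m))
    grid = np.zeros((size, size), dtype=np.uint8)

    def to_cell(value_m):
        # Shift so the grid origin sits at (-extent, -extent), then clip to bounds.
        return int(np.clip((value_m + extent_m) / resolution_m, 0, size))

    for name, x0, y0, x1, y1 in boxes:
        # Rows index y, columns index x.
        grid[to_cell(y0):to_cell(y1), to_cell(x0):to_cell(x1)] = CLASS_IDS[name]
    return grid


layout = rasterize_bev([
    ("road", -40.0, -4.0, 40.0, 4.0),    # an 8 m wide road along the x axis
    ("vehicle", 5.0, -1.0, 9.5, 1.0),    # one vehicle ahead of the ego position
])
print(layout.shape, np.unique(layout))
```

The result is a dense, ego-centred map with one class id per cell. Real layouts would also need rotated boxes and more classes, which this sketch leaves out.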
Getting Started
Please refer to the following documents to set up the environment and run X-Scene:
- Installation Guide
- Dataset Preparation
- Train and Evaluation
Roadmap
- Paper & Project Page
- Release the Training Code
- Release the Inference Code
- Release the Processed Data
Demo of Layout-to-Scene Generation
Acknowledgments
We are grateful for the following open-source projects that inspired or assisted the development of X-Scene:
| Occupancy Generation | Video & Driving Synthesis |
|---|---|
| SemCity | MagicDrive |
| DynamicCity | DriveArena |
| OccSora | LiDARCrafter |
| UniScene | X-Drive |
Special thanks to these communities for their incredible contributions to the field!
Citation
@article{yang2025xscene,
title={X-Scene: Large-Scale Driving Scene Generation with High Fidelity and Flexible Controllability},
author={Yang, Yu and Liang, Alan and Mei, Jianbiao and Ma, Yukai and Liu, Yong and Lee, Gim Hee},
journal={arXiv preprint arXiv:2506.13558},
year={2025}
}