OneVAE: Unified Repository for Continuous and Discrete VAE Training
September 21, 2025
This repository is also the official open-source implementation of our work, OneVAE.
Paper: OneVAE: Joint Discrete and Continuous Optimization Helps Discrete VAE Train Better
Key Contributions:
- Multiple Structural Improvements: Introduces several architecture-level enhancements for discrete VAE to boost reconstruction quality under high compression.
- Progressive Training with Pretrained Continuous VAE: Initializes from a high-quality pretrained continuous VAE and gradually transitions to discrete VAE, effectively leveraging strong priors.
- Unified Model: Achieves superior performance on both continuous and discrete representations within a single model.
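To make the progressive-training idea above concrete, here is a minimal sketch of one way a model could move gradually from continuous to discrete latents. It is an illustration under stated assumptions, not the schedule used in OneVAE: the linear ramp, the nearest-neighbour quantizer, and the names `transition_weight`, `nearest_code`, and `blended_latent` are all hypothetical.

```python
def transition_weight(step, warmup_steps, ramp_steps):
    """Hypothetical linear ramp: 0.0 means fully continuous, 1.0 fully discrete."""
    if step <= warmup_steps:
        return 0.0
    return min(1.0, (step - warmup_steps) / ramp_steps)


def nearest_code(z, codebook):
    """Return the codebook vector closest to z (squared Euclidean distance)."""
    return min(codebook, key=lambda c: sum((a - b) ** 2 for a, b in zip(z, c)))


def blended_latent(z, codebook, alpha):
    """Mix a continuous latent with its quantized version.

    alpha = 0 passes the continuous latent through unchanged;
    alpha = 1 returns the quantized (discrete) latent.
    """
    q = nearest_code(z, codebook)
    return [(1.0 - alpha) * a + alpha * b for a, b in zip(z, q)]


if __name__ == "__main__":
    codebook = [[0.0, 1.0], [1.0, 0.0]]  # toy two-entry codebook (assumption)
    z = [0.2, 0.9]                       # toy encoder output (assumption)
    for step in (0, 300, 1000):
        alpha = transition_weight(step, warmup_steps=100, ramp_steps=400)
        print(step, alpha, blended_latent(z, codebook, alpha))
```

In this sketch the decoder would see the pretrained continuous latents early in training and fully quantized latents once the ramp completes, which is one simple way to reuse a continuous prior.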
Development Status
In addition to releasing the code of this work, we aim to provide a unified repository that supports fine-tuning and training of multiple pretrained VAE models, enabling the community to better adapt VAEs to their specific needs.
We are actively organizing and refining the codebase, and ⚡ most features and resources will be released within two weeks!
Open Source Model
| Model Name | Encoding Method | Compression Ratio | Download Link |
|---|---|---|---|
| OneVAE | Discrete, Multi-Token Quant = 2 | 8 x 16 x 16 | Link |
| OneVAE | Discrete, Multi-Token Quant = 2 | 16 x 16 x 16 | Link |
| OneVAE | Discrete, Multi-Token Quant = 2 | 8 x 8 x 8 | Link |
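As a rough guide to what the compression ratios in the table imply, the sketch below computes a latent grid size and token count for a video clip. It rests on assumptions not confirmed by this README: that a ratio such as 8 x 16 x 16 means temporal x height x width downsampling factors, that sizes are rounded up by plain ceiling division (real video VAEs may treat the first frame differently), and that "Multi-Token Quant = 2" means two tokens per latent position. The helper names `latent_grid` and `token_count` are hypothetical.

```python
import math


def latent_grid(frames, height, width, ratio):
    """Latent grid size for a clip, assuming ratio = (time, height, width) factors."""
    rt, rh, rw = ratio
    return (
        math.ceil(frames / rt),
        math.ceil(height / rh),
        math.ceil(width / rw),
    )


def token_count(frames, height, width, ratio, tokens_per_position=2):
    """Total discrete tokens, assuming a fixed number of tokens per latent position."""
    t, h, w = latent_grid(frames, height, width, ratio)
    return t * h * w * tokens_per_position


if __name__ == "__main__":
    # Example clip size chosen for illustration only.
    for ratio in [(8, 16, 16), (16, 16, 16), (8, 8, 8)]:
        grid = latent_grid(16, 256, 256, ratio)
        print(ratio, grid, token_count(16, 256, 256, ratio))
```

Under these assumptions, higher compression ratios shrink the token sequence a downstream generator has to model, at the cost of a harder reconstruction problem.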
Visual Results
Video Gallery
| Video1 | Video2 |
|---|---|
More Discrete Video Results on High-Compression VAE (4×16×16)
| Video1 | Video2 | Video3 |
|---|---|---|
Planned Supported Fine-Tuning
Image VAE
- FluxVAE
- LlamaGen
- SD-VAE
Video VAE
- OneVAE (ours)
- WanVAE (Alibaba)
- HunyuanVideo VAE (Tencent)
TODO
- Release model code (to be completed within two weeks)
- Provide pretrained weights download links
- Support additional types of VAE models
LICENSE
The code is licensed under the Apache License 2.0. When using our repository to fine-tune other models, you must comply with the licenses of the respective pretrained models.