DFloat11 + OmniGen2

June 25, 2025 · View on GitHub

This is a DFloat11 losslessly compressed version of the original OmniGen2 model. It reduces model size by 32% compared to the original BFloat16 model, while maintaining bit-identical outputs and supporting efficient GPU inference.

🔥🔥🔥 Thanks to DFloat11 compression, OmniGen2 can now run smoothly on a single 16GB GPU without any quality loss. 🔥🔥🔥

We apply Huffman coding to losslessly compress the exponent bits of BFloat16 model weights, which are highly compressible (their 8 bits carry only ~2.6 bits of actual information). To enable fast inference, we implement a highly efficient CUDA kernel that performs on-the-fly weight decompression directly on the GPU.
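As a rough illustration of why the exponent field compresses so well, the sketch below (our own toy example, not the DFloat11 implementation) measures the Shannon entropy of the 8-bit exponent of synthetic Gaussian weights. BFloat16 keeps the top 16 bits of a float32, so the exponent can be read directly from the float32 bit pattern:

```python
import numpy as np

# Toy demonstration: the exponent bits of typical weight distributions
# carry far fewer than 8 bits of information.
rng = np.random.default_rng(0)
w = rng.normal(0, 0.02, size=1_000_000).astype(np.float32)  # typical init scale

bits = w.view(np.uint32)
exponents = (bits >> 23) & 0xFF  # 8-bit exponent field (same as BFloat16's)

counts = np.bincount(exponents, minlength=256)
p = counts[counts > 0] / counts.sum()
entropy = -(p * np.log2(p)).sum()

print(f"exponent entropy: {entropy:.2f} bits out of 8")
```

With weights this concentrated, the entropy comes out far below 8 bits, which is exactly the redundancy that Huffman coding exploits. Real model weights give a similar picture (the paper reports ~2.6 bits).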

The result is a model that is ~32% smaller, delivers bit-identical outputs, and achieves performance comparable to the original BFloat16 model.
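To make the "bit-identical" claim concrete, here is a minimal, self-contained Huffman round trip over a toy symbol stream (a stand-in for exponent bytes; this is illustrative only, not the DFloat11 CUDA kernel). Decoding recovers the input exactly:

```python
import heapq
from collections import Counter

# Build a Huffman code for a symbol frequency table.
def huffman_code(freqs):
    # Heap entries: (weight, unique tiebreak, {symbol: code-so-far}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freqs.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (w1 + w2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]

data = [3, 3, 3, 3, 5, 5, 7, 3]          # stand-in for exponent bytes
code = huffman_code(Counter(data))
encoded = "".join(code[s] for s in data)

# Decode with an inverted code table (Huffman codes are prefix-free).
inv = {c: s for s, c in code.items()}
decoded, buf = [], ""
for bit in encoded:
    buf += bit
    if buf in inv:
        decoded.append(inv[buf])
        buf = ""

assert decoded == data  # lossless round trip
```

Lossless compression gives a shorter bitstream whenever the symbol distribution is skewed, and decompression is an exact inverse, which is why DFloat11 outputs match BFloat16 bit for bit.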

Learn more in our research paper.

📊 Performance Comparison

| Metric | OmniGen2 (BFloat16) | OmniGen2 (DFloat11) |
|---|---|---|
| Model Size | 16.23 GB | 11.11 GB |
| Peak GPU Memory (1024×1024 image generation) | 18.41 GB | 14.36 GB |
| Generation Time (A100 GPU) | 25 seconds | 27 seconds |
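As a quick sanity check, the ~32% figure follows directly from the two model sizes in the table:

```python
# Verify the reported size reduction from the table above.
orig_gb, dfloat11_gb = 16.23, 11.11
savings = 1 - dfloat11_gb / orig_gb
print(f"size reduction: {savings:.1%}")  # about 31.5%, reported as ~32%
```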

🚀 Quick Start

Requires a CUDA-compatible GPU with at least 16GB of VRAM.

🛠️ Environment Setup

# 1. Clone the repo
git clone https://github.com/LeanModels/OmniGen2-DFloat11.git
cd OmniGen2-DFloat11

# 2. (Optional) Create a clean Python environment
conda create -n omnigen2 python=3.11
conda activate omnigen2

# 3. Install dependencies
# 3.1 Install PyTorch (choose correct CUDA version)
pip install torch==2.6.0 torchvision --extra-index-url https://download.pytorch.org/whl/cu124

# 3.2 Install other required packages
pip install -r requirements.txt

# Note: flash-attn 2.7.4.post1 is pinned for compatibility with CUDA 12.4.
# Feel free to use a newer version if you are on CUDA 12.6 or once the compatibility issue is fixed.
# OmniGen2 runs even without flash-attn, though we recommend installing it for best performance.
pip install flash-attn==2.7.4.post1 --no-build-isolation

🌐 For users in Mainland China

# Install PyTorch from a domestic mirror
pip install torch==2.6.0 torchvision --index-url https://mirror.sjtu.edu.cn/pytorch-wheels/cu124

# Install other dependencies from Tsinghua mirror
pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple

# Note: flash-attn 2.7.4.post1 is pinned for compatibility with CUDA 12.4.
# Feel free to use a newer version if you are on CUDA 12.6 or once the compatibility issue is fixed.
# OmniGen2 runs even without flash-attn, though we recommend installing it for best performance.
pip install flash-attn==2.7.4.post1 --no-build-isolation -i https://pypi.tuna.tsinghua.edu.cn/simple

🧪 Run Examples

The following examples automatically download the DFloat11 OmniGen2 model and use the GPU to generate or edit images, or to generate text.

# Visual Understanding
bash example_understanding.sh

# Text-to-image generation
bash example_t2i.sh

# Instruction-guided image editing
bash example_edit.sh

# In-context generation
bash example_in_context_generation.sh

Gradio Demo:

# For image generation only
pip install gradio
python app.py
# Optional: share the demo with a public link (requires access to Hugging Face)
python app.py --share

# For image or text generation
pip install gradio
python app_chat.py

Learn More About DFloat11

OmniGen2 Introduction

OmniGen2 is a powerful and efficient generative model. Unlike OmniGen v1, it features two distinct decoding pathways for text and image modalities, with unshared parameters and a decoupled image tokenizer. OmniGen2 delivers competitive performance across four primary capabilities:

  • Visual Understanding: Inherits the robust ability to interpret and analyze image content from its Qwen-VL-2.5 foundation.
  • Text-to-Image Generation: Creates high-fidelity and aesthetically pleasing images from textual prompts.
  • Instruction-guided Image Editing: Executes complex, instruction-based image modifications with high precision, achieving state-of-the-art performance among open-source models.
  • In-context Generation: A versatile capability to process and flexibly combine diverse inputs, including humans, reference objects, and scenes, to produce novel and coherent visual outputs.

As an open-source project, OmniGen2 provides a powerful yet resource-efficient foundation for researchers and developers exploring the frontiers of controllable and personalized generative AI.
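The "two distinct decoding pathways with unshared parameters" idea can be caricatured in a few lines of Python. This is a conceptual toy, not OmniGen2's actual architecture or API: a shared hidden state is routed to modality-specific decoders that share no weights.

```python
# Conceptual toy: one shared representation, two decoders with unshared parameters.
class TextDecoder:
    def __init__(self):
        self.params = {"bias": 1.0}          # own parameters, not shared

    def decode(self, hidden):
        return f"text<{hidden + self.params['bias']}>"

class ImageDecoder:
    def __init__(self):
        self.params = {"scale": 2.0}         # own parameters, not shared

    def decode(self, hidden):
        return f"image<{hidden * self.params['scale']}>"

class TwoPathwayModel:
    """Routes a shared hidden state to a modality-specific decoder."""
    def __init__(self):
        self.pathways = {"text": TextDecoder(), "image": ImageDecoder()}

    def generate(self, hidden, modality):
        return self.pathways[modality].decode(hidden)

model = TwoPathwayModel()
print(model.generate(3.0, "text"))   # text<4.0>
print(model.generate(3.0, "image"))  # image<6.0>
```

The point of the sketch is only the routing: the two pathways hold separate parameter sets, so updating one modality's decoder never touches the other's.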


Demonstrations

[Image] Demonstration of OmniGen2's image editing capabilities.

[Image] Demonstration of OmniGen2's in-context generation capabilities.