DreamLite: A Lightweight On-Device Unified Model for Image Generation and Editing
June 12, 2026
Overview
We introduce DreamLite, a compact and unified on-device diffusion model (0.39B) that seamlessly supports both text-to-image generation and text-guided image editing within a single network architecture.
Built upon a pruned mobile U-Net backbone, DreamLite unifies multimodal conditioning through In-Context Spatial Concatenation directly in the latent space. By leveraging progressive step distillation, DreamLite achieves 4-step inference and can generate or edit a 1024×1024 image in ~3 seconds on an iPhone 17 Pro (powered by 4-bit Qwen-VL and fp16 VAE+UNet), operating fully on-device with zero cloud dependency.
Figure 1. The overall unified architecture of DreamLite.
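To make the shape bookkeeping behind In-Context Spatial Concatenation concrete, the sketch below computes the latent size the denoiser would see for a generation request and for an editing request. It is illustrative only: the 8x VAE downsampling factor and the choice of concatenating along the width axis are assumptions not stated in this README, and `latent_hw` and `concat_hw` are hypothetical helper names.

```python
from __future__ import annotations

# Illustrative shape arithmetic only; this is not the DreamLite implementation.
# Assumptions: the VAE downsamples each side by 8x, and the reference-image
# latent is placed next to the noisy target latent along the width axis.
VAE_DOWNSAMPLE = 8  # assumed, typical for SDXL-style VAEs


def latent_hw(height: int, width: int, factor: int = VAE_DOWNSAMPLE) -> tuple[int, int]:
    """Return the (height, width) of the latent for an image of the given size."""
    if height % factor or width % factor:
        raise ValueError("image sides must be divisible by the VAE factor")
    return height // factor, width // factor


def concat_hw(target: tuple[int, int], reference: tuple[int, int] | None) -> tuple[int, int]:
    """Spatial size seen by the denoiser: target alone, or target plus reference."""
    if reference is None:  # text-to-image: there is no image condition
        return target
    if target[0] != reference[0]:
        raise ValueError("latents must share the same height to concatenate along width")
    return target[0], target[1] + reference[1]


target = latent_hw(1024, 1024)
print(concat_hw(target, None))    # text-to-image request
print(concat_hw(target, target))  # editing request with a same-size reference image
```

Under these assumptions, the same network handles both tasks because only the spatial extent of its input changes, not its architecture.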
News
- [2026.06] DreamLite has been merged into the official 🤗 Diffusers library (pending the next release). You can now load and run DreamLite directly via diffusers.
- [2026.06] Community contribution: @ENUMERA8OR released dreamlite-comfyui-lowvram, a set of ComfyUI custom nodes and a low-VRAM inference pipeline that runs DreamLite-base at 1024×1024 on 4 GB GPUs.
- [2026.04] We officially released the inference code.
- [2026.03] DreamLite is publicly announced! Check out our project page and arXiv paper.
On-Device Demo
Experience real-time generation and editing on an iPhone 17 Pro. No internet connection or cloud processing required.
| Human Portrait & Style Transfer | Nature Landscape & Background Swap | Product & Object Replacement |
|---|---|---|
Note: If demos fail to render natively on GitHub, please visit our Project Page to watch the full demonstrations.
Getting Started
1. Environment Setup
# Clone the repository
git clone https://github.com/ByteVisionLab/DreamLite.git
cd DreamLite
# Create and activate a conda environment
conda create -n dreamlite python=3.10 -y
conda activate dreamlite
# Install dependencies
pip install -r requirements.txt
Ensure the model weights (DreamLite-base and DreamLite-mobile) are placed in the following directory structure:
DreamLite/
├── models/
│   ├── DreamLite-base/
│   └── DreamLite-mobile/
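The layout above can be checked with a short script before running inference. This is a minimal sketch using only the standard library; `missing_model_dirs` is a hypothetical helper and is not part of the repository.

```python
from __future__ import annotations

from pathlib import Path

# Directory names taken from the layout described above.
EXPECTED_VARIANTS = ("DreamLite-base", "DreamLite-mobile")


def missing_model_dirs(repo_root: str | Path) -> list[str]:
    """Return the expected model directories that are absent or empty."""
    models = Path(repo_root) / "models"
    missing = []
    for name in EXPECTED_VARIANTS:
        path = models / name
        if not path.is_dir() or not any(path.iterdir()):
            missing.append(str(path))
    return missing


if __name__ == "__main__":
    problems = missing_model_dirs(".")  # run from the DreamLite/ repository root
    if problems:
        print("Missing or empty model directories:")
        for p in problems:
            print("  ", p)
    else:
        print("Model directories look complete.")
```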
2. Inference via 🤗 Diffusers
DreamLite has been merged into the official 🤗 Diffusers library. Since the change is not yet included in a stable Diffusers release, please install the latest main branch directly from source:
pip install git+https://github.com/huggingface/diffusers.git
Model weights are hosted on the diffusers branch of the following Hugging Face repos:
- carlofkl/DreamLite-base (28-step, high fidelity)
- carlofkl/DreamLite-mobile (4-step, ultra fast)
Access is currently gated, so please first request access via the Access Request Form. Once approved, you will receive a Hugging Face access token from us. Use this token to log in locally:
huggingface-cli login # paste the token we sent you when prompted
# or non-interactively:
huggingface-cli login --token <TOKEN>
from_pretrained(..., revision="diffusers") will then automatically download the weights on first use.
Alternatively, you can pre-download the weights with the CLI using the same token:
hf download carlofkl/DreamLite-base --revision diffusers --local-dir models/DreamLite-base --token <TOKEN>
hf download carlofkl/DreamLite-mobile --revision diffusers --local-dir models/DreamLite-mobile --token <TOKEN>
and then load from the local path: DreamLitePipeline.from_pretrained("models/DreamLite-base", torch_dtype=dtype).
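The two loading routes described above (Hub download on first use, or a pre-downloaded local copy) can be wrapped in one small helper. This is a sketch under the assumption that local copies live in models/DreamLite-<variant>; `resolve_model_source` is a hypothetical name, and the returned keys are the standard `from_pretrained` arguments.

```python
from __future__ import annotations

from pathlib import Path


def resolve_model_source(variant: str, models_dir: str = "models") -> dict:
    """Build from_pretrained arguments: prefer a local copy, else the gated Hub repo."""
    local = Path(models_dir) / f"DreamLite-{variant}"
    if local.is_dir():
        # A local copy was already fetched from the diffusers branch,
        # so no revision argument is needed.
        return {"pretrained_model_name_or_path": str(local)}
    return {
        "pretrained_model_name_or_path": f"carlofkl/DreamLite-{variant}",
        "revision": "diffusers",
    }


print(resolve_model_source("base"))
# Usage (requires torch and a Diffusers build that includes DreamLite):
# pipe = DreamLitePipeline.from_pretrained(**resolve_model_source("base"), torch_dtype=dtype)
```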
Then you can load and run DreamLite with just a few lines of code:
import torch
from diffusers import DreamLitePipeline
from diffusers.utils import load_image
model_id = "carlofkl/DreamLite-base"
device = "cuda"
dtype = torch.float16
pipe = DreamLitePipeline.from_pretrained(model_id, revision="diffusers", torch_dtype=dtype)
pipe.to(device=device)
# Text-to-image
image = pipe(
    prompt="A serene mountain lake at sunrise",
    generator=torch.Generator(device=device).manual_seed(42),
).images[0]
image.save("dreamlite_t2i.png")
# Image-to-image (instruction-based edit)
image_url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/astronaut.jpg"
init_image = load_image(image_url)
edited = pipe(
    prompt="make it snowy",
    image=init_image,
    generator=torch.Generator(device=device).manual_seed(42),
).images[0]
edited.save("dreamlite_i2i.png")
3. Inference via CLI
You can generate or edit images with the provided command-line scripts.
# ==========================================
# DreamLite-base: 28 Steps (High Fidelity)
# ==========================================
# Text-to-Image Generation
python infer.py --prompt "A close-up of a fire spitting dragon cinematic shot."
# Text-guided Image Editing
python infer.py --prompt "Transfer this image to oil-painting style." --image_path ./inputs/source.png
# ==========================================
# DreamLite-mobile: 4 Steps (Ultra Fast)
# ==========================================
# Text-to-Image Generation
python infer_mobile.py --prompt "A portrait of a young woman with flowers."
# Text-guided Image Editing
python infer_mobile.py --prompt "Change the background to a dense forest." --image_path ./inputs/source.png
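When scripting many runs, the commands above can be assembled programmatically. The sketch below only uses the two flags shown above (`--prompt` and `--image_path`); `build_command` is a hypothetical helper, and any other options the scripts may accept are not covered.

```python
from __future__ import annotations

import shlex
import sys

# Script names as used in the commands above.
SCRIPTS = {"base": "infer.py", "mobile": "infer_mobile.py"}


def build_command(variant: str, prompt: str, image_path: str | None = None) -> list[str]:
    """Return the argv list for a generation (no image) or editing (with image) run."""
    argv = [sys.executable, SCRIPTS[variant], "--prompt", prompt]
    if image_path is not None:
        # Supplying an image switches the run from generation to editing.
        argv += ["--image_path", image_path]
    return argv


cmd = build_command("mobile", "Change the background to a dense forest.", "./inputs/source.png")
print(shlex.join(cmd))
# To execute from the repository root: subprocess.run(cmd, check=True)
```

Passing an argv list instead of a shell string avoids quoting problems with prompts that contain spaces or punctuation.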
4. Benchmark Evaluation
We provide benchmark evaluation scripts for GenEval and ImgEdit to facilitate performance comparisons between DreamLite and other state-of-the-art models. Set your local dataset paths in tools/benchmark/infer_geneval.py and tools/benchmark/infer_imgedit.py before running them.
# Run the benchmark evaluation
python tools/benchmark/infer_geneval.py --save_dir ./output/benchmark/geneval_output --geneval_json "YOUR_GENEVAL/evaluation_metadata.jsonl"
python tools/benchmark/infer_imgedit.py --save_dir ./output/benchmark/imgedit_output --json_path "YOUR_IMGEDIT_PATH/ImgEdit/Benchmark/Basic/basic_edit.json" --img_root "YOUR_IMGEDIT_IMAGES_PATH/ImgEdit/Benchmark/singleturn"
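Before launching a long benchmark run, it can help to confirm that the metadata file parses. The sketch below assumes the GenEval metadata is a JSON Lines file in which each record has a `prompt` field and an optional `tag` field; this layout is an assumption about the dataset rather than something defined in this repository, and `summarize_jsonl` is a hypothetical helper.

```python
import json
from collections import Counter
from pathlib import Path


def summarize_jsonl(path):
    """Count prompts per tag in a JSON Lines metadata file."""
    counts = Counter()
    with Path(path).open(encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            if not line.strip():
                continue  # tolerate blank lines
            record = json.loads(line)
            if "prompt" not in record:  # assumed field name
                raise ValueError(f"line {line_no}: missing 'prompt' field")
            counts[record.get("tag", "untagged")] += 1
    return counts


# Usage, with the same placeholder path as in the command above:
# print(summarize_jsonl("YOUR_GENEVAL/evaluation_metadata.jsonl"))
```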
5. Interactive Gradio Demo
We provide a user-friendly web interface powered by Gradio. You can try our live demo on Hugging Face Spaces, or deploy it locally on your own machine (GPU/CPU).
To run the interactive demo locally:
# Launch the local web server
python tools/app.py
Checkpoints
We offer two variants of the DreamLite model that trade off visual fidelity against on-device inference latency.
Note
Model Access: Model weights are currently undergoing safety review. To request access, please fill out our Access Request Form.
⚠️ Important Usage and Compliance Notice: By accessing and using these models, you agree to abide by our ethical guidelines. These models are for non-commercial, research-only use. You must NOT use them to generate, edit, or distribute any content that is sexually explicit, pornographic, violent, discriminatory, or otherwise illegal. Commercial use and public redistribution of the model weights are strictly prohibited.
| Model Variant | UNet Params | Resolution | Steps | Guidance |
|---|---|---|---|---|
| DreamLite (Base) | 0.39B | 1024×1024 | 28 | CFG & IMG_CFG |
| DreamLite (Mobile) | 0.39B | 1024×1024 | 4 | No CFG |
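The table can be read as a rough cost model: the number of U-Net forward passes per image is the step count times the number of passes each guided step needs. The sketch below makes that arithmetic explicit. The number of passes per guided step is not stated in this README and is supplied by the caller as an assumption (3 is used here purely as an example for dual text and image guidance); `Variant` and `unet_evaluations` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    repo_id: str    # Hugging Face repo listed in the Diffusers section
    steps: int      # denoising steps from the table above
    uses_cfg: bool  # whether classifier-free guidance is applied


VARIANTS = {
    "base": Variant("carlofkl/DreamLite-base", steps=28, uses_cfg=True),
    "mobile": Variant("carlofkl/DreamLite-mobile", steps=4, uses_cfg=False),
}


def unet_evaluations(variant: Variant, passes_per_guided_step: int) -> int:
    """Estimate U-Net forward passes for one image.

    passes_per_guided_step is an assumption supplied by the caller, e.g. 3 if
    dual text/image guidance is computed from three conditioning branches.
    """
    passes = passes_per_guided_step if variant.uses_cfg else 1
    return variant.steps * passes


for name, variant in VARIANTS.items():
    print(name, unet_evaluations(variant, passes_per_guided_step=3))
```

Because the mobile variant drops guidance as well as steps, its advantage in forward passes is larger than the 28-to-4 step ratio alone suggests, whatever the exact pass count turns out to be.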
Main Results
Quantitative comparison with state-of-the-art methods on generation and editing benchmarks.
Text-to-Image generation comparison.
Text-guided image editing comparison.
| Method | Params | GenEval ↑ | DPG ↑ | ImgEdit ↑ | GEdit-EN-Q ↑ |
|---|---|---|---|---|---|
| FLUX.1-Dev / Kontext | 12B | 0.67 | 84.0 | 3.76 | 6.79 |
| BAGEL | 7B | 0.82 | 85.1 | 3.42 | 7.20 |
| OmniGen2 | 4B | 0.80 | 83.6 | 3.44 | 6.79 |
| LongCat-Image / Edit | 6B | 0.87 | 86.6 | 4.49 | 7.55 |
| DeepGen1.0 | 2B | 0.83 | 84.6 | 4.03 | 7.54 |
| SANA-1.6B | 1.6B | 0.67 | 84.8 | - | - |
| SANA-0.6B | 0.6B | 0.64 | 83.6 | - | - |
| SnapGen++ (small) | 0.4B | 0.66 | 85.2 | - | - |
| VIBE | 1.6B | - | - | 3.85 | 7.28 |
| EditMGT | 0.96B | - | - | 2.89 | 6.33 |
| DreamLite (Ours) | 0.39B | 0.72 | 85.8 | 4.11 | 6.88 |
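To compare methods at a given parameter budget, the table can be queried directly. The sketch below copies the rows verbatim (with `None` for missing entries) and reports the top-scoring method per metric under a size cap; `best_under` is a hypothetical helper written for this illustration.

```python
# Rows copied from the table above:
# (method, params in billions, GenEval, DPG, ImgEdit, GEdit-EN-Q)
ROWS = [
    ("FLUX.1-Dev / Kontext", 12.0, 0.67, 84.0, 3.76, 6.79),
    ("BAGEL", 7.0, 0.82, 85.1, 3.42, 7.20),
    ("OmniGen2", 4.0, 0.80, 83.6, 3.44, 6.79),
    ("LongCat-Image / Edit", 6.0, 0.87, 86.6, 4.49, 7.55),
    ("DeepGen1.0", 2.0, 0.83, 84.6, 4.03, 7.54),
    ("SANA-1.6B", 1.6, 0.67, 84.8, None, None),
    ("SANA-0.6B", 0.6, 0.64, 83.6, None, None),
    ("SnapGen++ (small)", 0.4, 0.66, 85.2, None, None),
    ("VIBE", 1.6, None, None, 3.85, 7.28),
    ("EditMGT", 0.96, None, None, 2.89, 6.33),
    ("DreamLite (Ours)", 0.39, 0.72, 85.8, 4.11, 6.88),
]
METRICS = ("GenEval", "DPG", "ImgEdit", "GEdit-EN-Q")


def best_under(max_params_b):
    """For each metric, return the top-scoring method with at most max_params_b billion parameters."""
    best = {}
    for i, metric in enumerate(METRICS, start=2):
        scored = [
            (row[i], row[0])
            for row in ROWS
            if row[1] <= max_params_b and row[i] is not None
        ]
        if scored:
            best[metric] = max(scored)[1]
    return best


print(best_under(1.0))    # methods with at most 1B parameters
print(best_under(100.0))  # no effective cap
```

With a 1B-parameter cap, DreamLite is the top entry for all four metrics among the listed methods; without a cap, LongCat-Image / Edit leads every column.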
LoRA Fine-tuning
We provide comprehensive support for LoRA fine-tuning and inference, enabling lightweight customization of DreamLite on your own domain-specific datasets.
For detailed instructions, training scripts, and examples, please refer to our dedicated LoRA Fine-Tuning Guide.
On-device Deployment
We provide a complete iOS On-device Deployment Reference, including model export scripts (CoreML + mlx-vlm 4-bit quantization), modified Swift library files, and iOS app source code.
Community Projects
We sincerely thank the community for extending DreamLite to broader use cases and hardware. If you have built something on top of DreamLite, feel free to open a PR/Issue and we'd be happy to feature it here.
- dreamlite-comfyui-lowvram by @ENUMERA8OR: ComfyUI custom nodes and a low-VRAM inference pipeline that runs DreamLite-base at 1024×1024 on a 4 GB NVIDIA GPU (e.g., GTX 1650 Ti). It enables this through sequential CPU offload, float32 precision, and a GQA-aware query-token attention slicing strategy that preserves DreamLite's grouped-query attention layout (which standard Diffusers attention slicing would break). The repository also bundles ComfyUI workflows for both DreamLite-base and DreamLite-mobile, plus tiled RealESRGAN upscaling support.
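The idea behind slicing attention over query tokens can be illustrated in a few lines of dependency-free Python. This is a toy sketch, not the community repository's code: it shows that processing query tokens in chunks leaves the query-head to KV-head grouping untouched and produces the same result as unsliced attention, because softmax is computed independently per query row. All function names and the toy data are hypothetical.

```python
import math


def attend(q_rows, keys, values):
    """Softmax attention for a block of query rows against all keys and values."""
    scale = 1.0 / math.sqrt(len(keys[0]))
    out = []
    for q in q_rows:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys]
        peak = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        out.append([
            sum(w * v[d] for w, v in zip(weights, values)) / total
            for d in range(len(values[0]))
        ])
    return out


def gqa_attention(q_heads, k_heads, v_heads, chunk=None):
    """Grouped-query attention; optionally process query tokens in chunks."""
    group = len(q_heads) // len(k_heads)  # query heads per shared KV head
    result = []
    for h, q in enumerate(q_heads):
        kv = h // group  # head-to-KV mapping is unaffected by token slicing
        step = chunk or len(q)
        rows = []
        for start in range(0, len(q), step):
            # Only `step` query rows are scored against the keys at a time.
            rows.extend(attend(q[start:start + step], k_heads[kv], v_heads[kv]))
        result.append(rows)
    return result


def toy(n_heads, n_tokens, dim, seed):
    """Deterministic toy tensor as nested lists: [head][token][channel]."""
    return [[[math.sin(seed + h + 2 * t + 3 * d) for d in range(dim)]
             for t in range(n_tokens)] for h in range(n_heads)]


q = toy(4, 5, 3, 0.1)  # 4 query heads
k = toy(2, 5, 3, 0.2)  # 2 shared KV heads
v = toy(2, 5, 3, 0.3)
print(gqa_attention(q, k, v) == gqa_attention(q, k, v, chunk=2))
```

In a tensor implementation, the benefit is that the attention score matrix held in memory at any time covers only one chunk of query tokens rather than all of them.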
Open-Source Plan
- Release paper on arXiv
- Release inference code
- Release LoRA training
- Release model weights on HuggingFace
- Release online demo
- On-device Deployment Reference
Acknowledgement
We thank the authors of SDXL, SnapGen, Qwen, and TAESDXL for their great work. This work was supervised by Prof. Wangmeng Zuo.
License
Code: Apache-2.0
Model weights: CC BY-NC 4.0 (see WEIGHTS_LICENSE)
Citation
If our work assists your research, feel free to give us a star ⭐ or cite us using:
@article{feng2026dreamlite,
title={DreamLite: A Lightweight On-Device Unified Model for Image Generation and Editing},
author={Kailai Feng and Yuxiang Wei and Bo Chen and Yang Pan and Hu Ye and Songwei Liu and Chenqian Yan and Yuan Gao},
journal={arXiv preprint arXiv:2603.28713},
year={2026}
}