
January 23, 2026

Skywork-UniPic

Unified multimodal model for image editing, generation, and understanding

Overview

Welcome to the Skywork-UniPic repository!
This repository hosts the model weights and official implementations of the UniPic unified multimodal series, which covers three distinct modeling paradigms:

  • UniPic-3 (README): 🔥 Open-source SOTA multi-image editing model. A unified framework for single-image editing and multi-image composition. Supports 1–6 input images at flexible resolutions. 8-step inference with a 12.5× speedup via CM + DMD distillation.

    [Figure: UniPic-3 teaser]
  • UniPic-2 (README): SD3.5M-Kontext and MetaQuery variants based on Efficient Architectures with Diffusion Post-Training, delivering state-of-the-art performance in text-to-image generation, fine-grained image editing, and multimodal reasoning.

    [Figure: UniPic-2 teaser]
  • UniPic-1 (README): 1.5B parameters. Unified autoregressive modeling for joint visual understanding and generation, enabling a single transformer to handle both perception and synthesis tasks.
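
To put the UniPic-3 figures in perspective, here is a rough back-of-the-envelope sketch. It assumes the 12.5× speedup comes entirely from running fewer network evaluations, which this README does not state, and the function name `implied_baseline_evals` is hypothetical rather than part of the repository.

```python
def implied_baseline_evals(distilled_steps: int, speedup: float) -> float:
    """Return the number of network evaluations an undistilled baseline would
    need, ASSUMING the reported speedup is due only to fewer evaluations."""
    if distilled_steps <= 0 or speedup <= 0:
        raise ValueError("distilled_steps and speedup must be positive")
    return distilled_steps * speedup


# Figures quoted for UniPic-3: 8-step inference, 12.5x speedup.
baseline = implied_baseline_evals(distilled_steps=8, speedup=12.5)
print(f"Implied baseline cost: {baseline:.0f} network evaluations")
```

Under that assumption, the distilled 8-step sampler would replace roughly 100 evaluations of the original model. Treat this as an illustration of the arithmetic only, since the actual baseline configuration is described in the UniPic-3 README and paper.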


🔥 Latest News

| Date | Update | Links |
| --- | --- | --- |
| 2026-01-09 | Released UniPic-3: 🔥 open-source SOTA multi-image editing model. Supports single- and multi-image editing, 1–6 inputs, and 8-step inference (12.5× faster) | GitHub, HuggingFace, arXiv |
| 2025-08-13 | Released UniPic-2: unified model weights with diffusion-based post-training | GitHub, HuggingFace, arXiv |
| 2025-07-30 | Released UniPic-1: autoregressive unified modeling from scratch | GitHub, HuggingFace, arXiv |

✨ Key Features

  • 🎨 Text-to-Image Generation: High-fidelity synthesis from natural language prompts.
  • 🛠 Image Editing: Seamless inpainting, outpainting, and object manipulation.
  • 🖼 Image Understanding: Robust perception capabilities for various visual tasks.
  • ⚡ Efficient Architecture: Optimized for both accuracy and deployability.

📜 License

This project is licensed under the MIT License; see the LICENSE file for details.