March 11, 2026
OMGSR: You Only Need One Mid-timestep Guidance for Real-World Image Super-Resolution
Zhiqiang Wu1,2* |
Zhaomang Sun2 |
Tong Zhou2 |
Bingtao Fu2 |
Ji Cong2 |
Yitong Dong2 |
Huaqi Zhang2 |
Xuan Tang1 |
Mingsong Chen1 |
Xian Wei1†
1Software Engineering Institute, East China Normal University | 2vivo Mobile Communication Co. Ltd, Hangzhou, China | *Work done during internship at vivo | †Corresponding author
:boom: Highlights
Unlike the paper, this repo has been further optimized by:
- Replacing the LPIPS loss (natively supports 224 resolution) with the proposed DINOv3-ConvNeXt DISTS loss (natively supports 1k or higher resolution) for structural perception.
- Developing a DINOv3-ConvNeXt multi-level discriminator head (natively supports 1k or higher resolution) for GAN training.
:boom: News
If you find OMGSR helpful, please consider giving it a :star:.
- 2026.3.11: :hugs: We will add support for Z-Image (6B) and Longcat-Image (6B).
- 2025.10.14: :hugs: The latest version is released.
- 2025.8.16: The training code is released.
- 2025.8.15: The inference code and weights are released.
- 2025.8.12: The arXiv paper is released.
- 2025.8.6: This repo is released.
:eyes: Visualization
Please click the images for detailed visualization.
OMGSR-F-1024 Results (Recommended)
1. RealLQ250x4 (256->1k Resolution) Complete Results
2. RealSRx8 (128->1k Resolution) Complete Results
3. DrealSRx8 (128->1k Resolution) Complete Results
OMGSR-S-512 Results
1. RealLQ250x4 (256->1k Resolution) Complete Results
2. RealLQ200x4 (256->1k Resolution) Complete Results
3. RealSRx4 (128->512 Resolution) Complete Results
4. DrealSRx4 (128->512 Resolution) Complete Results
Average Optimal Mid-timestep via Signal-to-Noise Ratio (SNR)
1. Pre-trained Noisy Latent Representation
2. SNR of Pre-trained Noisy Latent Representation
3. SNR of Low-Quality (LQ) Image Latent Representation
4. Compute Average Optimal Mid-timestep
5. Mid-timestep Script
You can run the script:
# OMGSR-S-512
python mid_timestep/mid_timestep_sd.py --dataset_txt_or_dir_paths /path1/to/images /path2/to/images
# OMGSR-F-1024
python mid_timestep/mid_timestep_flux.py --dataset_txt_or_dir_paths /path1/to/images /path2/to/images
- In this repo, we use mid-timestep 273 for OMGSR-S-512 and 244 for OMGSR-F-1024.
- In practice, a mid-timestep close to the recommended value also works; it does not need to be exact.
- Note that the mid-timestep must be the same during training and inference.
- The mid-timestep depends on the degradation configuration of the dataset.
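As a rough illustration of how a mid-timestep can be matched through SNR, the sketch below assumes a DDPM-style scaled-linear beta schedule, for which SNR(t) = alpha_bar_t / (1 - alpha_bar_t). The schedule values (`beta_start`, `beta_end`, `num_steps`) and all function names are assumptions for illustration and are not read from this repo. The per-image LQ latent SNR values are taken as given inputs. The scripts in `mid_timestep/` remain the reference for the actual computation.

```python
import math


def ddpm_snr_schedule(num_steps=1000, beta_start=0.00085, beta_end=0.012):
    """Return SNR(t) = alpha_bar_t / (1 - alpha_bar_t) for every timestep.

    Assumes a scaled-linear beta schedule (linear in sqrt(beta)); the default
    values are illustrative assumptions, not taken from this repo.
    """
    snrs = []
    alpha_bar = 1.0
    lo, hi = math.sqrt(beta_start), math.sqrt(beta_end)
    for i in range(num_steps):
        beta = (lo + (hi - lo) * i / (num_steps - 1)) ** 2
        alpha_bar *= 1.0 - beta  # cumulative product of (1 - beta)
        snrs.append(alpha_bar / (1.0 - alpha_bar))
    return snrs


def closest_timestep(snrs, target_snr):
    """Timestep whose log-SNR is nearest to the target log-SNR."""
    target = math.log(target_snr)
    return min(range(len(snrs)), key=lambda t: abs(math.log(snrs[t]) - target))


def average_mid_timestep(snrs, lq_snrs):
    """Average the per-image matched timesteps over a dataset."""
    per_image = [closest_timestep(snrs, s) for s in lq_snrs]
    return round(sum(per_image) / len(per_image))
```

Matching in log-SNR rather than raw SNR is a design choice of this sketch: SNR spans several orders of magnitude across the schedule, so a log scale keeps the comparison meaningful at both ends.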
:wrench: Environment
# git clone this repository
git clone https://github.com/wuer5/OMGSR.git
cd OMGSR
# create an environment
conda create -n OMGSR python=3.10
conda activate OMGSR
pip install --upgrade pip
pip install -r requirements.txt
:rocket: Quick Inference
1. Download the pre-trained models from HuggingFace
- Download SD2.1-base for OMGSR-S-512.
- Download FLUX.1-dev for OMGSR-F-1024.
2. Download the OMGSR LoRA adapter weights
- Download the OMGSR-S-512 LoRA adapter weights (rename them to omgsr-s-512-adapter) into the folder adapters (create the folder if it does not exist).
- Download the OMGSR-F-1024 LoRA adapter weights (rename them to omgsr-f-1024-adapter) into the folder adapters (create the folder if it does not exist).
3. Prepare your testing data
Put your test images (.png, .jpg, or .jpeg) in the folder tests.
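Before launching inference, it can help to confirm that the folders described in steps 2 and 3 are in place. The following is a minimal sketch; `check_inference_layout` is a hypothetical helper, not part of this repo, and it only checks the `adapters` and `tests` folder names mentioned above.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def check_inference_layout(root=".", adapter_name="omgsr-s-512-adapter"):
    """Return a list of problems found in the expected folder layout.

    Hypothetical helper: it checks that adapters/<adapter_name> exists and
    that the tests folder contains at least one supported image.
    """
    root = Path(root)
    problems = []

    adapter = root / "adapters" / adapter_name
    if not adapter.exists():
        problems.append(f"missing adapter: {adapter}")

    tests_dir = root / "tests"
    if not tests_dir.is_dir():
        problems.append(f"missing folder: {tests_dir}")
    elif not any(p.suffix.lower() in IMAGE_SUFFIXES for p in tests_dir.iterdir()):
        problems.append(f"no .png/.jpg/.jpeg images in: {tests_dir}")

    return problems


if __name__ == "__main__":
    for name in ("omgsr-s-512-adapter", "omgsr-f-1024-adapter"):
        issues = check_inference_layout(adapter_name=name)
        print(name, "OK" if not issues else issues)
```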
4. Start inference
For OMGSR-S-512:
bash infer_omgsr_s.sh
For OMGSR-F-1024:
bash infer_omgsr_f.sh
:hugs: Training
1. Prepare your training datasets
Download the training datasets LSDIR and FFHQ (first 10k images), following the settings in our paper, or use your own custom datasets.
Then edit dataset_txt_or_dir_paths in configs/xxx.yml, for example:
dataset_txt_or_dir_paths: [path1, path2, ...]
Note that each of path1, path2, ... can be either a .txt file (listing the paths of the training images) or a folder (containing the training images). Supported image formats are png, jpg, and jpeg.
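To make the two accepted entry types concrete, here is a minimal sketch of how such a list could be resolved into image paths. `collect_image_paths` is a hypothetical function, not the repo's loader. It assumes one image path per line in a .txt file and a non-recursive scan of folders; the actual training code may behave differently.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def collect_image_paths(dataset_txt_or_dir_paths):
    """Expand a mix of .txt listings and folders into a flat list of images.

    Assumptions of this sketch: .txt files hold one image path per line,
    folders are scanned non-recursively, and listed paths are not checked
    for existence.
    """
    images = []
    for entry in dataset_txt_or_dir_paths:
        path = Path(entry)
        if path.is_dir():
            candidates = sorted(p for p in path.iterdir() if p.is_file())
        elif path.suffix.lower() == ".txt":
            lines = path.read_text().splitlines()
            candidates = [Path(line.strip()) for line in lines if line.strip()]
        else:
            raise ValueError(f"expected a folder or a .txt file, got: {entry}")
        # Keep only the image formats named in this README.
        images.extend(p for p in candidates if p.suffix.lower() in IMAGE_SUFFIXES)
    return images
```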
2. Download the DINOv3-ConvNeXt
Download DINOv3-ConvNeXt-Large into the folder dinov3_gan/dinov3_weights (create the folder if it does not exist).
3. Start training
Start to train OMGSR-S-512:
bash train_omgsr_s_512.sh
Start to train OMGSR-F-1024:
bash train_omgsr_f_1024.sh
:book: Citation
If OMGSR is helpful to you, please consider citing this paper.
@misc{wu2025omgsrneedmidtimestepguidance,
title={OMGSR: You Only Need One Mid-timestep Guidance for Real-World Image Super-Resolution},
author={Zhiqiang Wu and Zhaomang Sun and Tong Zhou and Bingtao Fu and Ji Cong and Yitong Dong and Huaqi Zhang and Xuan Tang and Mingsong Chen and Xian Wei},
year={2025},
eprint={2508.08227},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2508.08227},
}
:thumbsup: Acknowledgement
The dinov3_gan folder in this project is modified from Vision-aided GAN and DINOv3. Thanks to the authors for these awesome works.
:email: Contact
If you have any questions, please contact 51265902095@stu.ecnu.edu.cn.














