June 26, 2026
Reference-based Super-Resolution via Image-based Retrieval-Augmented Generation Diffusion (2025 ICCV)
Byeonghun Lee1* | Hyunmin Cho1* | Hong Gyu Choi2 | Soo Min Kang2 | Iljun Ahn2 | Kyong Hwan Jin1†
1Korea University, 2Independent Researcher
Overview
iRAG performs reference-based super-resolution in two stages:
- Reference retrieval (match/): for a low-resolution query, retrieve semantically similar high-resolution references from a database with a compact contrastive binary hash (VGG16 baseline or a CLIP encoder). The reference database is expanded with an SDEdit-based hallucination augmentation.
- Diffusion super-resolution (sr/): a latent-diffusion SR model trained end-to-end with a TTSR restoration branch that conditions on the retrieved reference to reconstruct the high-resolution image.
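To make the retrieval stage concrete, here is a minimal sketch of nearest-neighbour lookup over binary hash codes using Hamming distance. It is not the code in match/: the sign-based binarization, the function names `binarize` and `hamming_topk`, and the toy 8-bit codes are all assumptions made for illustration.

```python
import numpy as np


def binarize(features):
    """Map real-valued encoder outputs to {0, 1} hash bits by sign (assumed scheme)."""
    return (np.asarray(features) > 0).astype(np.uint8)


def hamming_topk(query_code, db_codes, k=3):
    """Return indices and distances of the k database codes nearest to the query."""
    # Hamming distance = number of bit positions where the codes differ.
    dists = np.count_nonzero(db_codes != query_code, axis=1)
    # A stable sort keeps database order among equally distant entries.
    order = np.argsort(dists, kind="stable")[:k]
    return order, dists[order]


# Toy database of four 8-bit codes (hypothetical values).
db_codes = np.array(
    [
        [1, 0, 1, 1, 0, 0, 1, 0],
        [0, 0, 0, 0, 0, 0, 0, 0],
        [1, 0, 1, 1, 0, 1, 1, 0],
        [0, 1, 0, 0, 1, 1, 0, 1],
    ],
    dtype=np.uint8,
)
query_code = binarize([0.7, -0.2, 1.3, 0.4, -0.9, -0.1, 0.5, -2.0])
indices, distances = hamming_topk(query_code, db_codes, k=2)
```

Short binary codes keep the database compact and make the distance computation a simple bit comparison, which is why hashing is a common choice for retrieval over large reference sets.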
iRAG/
├── match/                 # stage 1: reference retrieval + data augmentation
│   ├── main.py            # train / evaluate the hash encoder
│   ├── model/ loss/ utils/
│   └── SR_Utils/          # SDEdit augmentation + pair construction
└── sr/                    # stage 2: reference-based diffusion super-resolution
    ├── train.py           # training
    ├── inference.py       # reference-based SR sampling
    ├── eval.py            # metrics (PSNR/SSIM/LPIPS/CLIP-IQA/MUSIQ + color fix)
    └── configs/irag.yaml ldm/ basicsr/ scripts/
Environment
- python 3.8
- CUDA 11.8
conda env create -f environment.yml
pip install -r sr/requirements.txt
Dataset
Download the DIV2K, Flickr2K, CUFED5, and OST datasets.
Stage 1: Reference retrieval (match/)
Data augmentation (SDEdit hallucination)
Expand the reference database by hallucinating realistic variants of each image.
python match/SR_Utils/SDEdit_Hallucination/generate_data.py \
--img_path <INPUT_IMG> \
--prompt "<PROMPT_TEXT>" \
--strength_base <0-1> --strength_span <0-1> \
--guidance_base <VAL> --guidance_span <VAL> \
--iterations <N> --num_samples <M> \
--model_name <HF_MODEL_ID> \
--device <cuda|cpu> --gpu <GPU_ID> \
--out_folder <LOW_DIR> --out_folder_high <HIGH_DIR>
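For intuition about what the strength setting controls, the sketch below shows the perturbation half of SDEdit: the input image is noised up to an intermediate timestep chosen by a strength value, and a pretrained diffusion model (not shown) then denoises from that point. This is not the code in generate_data.py. The linear beta schedule, the helper names, and the use of a single `strength` value are assumptions; how the script combines --strength_base and --strength_span is not documented here.

```python
import numpy as np


def linear_alphas_cumprod(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for a linear beta schedule (assumed)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)


def sdedit_perturb(x0, strength, alphas_cumprod, rng):
    """Noise a clean image x0 to the timestep selected by strength in (0, 1]."""
    if not 0.0 < strength <= 1.0:
        raise ValueError("strength must lie in (0, 1]")
    num_steps = len(alphas_cumprod)
    # Larger strength means a later timestep, so more of the image is replaced by noise.
    t0 = min(max(int(strength * num_steps), 1), num_steps) - 1
    a_bar = alphas_cumprod[t0]
    noise = rng.standard_normal(x0.shape)
    # Closed-form forward diffusion: x_t = sqrt(a_bar) * x0 + sqrt(1 - a_bar) * noise.
    x_t = np.sqrt(a_bar) * x0 + np.sqrt(1.0 - a_bar) * noise
    return x_t, t0


rng = np.random.default_rng(0)
schedule = linear_alphas_cumprod()
image = np.zeros((4, 4, 3))  # stand-in for a normalized input image
noisy, start_step = sdedit_perturb(image, strength=0.5, alphas_cumprod=schedule, rng=rng)
```

A low strength keeps the variant close to the source image, while a high strength gives the denoiser more freedom to hallucinate new detail.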
Train / retrieve
cd match
python main.py \
--query_path <QUERY_DIR> \
--database_path <DB_DIR> \
--encode_length <BITS> \
--batch_size <N> \
--epochs <E> \
--lr <LR> \
--num_runs <RUNS> \
--validate_frequency <VAL_FREQ> \
--num_workers <WORKERS> \
--seed <SEED> \
--device <GPU_ID> \
[--train] \
[--use_clip] \
[--num_bad_epochs <M>] \
[--ckpt_path <CKPT_FILE>]
Stage 2: Diffusion super-resolution (sr/)
All commands below are run from sr/. Each data root holds paired gt/, sr_bicubic/, lr/, and ref/
subfolders whose files share the same basenames.
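Because training and evaluation expect the four subfolders to line up by basename, a quick check before launching a run can save time. The helper below is a hypothetical sketch, not part of the repository; `find_unpaired` and the example path are illustrative names.

```python
from pathlib import Path

SUBFOLDERS = ("gt", "sr_bicubic", "lr", "ref")


def find_unpaired(data_root):
    """Return {subfolder: [basenames missing from it]} for a data root."""
    root = Path(data_root)
    stems = {}
    for name in SUBFOLDERS:
        folder = root / name
        # A missing subfolder is treated as empty so every basename is reported.
        stems[name] = (
            {p.stem for p in folder.iterdir() if p.is_file()}
            if folder.is_dir()
            else set()
        )
    all_stems = set().union(*stems.values())
    return {
        name: sorted(all_stems - found)
        for name, found in stems.items()
        if all_stems - found
    }


if __name__ == "__main__":
    # Hypothetical path; replace it with your own data root.
    print(find_unpaired("PATH/TO/valset"))
```

An empty dictionary means every basename appears in all four subfolders.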
Training
cd sr
python train.py --train \
--base configs/irag.yaml \
--gpus 0, --name irag --scale_lr False
Evaluation
cd sr
python eval.py \
--config configs/irag.yaml \
--ckpt path/to/iRAG.ckpt \
--val-dir PATH/TO/valset \
--ddim-steps 50 --colorfix all
A wavelet or AdaIN color fix (using the bicubic LQ as the color reference) can be
applied to the SR output before scoring; --colorfix all reports
none/adain/wavelet together in a single diffusion pass.
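As an illustration of the AdaIN variant, the sketch below matches the per-channel mean and standard deviation of the SR output to those of the color reference. It is a simplified NumPy version written under assumptions (H x W x C float images in [0, 1], the function name `adain_color_fix`) and is not the implementation used by eval.py.

```python
import numpy as np


def adain_color_fix(sr, ref, eps=1e-5):
    """Transfer per-channel color statistics from ref onto sr.

    Both inputs are float arrays shaped (H, W, C) in [0, 1]; they may differ
    in spatial size because only channel statistics are used.
    """
    sr_mean = sr.mean(axis=(0, 1), keepdims=True)
    sr_std = sr.std(axis=(0, 1), keepdims=True) + eps
    ref_mean = ref.mean(axis=(0, 1), keepdims=True)
    ref_std = ref.std(axis=(0, 1), keepdims=True) + eps
    # Normalize the SR output, then rescale it to the reference statistics.
    out = (sr - sr_mean) / sr_std * ref_std + ref_mean
    return np.clip(out, 0.0, 1.0)


rng = np.random.default_rng(0)
sr_image = rng.uniform(0.2, 0.8, size=(16, 16, 3))           # stand-in SR output
lq_image = 0.5 + 0.1 * (rng.uniform(size=(8, 8, 3)) - 0.5)   # stand-in bicubic LQ
fixed = adain_color_fix(sr_image, lq_image)
```

Matching only first- and second-order statistics corrects global color shifts while leaving the spatial detail produced by the diffusion model untouched.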
Download the pretrained models
Download the pretrained models from pretrained.
Citations
@InProceedings{lee2025irag,
author = {Lee, Byeonghun and Cho, Hyunmin and Choi, Hong Gyu and Kang, Soo Min and Ahn, Iljun and Jin, Kyong Hwan},
title = {Reference-based Super-Resolution via Image-based Retrieval-Augmented Generation Diffusion},
booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
month = {October},
year = {2025},
pages = {10764-10774}
}
License
This project and related weights are released under the Apache 2.0 license.