
June 26, 2026

Reference-based Super-Resolution via Image-based Retrieval-Augmented Generation Diffusion (2025 ICCV)


Byeonghun Lee1* | Hyunmin Cho1* | Hong Gyu Choi2 | Soo Min Kang2 | Iljun Ahn2 | Kyong Hwan Jin1†

1Korea University, 2Independent Researcher

Overview

iRAG performs reference-based super-resolution in two stages:

  1. Reference retrieval (match/): for a low-resolution query, retrieve semantically similar high-resolution references from a database with a compact contrastive binary hash (VGG16 baseline or a CLIP encoder). The reference database is expanded with an SDEdit-based hallucination augmentation.
  2. Diffusion super-resolution (sr/): a latent-diffusion SR model trained end-to-end with a TTSR restoration branch that conditions on the retrieved reference to reconstruct the high-resolution image.

iRAG/
├── match/                 # stage 1: reference retrieval + data augmentation
│   ├── main.py            #   train / evaluate the hash encoder
│   ├── model/ loss/ utils/
│   └── SR_Utils/          #   SDEdit augmentation + pair construction
└── sr/                    # stage 2: reference-based diffusion super-resolution
    ├── train.py           #   training
    ├── inference.py       #   reference-based SR sampling
    ├── eval.py            #   metrics (PSNR/SSIM/LPIPS/CLIP-IQA/MUSIQ + color fix)
    └── configs/irag.yaml  ldm/  basicsr/  scripts/
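
To make the stage 1 retrieval idea concrete, here is a minimal sketch of nearest-neighbour lookup over binary hash codes using Hamming distance. It is an illustration only: the helper names (binarize, hamming, retrieve), the sign-based binarization, and the toy 4-bit embeddings are assumptions and are not taken from match/.

```python
def binarize(embedding):
    """Pack a real-valued embedding into an int bit code (bit = 1 where value > 0).

    Assumption: sign thresholding, a common choice for hashing; the repo's
    encoder may binarize differently.
    """
    code = 0
    for value in embedding:
        code = (code << 1) | (1 if value > 0 else 0)
    return code


def hamming(a, b):
    """Number of differing bits between two integer codes."""
    return bin(a ^ b).count("1")


def retrieve(query_code, database, top_k=1):
    """Return the top_k database keys closest to query_code in Hamming distance.

    `database` maps a reference name to its integer code; ties are broken by name.
    """
    ranked = sorted(database, key=lambda name: (hamming(query_code, database[name]), name))
    return ranked[:top_k]


# Toy example with hypothetical 4-dimensional encoder outputs.
database = {
    "ref_a.png": binarize([0.9, -0.2, 0.4, -0.7]),
    "ref_b.png": binarize([-0.5, 0.3, -0.1, 0.8]),
}
query = binarize([0.7, -0.1, 0.2, 0.6])
best = retrieve(query, database)
```

Compact codes keep the database small and make comparison a cheap XOR plus popcount, which is the usual motivation for hashing-based retrieval.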

Environment

  • Python 3.8
  • CUDA 11.8
conda env create -f environment.yml
pip install -r sr/requirements.txt

Dataset

Download the DIV2K, Flickr2K, CUFED5, and OST datasets.


Stage 1: Reference retrieval (match/)

Data augmentation (SDEdit hallucination)

Expand the reference database by hallucinating realistic variants of each image.

python match/SR_Utils/SDEdit_Hallucination/generate_data.py \
  --img_path <INPUT_IMG> \
  --prompt "<PROMPT_TEXT>" \
  --strength_base <0-1> --strength_span <0-1> \
  --guidance_base <VAL> --guidance_span <VAL> \
  --iterations <N> --num_samples <M> \
  --model_name <HF_MODEL_ID> \
  --device <cuda|cpu> --gpu <GPU_ID> \
  --out_folder <LOW_DIR> --out_folder_high <HIGH_DIR>
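
As background for choosing strength values: in SDEdit-style image-to-image sampling, the strength controls how much noise is added to the input and therefore how many denoising steps are run. The sketch below uses the rule found in common image-to-image pipelines, steps = min(int(N * strength), N). The function name sdedit_steps is hypothetical, and how generate_data.py combines --strength_base and --strength_span into a single strength is not described here, so check the script before relying on this.

```python
def sdedit_steps(num_inference_steps, strength):
    """Denoising steps actually run for a strength in [0, 1].

    strength = 0 leaves the input untouched; strength = 1 regenerates it
    from (almost) pure noise using all scheduled steps.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be in [0, 1]")
    return min(int(num_inference_steps * strength), num_inference_steps)


# With 50 scheduled steps, half strength runs half of the schedule.
half = sdedit_steps(50, 0.5)
```

Lower strengths give variants that stay close to the source image, while higher strengths hallucinate more freely.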

Train / retrieve

cd match
python main.py \
  --query_path <QUERY_DIR> \
  --database_path <DB_DIR> \
  --encode_length <BITS> \
  --batch_size <N> \
  --epochs <E> \
  --lr <LR> \
  --num_runs <RUNS> \
  --validate_frequency <VAL_FREQ> \
  --num_workers <WORKERS> \
  --seed <SEED> \
  --device <GPU_ID> \
  [--train] \
  [--use_clip] \
  [--num_bad_epochs <M>] \
  [--ckpt_path <CKPT_FILE>]

Stage 2: Diffusion super-resolution (sr/)

All commands below are run from sr/. Each data root holds paired gt/, sr_bicubic/, lr/ and ref/ subfolders sharing the same basenames.
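
Before training or evaluating, it can help to confirm that a data root follows this layout. The following is a small sketch, not part of the repository: check_data_root is a hypothetical helper, and it compares file stems (names without extension), which assumes that files in different subfolders may use different extensions.

```python
from pathlib import Path

# Subfolder names taken from the layout described above.
SUBFOLDERS = ("gt", "sr_bicubic", "lr", "ref")


def check_data_root(root):
    """Return {subfolder: sorted stems present in gt/ but missing there}.

    An empty dict means every gt/ image has a counterpart in each subfolder.
    Raises FileNotFoundError if a required subfolder does not exist.
    """
    root = Path(root)
    stems = {}
    for sub in SUBFOLDERS:
        folder = root / sub
        if not folder.is_dir():
            raise FileNotFoundError(f"missing subfolder: {folder}")
        stems[sub] = {p.stem for p in folder.iterdir() if p.is_file()}
    expected = stems["gt"]
    return {
        sub: sorted(expected - names)
        for sub, names in stems.items()
        if expected - names
    }
```

Calling check_data_root("PATH/TO/valset") and getting an empty dict back indicates that the four subfolders are aligned.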

Training

cd sr
python train.py --train \
  --base configs/irag.yaml \
  --gpus 0, --name irag --scale_lr False

Evaluation

cd sr
python eval.py \
  --config  configs/irag.yaml \
  --ckpt    path/to/iRAG.ckpt \
  --val-dir PATH/TO/valset \
  --ddim-steps 50 --colorfix all

A wavelet or AdaIN color fix (using the bicubic LQ as the color reference) can be applied to the SR output before scoring; --colorfix all reports none/adain/wavelet together in a single diffusion pass.
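
The AdaIN option can be pictured as per-channel statistics matching: each channel of the SR output is rescaled so that its mean and standard deviation equal those of the color reference. The sketch below is a minimal pure-Python illustration under that assumption. The name adain_color_fix and the dict-of-channels image layout are hypothetical, and the implementation in eval.py may differ (for example, by operating on tensors).

```python
from statistics import fmean, pstdev


def adain_color_fix(sr, ref, eps=1e-5):
    """Match each channel of `sr` to the mean/std of the same channel in `ref`.

    Both images are dicts mapping a channel name to a flat list of floats.
    `eps` guards against division by zero for constant channels.
    """
    fixed = {}
    for name, sr_vals in sr.items():
        ref_vals = ref[name]
        sr_mean, sr_std = fmean(sr_vals), pstdev(sr_vals)
        ref_mean, ref_std = fmean(ref_vals), pstdev(ref_vals)
        scale = ref_std / (sr_std + eps)
        # Normalize with the SR statistics, then re-color with the reference ones.
        fixed[name] = [(v - sr_mean) * scale + ref_mean for v in sr_vals]
    return fixed


# Toy single-channel example: the SR output is too dark and too flat.
sr_out = {"r": [0.0, 1.0, 2.0, 3.0]}
bicubic = {"r": [10.0, 12.0, 14.0, 16.0]}
fixed = adain_color_fix(sr_out, bicubic)
```

Only global color statistics change here; the spatial detail produced by the diffusion model is preserved up to an affine transform per channel.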

Download the pretrained models

Download the pretrained models from pretrained.

Citations

@InProceedings{lee2025irag,
    author    = {Lee, Byeonghun and Cho, Hyunmin and Choi, Hong Gyu and Kang, Soo Min and Ahn, Iljun and Jin, Kyong Hwan},
    title     = {Reference-based Super-Resolution via Image-based Retrieval-Augmented Generation Diffusion},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2025},
    pages     = {10764-10774}
}

License

This project and related weights are released under the Apache 2.0 license.