October 13, 2025

Diffusion Models For Low-Light Image Enhancement: A Multi-Perspective Taxonomy And Performance Analysis

A structured and visual companion repository summarizing the paper


Why this repo

This README is the fast path through the survey. It gives a taxonomy cheat sheet, dataset and metric lookup, a decision guide for selecting LLIE diffusion methods, and links back to exact sections of the paper for detail. The paper presents a six-part taxonomy, a benchmark and metrics view, challenges, and forward directions.


TL;DR of the survey

  • Six-part taxonomy for diffusion models in LLIE: Intrinsic Decomposition, Spectral and Latent, Accelerated, Guided, Multimodal, Autonomous. The taxonomy is grounded in model mechanism and conditioning signals.
  • Benchmarks and analysis cover datasets, metrics, and a cross-benchmark performance landscape, with deployment tradeoffs among fidelity, perception, and efficiency.
  • Practical challenges include latency, generalization, data scarcity, interpretability, and ethics. The paper outlines directions for on-device use and foundation model adaptation.

Quick navigator

Use the paper’s HTML view for jump links: HTML | arXiv

  • Key Observations
  • Background (LLIE problem framing and diffusion fundamentals)
  • Taxonomy (six categories with representative methods and tradeoffs)
  • Datasets and Metrics (FR, NR, distribution, and task-based)
  • Cross-Benchmark Landscape
  • Challenges (latency, generalization, data dependence, fidelity vs perception vs efficiency, interpretability, ethics)
  • Future Directions (foundation models, on-device real-time use, self-supervised and zero-shot learning, controllability)

Taxonomy cheat sheet

Short descriptions to help you decide what to read or build next. See Section 4 for details.

  • Intrinsic Decomposition
    Retinex- or physics-grounded formulations in which diffusion operates with priors on illumination or reflectance. Best when interpretability and physical plausibility matter.

  • Spectral and Latent
    Methods that operate in Fourier, wavelet, or latent spaces to reduce compute while preserving structure. Good for high-resolution inputs and faster sampling.

  • Accelerated
    Fewer sampling steps via trajectory optimization, distillation, or latent shortcuts. The place to look for real-time systems.

  • Guided
    Spatial masks, exposure controls, prompts, or instructions steer the enhancement. Useful for controllable brightness and region-aware edits.

  • Multimodal
    Fuse RGB with other sensors or align enhancement to downstream tasks. Robust in extreme darkness when sensors or auxiliary signals exist.

  • Autonomous
    Self-supervised, zero-shot, or unsupervised domain adaptation (UDA) setups that reduce reliance on paired data and improve scalability across domains.
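
For the Intrinsic Decomposition category, the Retinex model writes an image as the element-wise product of reflectance and illumination, I = R * L. The sketch below illustrates only that decomposition. The max-channel illumination estimate, the gamma adjustment, and the function names `retinex_decompose` and `recompose` are assumptions chosen for illustration, not a method from the survey; diffusion-based approaches learn far richer priors on R and L.

```python
def retinex_decompose(image, eps=1e-6):
    """Split an RGB image into illumination and reflectance under I = R * L.

    image: list of rows, each row a list of (r, g, b) tuples in [0, 1].
    Illumination is estimated as the per-pixel max over channels, a crude
    hand-written prior assumed here for illustration only.
    """
    illumination = [[max(px) for px in row] for row in image]
    reflectance = [
        [tuple(c / (lum + eps) for c in px) for px, lum in zip(row, lum_row)]
        for row, lum_row in zip(image, illumination)
    ]
    return illumination, reflectance


def recompose(illumination, reflectance, gamma=1.0):
    """Rebuild the image as R * L**gamma, clipped to 1.0.

    gamma = 1 reproduces the input (up to eps); gamma < 1 brightens dark regions.
    """
    return [
        [tuple(min(1.0, c * (lum ** gamma)) for c in px)
         for px, lum in zip(r_row, lum_row)]
        for r_row, lum_row in zip(reflectance, illumination)
    ]


dark = [[(0.02, 0.04, 0.01)]]
L, R = retinex_decompose(dark)
brighter = recompose(L, R, gamma=0.5)
```

Adjusting only the illumination map while keeping reflectance fixed is what makes this family of formulations easy to inspect.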


If you are here to choose a method

  • Need speed → read Accelerated and Spectral and Latent. Combine step reduction or distillation with latent or frequency spaces.
  • Need control → read Guided for exposure control, spatial masks, or instruction guidance.
  • Deploy on-device → see Sections 6.1 (Challenges) and 7.2 (Future Directions) on latency, memory, and energy, then pair with Accelerated.
  • Unpaired data → see Autonomous for zero shot and self supervised routes.
  • Downstream tasks → see Multimodal and task-aligned evaluation in 5.2.4.
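
The bullets above can be kept as a small lookup if you want to script a reading list. This is a sketch only: the keys, the `READING_GUIDE` name, and the `what_to_read` helper are hypothetical and simply mirror the list.

```python
# Hypothetical lookup that mirrors the decision bullets above.
READING_GUIDE = {
    "speed": ["Accelerated", "Spectral and Latent"],
    "control": ["Guided"],
    "on-device": ["Section 6.1", "Section 7.2", "Accelerated"],
    "unpaired-data": ["Autonomous"],
    "downstream-tasks": ["Multimodal", "Section 5.2.4"],
}


def what_to_read(*needs):
    """Return a de-duplicated reading list, preserving first-seen order."""
    seen = set()
    reading_list = []
    for need in needs:
        for item in READING_GUIDE[need]:  # raises KeyError on an unknown need
            if item not in seen:
                seen.add(item)
                reading_list.append(item)
    return reading_list


print(what_to_read("speed", "on-device"))
```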

Datasets at a glance

Common LLIE datasets referenced in the survey. Use this like a lookup card. The sizes and notes below match Table 2 in Section 5.1.

| Dataset | Type | Size | Summary |
|---|---|---|---|
| LOL | Paired | 500 pairs | Mostly indoor, real capture with varying exposure and ISO. |
| LOLv2 | Paired | Real 789, Synth 1000 | Indoor and outdoor, real capture and synthesis. |
| LSRW | Paired | 5,650 pairs | DSLR and smartphone, mild misalignment, diverse scenes. |
| SID | Paired RAW | 5,094 pairs | Extreme low light RAW to RGB, Sony and Fuji subsets. |
| ExDark | Unpaired | 7,363 | Object labels for task-based checks, 12 classes. |
| SICE | Multi exposure | 589 sequences | MEF and HDR style evaluation with under and over exposed content. |
| VE-LOL | Mixed | L: 2.5k, H: 11k | Diverse human centric content, face annotations in H. |
| NTIRE 2024 | Challenge (RAW) | 230 train, 70 val/test | High resolution night scenes, real capture. |
| MIT FiveK | Paired RAW | 5,000 | Expert retouch supervision for tonal edits. |
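
If you prefer to filter the datasets above programmatically, a small in-memory structure is enough. The sketch below copies the name, type, and size fields from the entries above; the `Dataset` tuple, the `DATASETS` list, and the `by_kind` helper are hypothetical names, not part of any released tooling.

```python
from typing import NamedTuple


class Dataset(NamedTuple):
    name: str
    kind: str
    size: str


# Values copied from the lookup card above; sizes kept as free-form strings.
DATASETS = [
    Dataset("LOL", "Paired", "500 pairs"),
    Dataset("LOLv2", "Paired", "Real 789, Synth 1000"),
    Dataset("LSRW", "Paired", "5,650 pairs"),
    Dataset("SID", "Paired RAW", "5,094 pairs"),
    Dataset("ExDark", "Unpaired", "7,363"),
    Dataset("SICE", "Multi exposure", "589 sequences"),
    Dataset("VE-LOL", "Mixed", "L: 2.5k, H: 11k"),
    Dataset("NTIRE 2024", "Challenge (RAW)", "230 train, 70 val/test"),
    Dataset("MIT FiveK", "Paired RAW", "5,000"),
]


def by_kind(keyword):
    """Return dataset names whose type contains `keyword` as a whole word.

    Whole-word matching keeps "paired" from also matching "Unpaired".
    """
    key = keyword.lower()
    return [
        d.name
        for d in DATASETS
        if key in d.kind.lower().replace("(", "").replace(")", "").split()
    ]


print(by_kind("raw"))
```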

Metrics guide

See Section 5.2 for pros, cons, and tradeoffs.

  • Full reference: PSNR, SSIM, LPIPS
  • No reference: NIQE, PI, BRISQUE family
  • Distribution: FID, KID, DISTS
  • Task based: impact on detection or recognition (mAP, accuracy)
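
As a concrete anchor for the full-reference family, PSNR is defined as 10 * log10(MAX^2 / MSE) between a reference and an enhanced image. The following is a minimal pure-Python sketch for single-channel images stored as nested lists in [0, 1]; the `psnr` function name and the input layout are assumptions, and real evaluations typically rely on an established library and the exact protocol of each benchmark.

```python
import math


def psnr(reference, enhanced, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two same-sized 2D images."""
    flat_ref = [v for row in reference for v in row]
    flat_enh = [v for row in enhanced for v in row]
    if len(flat_ref) != len(flat_enh) or not flat_ref:
        raise ValueError("images must be non-empty and the same size")
    mse = sum((a - b) ** 2 for a, b in zip(flat_ref, flat_enh)) / len(flat_ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


reference = [[0.5, 0.5], [0.5, 0.5]]
enhanced = [[0.6, 0.6], [0.6, 0.6]]
print(round(psnr(reference, enhanced), 2))
```

A uniform error of 0.1 on a [0, 1] scale gives an MSE of 0.01, so the sketch reports about 20 dB.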

Cross-benchmark landscape

The paper compares methods across datasets and shows how conclusions shift with metrics and domains. Read Section 5.3 before claiming a universal win. It may save you a revision cycle.


Practical challenges to plan for

  • Latency and compute: first constraint for real time and mobile targets. See 6.1 and 7.2.
  • Generalization across scenes and sensors: real darkness is not one distribution. See 6.2 and 6.3.
  • Fidelity vs perception vs efficiency: do not optimize one in isolation. See 6.4.
  • Interpretability and XAI: useful for safety and failure triage. See 6.5.
  • Ethics: strong enhancement can hallucinate plausible but false details. See 6.6.

Representative themes inside the taxonomy

Examples that appear in the survey narrative: guided exposure control and region-aware edits, multimodal fusion for robustness, and accelerated sampling or distillation for speed.

For a broader external index of diffusion papers in low-level vision, see this curated list.


How to use this repo

  • Use the Taxonomy to pick the right design axis for your application.
  • Use the Datasets table to choose training and evaluation splits that match your target domain.
  • Use the Metrics guide to choose the balance between fidelity and perception that fits your application.
  • Jump to the HTML view for details behind each choice.

Future directions worth tracking

  • Foundation model adaptation: steer large pretrained diffusion models toward LLIE with minimal fine-tuning.
  • On-device pipelines: combine step reduction with latent or spectral operations for real-time use on phones and edge cameras.
  • Principled self-supervised and zero-shot learning: transfer to new domains without paired data.
  • Better controllability and interpretability: important for professional workflows and safety contexts.
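
To make "step reduction" concrete, one common and generic form is to sample on an evenly strided subset of the training timesteps, as in DDIM-style samplers. The sketch below only builds such a schedule. The `strided_timesteps` helper and its defaults are assumptions for illustration and are not taken from the survey or from any specific LLIE method.

```python
def strided_timesteps(num_train_steps=1000, num_inference_steps=20):
    """Return a descending, evenly strided subset of training timesteps.

    With 1000 training steps and 20 inference steps the stride is 50,
    so sampling visits 950, 900, ..., 0 instead of all 1000 steps.
    """
    if not 0 < num_inference_steps <= num_train_steps:
        raise ValueError("need 0 < num_inference_steps <= num_train_steps")
    stride = num_train_steps // num_inference_steps
    return [i * stride for i in reversed(range(num_inference_steps))]


print(strided_timesteps(1000, 4))
```

The number of network evaluations drops in proportion to the stride, which is why this kind of schedule is usually the first lever to try before distillation.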

Contributing

If you spot a missing dataset quirk, a metric pitfall, or a new LLIE diffusion paper that fits the taxonomy, open a pull request. Include a short note on which taxonomy category it fits and the evaluation setting it uses.


📚 Citation

If you find this repository useful, please cite our paper:

@article{adhikarla2025diffusion,
  title={Diffusion Models for Low-Light Image Enhancement: A Multi-Perspective Taxonomy and Performance Analysis},
  author={Adhikarla, Eashan and Liu, Yixin and Davison, Brian D},
  journal={arXiv preprint arXiv:2510.05976},
  year={2025}
}