
May 10, 2024

neosr

neosr is a framework for training real-world single-image super-resolution networks. For details, see the wiki.


Join our Discord
news

05.09.2024 - Released the Real-PLKSR network. See the wiki.
05.08.2024 - Released the Nomos-v2 dataset. For more details, see datasets.

๐Ÿค support me

Tip

Consider supporting me on KoFi ☕ or Patreon.

💻 installation

Requires Python 3.11 and CUDA >=11.8. Install the latest PyTorch (>=2.1) and TorchVision (required).

Clone the repository:

git clone https://github.com/muslll/neosr
cd neosr

Then install other dependencies via pip:

pip install -e .

Alternatively, use poetry (recommended on Linux):

poetry install
poetry add torch@latest torchvision@latest

Note: You must use poetry shell to enter the environment after installation.

(optional) If you want to convert your models (convert.py), you need the following dependencies:

pip install onnx onnxruntime-gpu onnxconverter-common onnxsim

You can also install them using poetry (recommended on Linux):

poetry add onnx onnxruntime-gpu onnxconverter-common onnxsim

Please read the wiki tutorial for converting your models.

quick start

Start training by running:

python train.py -opt options.yml

Where options.yml is a configuration file. Templates can be found in options.

Tip

Please read the wiki Configuration Walkthrough for an explanation of each option.
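The fragment below sketches what such a configuration file might contain. The exact key names follow BasicSR-style conventions and are assumptions here; the templates in options and the Configuration Walkthrough are authoritative.

```yaml
# illustrative sketch only -- copy a template from options/ for real training
name: 4x_example_run
model_type: default        # "default" or "otf" (see supported models)
scale: 4

datasets:
  train:
    type: paired           # dataset loader: paired, single, or otf
    dataroot_gt: /data/gt  # hypothetical paths
    dataroot_lq: /data/lq

network_g:
  type: compact            # any supported arch option, e.g. compact, esrgan
```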

features

Supported archs:

| arch | option |
|---|---|
| Real-ESRGAN | esrgan |
| SRVGGNetCompact | compact |
| SwinIR | swinir_small, swinir_medium |
| HAT | hat_s, hat_m, hat_l |
| OmniSR | omnisr |
| SRFormer | srformer_light, srformer_medium |
| DAT | dat_small, dat_medium, dat_2 |
| DITN | ditn |
| DCTLSA | dctlsa |
| SPAN | span |
| Real-CUGAN | cugan |
| CRAFT | craft |
| SAFMN | safmn, safmn_l |
| RGT | rgt, rgt_s |
| ATD | atd, atd_light |
| PLKSR | plksr, plksr_tiny |
| RealPLKSR | realplksr, realplksr_s |
| DRCT | drct, drct_l, drct_s |
| EFEN | efen |

Note

For all arch-specific parameters, read the wiki.

Under Testing

| arch | option |
|---|---|

Supported Discriminators:

| net | option |
|---|---|
| U-Net w/ SN | unet |
| A2-FPN w/ SN | a2fpn |
| PatchGAN w/ SN | patchgan |

Supported Optimizers:

| optimizer | option |
|---|---|
| Adam | Adam or adam |
| AdamW | AdamW or adamw |
| NAdam | NAdam or nadam |
| Lion | Lion or lion |
| LAMB | Lamb or lamb |
| Adan | Adan or adan |

Supported losses:

| loss | option |
|---|---|
| L1 Loss | L1Loss, l1 |
| L2 Loss | MSELoss, l2 |
| Huber Loss | HuberLoss, huber |
| CHC (Clipped Huber with Cosine Similarity Loss) | chc, chc_l2 |
| Perceptual Loss | perceptual_opt, PerceptualLoss |
| GAN | gan_opt, GANLoss, MultiScaleGANLoss |
| YCbCr Color Loss (bt601) | color_opt, colorloss |
| Luma Loss (CIE L*) | luma_opt, lumaloss |
| MS-SSIM | mssim_opt, mssim |
| LDL Loss | ldl_opt |
| Focal Frequency | ff_opt, focalfrequencyloss |
| DISTS | dists_opt, dists |
| Wavelet Guided | wavelet_guided |
| Gradient-Weighted | gw_opt, gw_loss |

Supported Augmentations:

| augmentation | option |
|---|---|
| Rotation | use_rot |
| Flip | use_hflip |
| MixUp | mixup |
| CutMix | cutmix |
| ResizeMix | resizemix |
| CutBlur | cutblur |

Supported models:

| model | description | option |
|---|---|---|
| Default | Base model, supports both Generator and Discriminator | default |
| OTF | Builds on top of default, adding Real-ESRGAN on-the-fly degradations | otf |

Supported dataset loaders:

| loader | option |
|---|---|
| Paired datasets | paired |
| Single datasets (for inference, no GT required) | single |
| Real-ESRGAN on-the-fly degradation | otf |
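As a sketch, selecting a loader in the configuration might look like the fragment below. Field names follow BasicSR-style conventions and are assumptions here; see the templates in options for the exact layout.

```yaml
datasets:
  val:
    type: single                # no GT required, suitable for inference
    dataroot_lq: /data/val_lq   # hypothetical path
```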

📸 datasets

As part of neosr, I have released a dataset series called Nomos. The purpose of these datasets is to distill only the best images from the academic and community datasets. A total of 14 datasets were manually reviewed and processed, including: Adobe-MIT-5k, RAISE, LSDIR, LIU4k-v2, KONIQ-10k, Nikon LL RAW, DIV8k, FFHQ, Flickr2k, ModernAnimation1080_v2, Rawsamples, SignatureEdits, Hasselblad raw samples and Unsplash.

  • Nomos-v2 (recommended): contains 6000 images, multipurpose. Data distribution:
```mermaid
pie
  title Nomos-v2 distribution
  "Animal / fur" : 439
  "Interiors" : 280
  "Exteriors / misc" : 696
  "Architecture / geometric" : 1470
  "Drawing / painting / anime" : 1076
  "Humans" : 598
  "Mountain / Rocks" : 317
  "Text" : 102
  "Textures" : 439
  "Vegetation" : 574
```
  • nomos_uni (recommended for lightweight networks): contains 2989 images, multipurpose. Meant to be used on lightweight networks (<800k parameters).
  • hfa2k: contains 2568 anime images.
| dataset download | sha256 |
|---|---|
| nomosv2 (3GB) | sha256 |
| nomosv2.lmdb (3GB) | sha256 |
| nomosv2_lq_4x (187MB) | sha256 |
| nomosv2_lq_4x.lmdb (187MB) | sha256 |
| nomos_uni (1.3GB) | sha256 |
| nomos_uni.lmdb (1.3GB) | sha256 |
| nomos_uni_lq_4x | sha256 |
| nomos_uni_lq_4x.lmdb | sha256 |
| hfa2k | sha256 |

community datasets

Datasets made by the upscaling community. More info can be found in the Enhance Everything Discord.

  • FaceUp: Curated version of FFHQ
  • SSDIR: Curated version of LSDIR.
  • kim's 8k Dataset V2: Video Game Dataset
| dataset | download |
|---|---|
| @Kim2091 8k Dataset V2 | GDrive (33.5GB) |
| @Phhofm FaceUp | GDrive (4GB) |
| @Phhofm SSDIR | GDrive (4.5GB) |


📄 license and acknowledgements

Released under the Apache license. All third-party licenses are listed in license/readme. This code was originally based on BasicSR.

Thanks to victorca25/traiNNer, styler00dollar/Colab-traiNNer and timm for providing helpful insights into some problems.

Thanks to contributors @Phhofm, @Sirosky, @Kim2091, @terrainer, @Corpsecreate and @umzi2 for helping with tests and bug reporting.