
July 22, 2022

A Survey on Leveraging Pre-trained Generative Adversarial Networks for Image Editing and Restoration

arXiv

Abstract

Generative adversarial networks (GANs) have drawn enormous attention owing to their simple yet effective training mechanism and superior image generation quality. With the ability to generate photo-realistic high-resolution (e.g., $1024\times1024$) images, recent GAN models have greatly narrowed the gap between generated images and real ones. As a result, many recent works show growing interest in taking advantage of pre-trained GAN models by exploiting the well-disentangled latent space and the learned GAN priors. We briefly review recent progress on leveraging pre-trained large-scale GAN models from three aspects, i.e., 1) the training of large-scale generative adversarial networks, 2) exploring and understanding the pre-trained GAN models, and 3) leveraging these models for subsequent tasks like image restoration and editing.

Contents

Figure 1. Illustration of GAN inversion methods.

In this figure, $\mathbf{x}$ and $\mathbf{\hat{x}}$ are the given real image and the generated image, respectively. The red dotted line denotes supervision. The in-domain constraint requires that the generated image $\mathbf{\hat{x}}$ can be inverted back into the latent space. Here, $\mathbf{z}$ is not restricted to the $\mathcal{Z}$ space and may refer to a more generic latent code (e.g., $\mathbf{w}$, $\mathbf{f}$, etc.).

Figure Content (PDF file here)
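As a rough illustration of what inverting a real image $\mathbf{x}$ into a latent code means, the following minimal sketch runs optimization-based inversion on a toy problem. It assumes a hypothetical linear "generator" `generate` in place of a real pre-trained GAN, and it only minimizes the reconstruction error between $\mathbf{\hat{x}}$ and $\mathbf{x}$; the in-domain constraint is not modelled. All names (`generate`, `invert`, `W`) are illustrative.

```python
# Toy optimization-based inversion: find z so that G(z) is close to x.
# G is a hypothetical linear "generator", not a real pre-trained GAN.

def generate(weights, z):
    """Toy generator G(z) = W z, mapping a latent code to a flat 'image'."""
    return [sum(w * zi for w, zi in zip(row, z)) for row in weights]

def invert(weights, x, steps=500, lr=0.05):
    """Plain gradient descent on the reconstruction loss ||G(z) - x||^2."""
    z = [0.0] * len(weights[0])
    for _ in range(steps):
        residual = [g - t for g, t in zip(generate(weights, z), x)]
        # Gradient of ||W z - x||^2 with respect to z is 2 W^T (W z - x).
        grad = [2.0 * sum(weights[i][j] * residual[i] for i in range(len(weights)))
                for j in range(len(z))]
        z = [zj - lr * gj for zj, gj in zip(z, grad)]
    return z

W = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # 2-D latent -> 3-pixel "image"
z_true = [0.5, -1.0]
x = generate(W, z_true)                    # the "real" image to invert
z_hat = invert(W, x)                       # recovered latent code
x_hat = generate(W, z_hat)                 # reconstruction from that code
```

With a real generator, the same loop would typically use automatic differentiation and a perceptual loss in addition to the pixel loss; those choices vary across the methods listed in Table 1.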

Figure 2. A Summary of Relevant Papers. If you want to get the raw file, please refer to ProcessOn.com (passcode: 1qaz).

Figure 3. Illustration of recent GAN models (see (a)$\sim$(d)) and the latent spaces of StyleGAN series (see (e)).
(a) For PGGAN, the blue part denotes the progressive growing procedure from $4\times4$ to $8\times8$. The components drawn with dashed lines are employed for the fade-in strategy, where $\alpha$ gradually grows to 1; they are discarded when the model grows to a higher resolution. (b) For BigGAN, a specific noise is delivered to each layer together with the class embedding, and the model is trained end-to-end without the progressive growing procedure. (c) For StyleGAN, a series of FC layers is deployed to map $\mathbf{z}$ into $\mathbf{w}$. The green part belongs only to StyleGAN2. (d) For StyleGAN3, the generator is substantially modified to improve translation and rotation equivariance. The discriminator is omitted since it is identical to that used in StyleGAN2. (e) For simplicity, we take the StyleGAN series as an example to show the latent spaces based on the GAN inversion task.
Figure Content (PDF file here)
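The fade-in strategy mentioned for PGGAN in panel (a) can be sketched as a simple blend: while a new $8\times8$ block is being introduced, the output is a weighted mix of the upsampled $4\times4$ output and the new block's output, with the weight $\alpha$ growing to 1. The snippet below is a minimal sketch on plain nested lists, assuming nearest-neighbour upsampling; the helper names are hypothetical and no real network is involved.

```python
def upsample_nearest(img):
    """Double the resolution of a 2-D list by repeating each pixel 2x2."""
    out = []
    for row in img:
        wide = [p for p in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out

def fade_in(low_res, high_res, alpha):
    """Blend (1 - alpha) * upsample(low_res) + alpha * high_res."""
    up = upsample_nearest(low_res)
    return [[(1.0 - alpha) * u + alpha * h for u, h in zip(up_row, hi_row)]
            for up_row, hi_row in zip(up, high_res)]

low = [[float(r * 4 + c) for c in range(4)] for r in range(4)]  # stand-in 4x4 output
high = [[1.0] * 8 for _ in range(8)]                            # stand-in 8x8 output
start = fade_in(low, high, alpha=0.0)  # equals the upsampled 4x4 output
end = fade_in(low, high, alpha=1.0)    # equals the new 8x8 output
```

Once $\alpha$ reaches 1, the upsampling path no longer contributes, which matches the caption's remark that the dashed components are discarded afterwards.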

Table 1. A summary of GAN inversion and methods leveraging pre-trained GANs for image editing and restoration.

In the "Inversion Method" column, "O", "L", and "T" denote optimization-based, learning-based, and training-based (or fine-tuning) methods, respectively, while "/" means that the method performs no inversion. Numbers (without square brackets) are the indices of the methods in this table that are used for inversion. Note that the methods are ordered (roughly) by the date they became publicly accessible (e.g., the date of first appearance on arXiv, openreview.net, CVF Open Access, etc.).
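For readers who load the table programmatically, the legend above can be applied with a small parser. This is a minimal sketch: the function name, the tuple layout, and the "other" fallback for cells that match none of the documented codes are assumptions made for illustration.

```python
INVERSION_CODES = {
    "O": "optimization-based",
    "L": "learning-based",
    "T": "training-based (or fine-tuning)",
    "/": "no inversion",
}

def parse_inversion(cell):
    """Split an 'Inversion Method' cell into (kind, value) pairs."""
    parsed = []
    for token in (t.strip() for t in cell.split(",")):
        if token in INVERSION_CODES:
            parsed.append(("type", INVERSION_CODES[token]))
        elif token.isdigit():
            # A bare number is the index (No.) of another method in the table.
            parsed.append(("method_no", int(token)))
        else:
            # Anything else is kept verbatim rather than guessed at.
            parsed.append(("other", token))
    return parsed

example = parse_inversion("47, O")
```

Here `example` pairs a reference to method No. 47 with the optimization-based type, which is how a cell such as `47, O` reads under the legend.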

Abbreviations

$^\ast$ Abbreviations: AD (ADE20K), AF (AFHQ), CA (CelebA), CD (CACD), CF (CIFAR), CH (CelebA-HQ), CM (CelebAMask-HQ), CO (MS COCO), CS (CityScapes), CU (Caltech-UCSD Birds), DA (Danbooru, aka Anime Faces), DF (DeepFashion), FF (FFHQ), FL (Flowers), IN (ImageNet), LF (LFW), LS (LSUN), MF (MetFaces), MN (MNIST), OM (Omniglot), P3 (Places365), PL (Places), PT (Oxford-IIIT Pet, aka Cats and Dogs), RA (RAVDESS), SC (Stanford Cars), SS (Streetscape), SV (SVHN), TR (Transient), UT (UT Zappos50K).

$^\dagger$ Abbreviations: AD (Adversarial Defense), AE (Attribute Editing, i.e., w/o reference), AN (Anomaly Detection), AR (Artifacts Removal), AT (Attribute Transfer, i.e., w/ reference), CO (Image Crossover), [U]DA ([Unsupervised] Domain Adaptation), DN (Image Denoising), FF (Face Frontalization), IC (Image Colorization), IG (Image Generation), IH (Information Hiding), Int (Interpolation), Inv (Inversion), IP (Inpainting), PI (Parsing or Segmentation to Image), SI (Sketch to Image), SR (Image Super-resolution), ST (Style Transfer), TR (Transform and Random Jittering).

$^\ddagger$ Some custom datasets collected or regenerated by the authors are omitted since they are not publicly available or can be generated automatically based on current public datasets.
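Several abbreviations (e.g., AD, CO, FF, TR) appear in both lists with different meanings, so a lookup has to be keyed by column. The sketch below shows one way to do that with only the overlapping entries filled in; the dictionary and function names are hypothetical, and unknown tokens are returned unchanged.

```python
# Only the abbreviations shared by both lists are included here.
DATASETS = {
    "AD": "ADE20K",
    "CO": "MS COCO",
    "FF": "FFHQ",
    "TR": "Transient",
}
APPLICATIONS = {
    "AD": "Adversarial Defense",
    "CO": "Image Crossover",
    "FF": "Face Frontalization",
    "TR": "Transform and Random Jittering",
}

def expand(cell, mapping):
    """Expand a comma-separated cell using the mapping for its column."""
    tokens = [tok.strip() for tok in cell.split(",")]
    return [mapping.get(tok, tok) for tok in tokens]

as_datasets = expand("FF, CO", DATASETS)
as_applications = expand("FF, CO", APPLICATIONS)
```

The same cell text therefore expands differently depending on whether it comes from the Dataset or the Application column.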

Table Content

| No. | Method | Publication | Backbone | Latent Space | Inversion Method | Dataset$^\ast$ | Application$^\dagger$ |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | BiGAN (Link) (Code) | ICLR 2017 | / | $\mathcal{Z}$ | T | MN, IN | Inv |
| 2 | ALI (Link) (Code) | ICLR 2017 | / | $\mathcal{Z}$ | T | CF, SV, CA, IN | Inv, Int |
| 3 | Zhu et al. (Link) (Code) | ECCV 2016 | DCGAN | $\mathcal{Z}$ | L, O | SH, LS, PL$^\ddagger$ | Inv, Int, AE |
| 4 | IcGAN (Link) (Code) | NeurIPSw 2016 | cGAN | $\mathcal{Z}$, $\mathcal{C}$ | L | MN, CA | Inv, AT, AE |
| 5 | Creswell et al. (Link) (Code) | T-NNLS 2018 | DCGAN, WGAN-GP | $\mathcal{Z}$ | O | OM, UT, CA | Inv |
| 6 | Lipton et al. (Link) (Code) | ICLRw 2017 | DCGAN | $\mathcal{Z}$ | O | CA | Inv |
| 7 | PGD-GAN (Link) (Code) | ICASSP 2018 | DCGAN | $\mathcal{Z}$ | O | MN, CA | Inv |
| 8 | Ma et al. (Link) (Code) | NeurIPS 2018 | DCGAN | $\mathcal{Z}$ | O | MN, CA | Inv, IP |
| 9 | Suzuki et al. (Link) (Code) | ArXiv 2018 | SNGAN, BigGAN, StyleGAN | $\mathcal{F}$ | 3 | IN, FL, FF, DA | CO |
| 10 | GANDissection (Link) (Code) | ICLR 2019 | PGGAN | $\mathcal{F}$ | / | LS, AD | AE, AR |
| 11 | NPGD (Link) (Code) | ICCV 2019 | DCGAN, SAGAN | $\mathcal{Z}$ | L, O | MN, CA, LS | Inv, SR, IP |
| 12 | Image2StyleGAN (Link) (Code) | ICCV 2019 | StyleGAN | $\mathcal{W}+$ | O | FF$^\ddagger$ | Inv, Int, AE, ST |
| 13 | Bau et al. (Link) (Code) | ICLRw 2019 | PGGAN, WGAN-GP, StyleGAN | $\mathcal{Z}$, $\mathcal{W}$ | L, O | LS | Inv |
| 14 | GANPaint (Link) (Demo) | ToG 2019 | PGGAN | $\mathcal{Z}$, $\Theta$ | L, O, T | LS | Inv, AE |
| 15 | InterFaceGAN (Link) (Code) | CVPR 2020 | PGGAN, StyleGAN | $\mathcal{Z}$, $\mathcal{W}$ | 3, 8 | CH | AE, AR |
| 16 | GANSeeing (Link) (Code) | ICCV 2019 | PGGAN, WGAN-GP, StyleGAN | $\mathcal{Z}$, $\mathcal{W}$ | 13 | LS | Inv |
| 17 | YLG (Link) (Code) | CVPR 2020 | SAGAN | $\mathcal{Z}$ | O | IN | Inv |
| 18 | Image2StyleGAN++ (Link) (Video) | CVPR 2020 | StyleGAN | $\mathcal{W}+$, $\mathcal{N}$ | O | LS, FF | Inv, CO, IP, AE, ST |
| 19 | mGANPrior (Link) (Code) | CVPR 2020 | PGGAN, StyleGAN | $\mathcal{Z}$ | O | FF, CH, LS | Inv, IC, SR, IP, DN, AE |
| 20 | MimicGAN (Link) | IJCV 2020 | DCGAN | $\mathcal{Z}$ | O | CA, FF, LF | Inv, UDA, AD, AN |
| 21 | PULSE (Link) (Code) | CVPR 2020 | StyleGAN | $\mathcal{Z}$ | O | FF, CH | Inv, SR |
| 22 | DGP (Link) (Code) | ECCV 2020 | BigGAN | $\mathcal{Z}$ | O, T | IN, P3 | Inv, Int, IC, IP, SR, AD, TR, AE |
| 23 | StyleGAN2Distillation (Link) (Code) | ECCV 2020 | StyleGAN2, pix2pixHD | $\mathcal{W}+$ | / | FF | AT, AE |
| 24 | EditingInStyle (Link) (Code) | CVPR 2020 | PGGAN, StyleGAN, StyleGAN2 | $\mathcal{F}$ | / | FF, LS | AT |
| 25 | StyleRig (Link) (Video) | CVPR 2020 | StyleGAN | $\mathcal{W}+$ | / | FF | AT |
| 26 | ALAE (Link) (Code) | CVPR 2020 | StyleGAN | $\mathcal{W}$ | T | MN, FF, LS, CH | Inv, AT |
| 27 | IDInvert (Link) (Code) | ECCV 2020 | StyleGAN | $\mathcal{W}+$ | L, O | FF, LS | Inv, Int, AE, CO |
| 28 | pix2latent (Link) (Code) | ECCV 2020 | BigGAN, StyleGAN2 | $\mathcal{Z}$ | O | IN, CO, CF, LS | Inv, TR, AE |
| 29 | IDDistanglement (Link) (Code) | ToG 2020 | StyleGAN | $\mathcal{W}$ | L | FF | Inv, AT |
| 30 | WhenAndHow (Link) | ArXiv 2020 | MLP | $\mathcal{Z}$ | O | MN | Inv, IP |
| 31 | Guan et al. (Link) | ArXiv 2020 | StyleGAN | $\mathcal{W}+$ | L, O | CH, CD | Inv, Int, AT, IC |
| 32 | SeFa (Link) (Code) | CVPR 2021 | PGGAN, BigGAN, StyleGAN | $\mathcal{Z}$ | 19, 27 | FF, CH, LS, IN, SS, DA | AE |
| 33 | GH-Feat (Link) (Code) | CVPR 2021 | StyleGAN | $\mathcal{S}$ | L | MN, FF, LS, IN | Inv, AT, AE |
| 34 | pSp (Link) (Code) | CVPR 2021 | StyleGAN2 | $\mathcal{W}+$ | L | FF, AF, CH, CM | Inv, FF, SI, SR |
| 35 | StyleFlow (Link) (Code) | ToG 2021 | StyleGAN, StyleGAN2 | $\mathcal{W}+$ | 12 | FF, LS | AT, AE |
| 36 | PIE (Link) (Code) | ToG 2020 | StyleGAN | $\mathcal{W}+$ | O | FF | AT, AE |
| 37 | Bartz et al. (Link) (Code) | BMVC 2020 | StyleGAN, StyleGAN2 | $\mathcal{Z}$, $\mathcal{W}+$ | L | FF, LS | Inv, DN |
| 38 | StyleIntervention (Link) | ArXiv 2020 | StyleGAN2 | $\mathcal{S}$ | O | FF | Inv, AE |
| 39 | StyleSpace (Link) (Code) | CVPR 2021 | StyleGAN2 | $\mathcal{S}$ | O | FF, LS | Inv, AE |
| 40 | Hijack-GAN (Link) (Code) | CVPR 2021 | PGGAN, StyleGAN | $\mathcal{Z}$ | / | CH | AE |
| 41 | NaviGAN (Link) (Code) | CVPR 2021 | pix2pixHD, BigGAN, StyleGAN2 | $\Theta$ | StyleGAN2 | FF, LS, CS, IN | AE |
| 42 | GLEAN (Link) (Code) | CVPR 2021 | StyleGAN | $\mathcal{W}+$ | L | FF, LS | Inv, SR |
| 43 | ImprovedGANEmbedding (Link) (Code) | ArXiv 2020 | StyleGAN, StyleGAN2 | $\mathcal{P}$ | O | FF, MF$^\ddagger$ | Inv, IC, IP, SR |
| 44 | GFPGAN (Link) (Code) | CVPR 2021 | StyleGAN2 | $\mathcal{W}$ | L | FF | Inv, SR |
| 45 | EnjoyEditing (Link) (Code) | ICLR 2021 | PGGAN, StyleGAN2 | $\mathcal{Z}$ | 12 | FF, CA, CH, P3, TR | Inv, AE |
| 46 | SAM (Link) (Code) | ToG 2021 | StyleGAN | $\mathcal{W}+$ | L | CA, CH | AE |
| 47 | e4e (Link) (Code) | ToG 2021 | StyleGAN2 | $\mathcal{W}+$ | L | FF, CH, LS, SC | Inv, AE |
| 48 | StyleCLIP (Link) (Code) | ICCV 2021 | StyleGAN2 | $\mathcal{W}+$, $\mathcal{S}$ | 47, O | FF, CH, LS, AF | AE |
| 49 | LatentComposition (Link) (Code) | ICLR 2021 | PGGAN, StyleGAN2 | $\mathcal{Z}$ | L | FF, CH, LS | Inv, IP, AT |
| 50 | GANEnsembling (Link) (Code) | CVPR 2021 | StyleGAN2 | $\mathcal{W}+$ | L, O | CH, SC, PT | Inv, AT |
| 51 | ReStyle (Link) (Code) | ICCV 2021 | StyleGAN2 | $\mathcal{W}+$ | L | FF, CH, SC, LS, AF | Inv, AE |
| 52 | E2Style (Link) (Code) | T-IP 2022 | StyleGAN2 | $\mathcal{W}+$ | L | FF, CH | Inv, SI, PI, AT, IP, SR, AE, IH |
| 53 | GPEN (Link) (Code) | CVPR 2021 | StyleGAN2 | $\mathcal{W}+$, $\mathcal{N}$ | L | FF, CH | Inv, SR |
| 54 | Consecutive (Link) (Code) | ICCV 2021 | StyleGAN | $\mathcal{W}+$ | O | FF, RA | Inv, Int, AE |
| 55 | BDInvert (Link) (Code) | ICCV 2021 | StyleGAN, StyleGAN2 | $\mathcal{F}$/$\mathcal{W}+$ | O | FF, CH, LS | Inv, AE |
| 56 | HFGI (Link) (Code) | CVPR 2022 | StyleGAN2 | $\mathcal{W}+$, $\mathcal{F}$ | L | FF, CH, SC | Inv, AE |
| 57 | VisualVocab (Link) (Code) | ICCV 2021 | BigGAN | $\mathcal{Z}$ | / | P3, IN | AE |
| 58 | HyperStyle (Link) (Code) | CVPR 2022 | StyleGAN2 | $\mathcal{W}+$ | L | FF, CH, AF | Inv, AE, ST |
| 59 | GANGealing (Link) (Code) | CVPR 2022 | StyleGAN2 | $\mathcal{W}$ | / | LS, FF, AF, CH, CU | TR |
| 60 | HyperInverter (Link) (Code) | CVPR 2022 | StyleGAN2 | $\mathcal{W}$, $\Theta$ | L | FF, CH, LS | Inv, Int, AE |
| 61 | InsetGAN (Link) (Code) | CVPR 2022 | StyleGAN2 | $\mathcal{W}+$ | O | FF, DF$^\ddagger$ | CO, IG |
| 62 | HairMapper (Link) (Code) | CVPR 2022 | StyleGAN2 | $\mathcal{W}+$ | 47 | FF, CM$^\ddagger$ | AE |
| 63 | SAMInv (Link) (Code) | CVPR 2022 | BigGAN-deep, StyleGAN2 | $\mathcal{W}+$, $\mathcal{F}$ | L | FF, LS, IN | Inv, AE |

Contributions

Pull requests are welcome for error correction and content expansion!

Tips:

  1. LaTeX and Markdown tables can be generated with tablesgenerator.com.
  2. You can download our table content from here and load it (or your own CSV files) into tablesgenerator.com.
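If you prefer a scripted route to the web tool, a CSV file can be turned into a Markdown table with the Python standard library. This is a minimal sketch, not part of the repository: the function name and the inline sample are illustrative, and cells containing `|` would need escaping that is not handled here.

```python
import csv
import io

def csv_to_markdown(csv_text):
    """Render CSV text (first row = header) as a Markdown pipe table."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, body = rows[0], rows[1:]
    lines = [
        "| " + " | ".join(header) + " |",
        "| " + " | ".join("---" for _ in header) + " |",
    ]
    lines += ["| " + " | ".join(row) + " |" for row in body]
    return "\n".join(lines)

# Inline sample using the first columns of Table 1.
sample = "No.,Method,Publication\n1,BiGAN,ICLR 2017\n2,ALI,ICLR 2017\n"
table = csv_to_markdown(sample)
print(table)
```

To convert a file instead, read its contents with `open(path, newline="").read()` and pass the text to `csv_to_markdown`.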

Citation

Please find more details in our paper. If you find it useful, please consider citing:

@article{liu2022pretrainedGANs,
  title={A Survey on Leveraging Pre-trained Generative Adversarial Networks for Image Editing and Restoration},
  author={Liu, Ming and Wei, Yuxiang and Wu, Xiaohe and Zuo, Wangmeng and Zhang, Lei},
  journal={arXiv preprint arXiv:2207.10309},
  year={2022}
}