INSTALLATION.md
May 27, 2023
1. Installation
1.1 Env
The environment is tested with Ubuntu 20.04 and Python 3.8, with an NVIDIA GPU and CUDA enabled. Anaconda or Miniconda is recommended for setting up the running environment. All package dependencies are listed in e4s_env.yaml, and it is convenient to create a conda environment with the conda env create -f e4s_env.yaml command.
💡 Hint: If you run into problems when installing dlib, consider installing it from conda-forge or building it manually.
If you plan to use SegNeXt-FaceParser as described in Section 1.3.1 below, some extra setup is needed. Click >>this link<< for the installation guide of SegNeXt-FaceParser. The combination mmcv-full==1.5.1 + mmcls==0.20.1 + the latest SegNeXt-FaceParser version has been tested.
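To compare an existing environment against the tested versions mentioned above, a minimal sketch such as the following could be used. It relies only on the standard library (importlib.metadata is available from Python 3.8); the TESTED_VERSIONS mapping and the check_versions helper are illustrative names and not part of the repository.

```python
import sys
from importlib import metadata

# Versions reported as tested in this guide (only needed for SegNeXt-FaceParser).
TESTED_VERSIONS = {"mmcv-full": "1.5.1", "mmcls": "0.20.1"}


def check_versions(tested=TESTED_VERSIONS):
    """Return {package: (installed_version_or_None, tested_version)}."""
    report = {}
    for name, wanted in tested.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (installed, wanted)
    return report


if __name__ == "__main__":
    print("Python", ".".join(map(str, sys.version_info[:3])), "(tested with 3.8)")
    for name, (installed, wanted) in check_versions().items():
        status = "OK" if installed == wanted else "differs from the tested version"
        print(f"{name}: installed={installed}, tested={wanted} -> {status}")
```

A mismatch does not necessarily mean the setup is broken; it only indicates that the combination differs from the one tested here.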
1.2 Pre-trained model
We provide a pre-trained RGI model that was trained on the FFHQ dataset for 300K iterations. Please fetch the model from this Google Drive link and place it in the pretrained_ckpts/e4s folder.
1.3 Other dependencies
1.3.1 Face Parser
We use a face parser to estimate the facial segmentation. Currently, we provide the following two pre-trained face parsers:
- face-parsing.PyTorch (the default one): repo
  Please download the pre-trained model here, and place it in the pretrained_ckpts/face_parsing folder.
- SegNeXt-FaceParser: repo
  Please download the pre-trained SegNeXt model (small | base), and place it in the pretrained_ckpts/face_parsing folder. The corresponding configuration files are already included in that folder.
💡 Hint: The following FaceVid2Vid and GPEN models are only used for face swapping. If only face editing is needed, skip directly to Section 2.
1.3.2 FaceVid2Vid: paper | unofficial-repo
This face reenactment model drives the source face to show a pose and expression similar to those of the target. Currently, we use zhanglonghao's implementation of FaceVid2Vid; the pre-trained model can be downloaded here (Vox-256-New). As with the other checkpoints, please put it in the pretrained_ckpts/facevid2vid folder.
1.3.3 GPEN: paper | repo
A face restoration model (GPEN) is used to improve the resolution of the intermediate driven face. You can execute the following script to fetch the required pre-trained models automatically:
cd pretrained_ckpts/gpen
sh ./fetch_gpen_models.sh
Alternatively, you can download the pre-trained models manually as follows:
| Model | Download link | Purpose |
|---|---|---|
| RetinaFace-R50 | https://public-vigen-video.oss-cn-shanghai.aliyuncs.com/robin/models/RetinaFace-R50.pth | Face detection |
| RealESRNet_x4 | https://public-vigen-video.oss-cn-shanghai.aliyuncs.com/robin/models/realesrnet_x4.pth | x4 super resolution |
| GPEN-BFR-512 | https://public-vigen-video.oss-cn-shanghai.aliyuncs.com/robin/models/GPEN-BFR-512.pth | GPEN pre-trained model |
| ParseNet | https://public-vigen-video.oss-cn-shanghai.aliyuncs.com/robin/models/ParseNet-latest.pth | Face parsing |
Make sure to place these checkpoint files in the pretrained_ckpts/gpen/weights folder.
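For a manual download without the shell script, the four URLs listed above can also be fetched from Python. The following is only a sketch under stated assumptions: download_gpen_weights is a hypothetical helper (it is not shipped with the repository), it uses urllib.request.urlretrieve from the standard library, and it assumes the links above are still reachable.

```python
import os
from urllib.request import urlretrieve

# Base URL and file names copied from the download links listed above.
BASE_URL = "https://public-vigen-video.oss-cn-shanghai.aliyuncs.com/robin/models/"
GPEN_FILES = [
    "RetinaFace-R50.pth",   # face detection
    "realesrnet_x4.pth",    # x4 super resolution
    "GPEN-BFR-512.pth",     # GPEN pre-trained model
    "ParseNet-latest.pth",  # face parsing
]


def download_gpen_weights(dest_dir="pretrained_ckpts/gpen/weights", fetch=urlretrieve):
    """Download every missing GPEN checkpoint into dest_dir.

    `fetch` is any callable taking (url, target_path); it defaults to
    urlretrieve and can be swapped out, e.g. for a dry run.
    Returns the list of paths that were fetched in this call.
    """
    os.makedirs(dest_dir, exist_ok=True)
    fetched = []
    for name in GPEN_FILES:
        target = os.path.join(dest_dir, name)
        if os.path.exists(target):
            continue  # already present, skip the download
        fetch(BASE_URL + name, target)
        fetched.append(target)
    return fetched
```

Calling download_gpen_weights() from the repository root would place the files in pretrained_ckpts/gpen/weights and skip any checkpoint that already exists there.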
2. Sanity check
After fetching these checkpoints, your pretrained_ckpts folder should look like this:
pretrained_ckpts/
├── auxiliray (optional for training)
│   ├── model_ir_se50.pth
│   └── model.pth
├── e4s
│   └── iteration_300000.pt
├── face_parsing
│   ├── 79999_iter.pth
│   ├── segnext.tiny.512x512.celebamaskhq.160k.py
│   ├── segnext.tiny.best_mIoU_iter_160000.pth (optional)
│   ├── segnext.base.512x512.celebamaskhq.160k.py
│   ├── segnext.base.best_mIoU_iter_140000.pth (optional)
│   ├── segnext.small.512x512.celebamaskhq.160k.py
│   ├── segnext.small.best_mIoU_iter_140000.pth (optional)
│   ├── segnext.large.512x512.celebamaskhq.160k.py
│   └── segnext.large.best_mIoU_iter_150000.pth (optional)
├── facevid2vid
│   ├── 00000189-checkpoint.pth.tar
│   └── vox-256.yaml
├── gpen
│   ├── fetch_gepn_models.sh
│   └── weights
│       ├── GPEN-BFR-512.pth
│       ├── ParseNet-latest.pth
│       ├── realesrnet_x4.pth
│       └── RetinaFace-R50.pth
├── put_ckpts_accordingly.txt
└── stylegan2 (optional for training)
    └── stylegan2-ffhq-config-f.pt
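The layout above can also be checked programmatically. The sketch below lists any expected checkpoint that is missing; missing_checkpoints is a hypothetical helper, and the split into REQUIRED and SWAP_ONLY files is an assumption based on the hint in Section 1.3 (FaceVid2Vid and GPEN are only used for face swapping) and on face-parsing.PyTorch being the default parser. Files marked optional in the tree are not checked.

```python
from pathlib import Path

# Paths are relative to pretrained_ckpts/ and taken from the tree above.
# Assumption: these two files are enough for face editing with the default parser.
REQUIRED = [
    "e4s/iteration_300000.pt",
    "face_parsing/79999_iter.pth",
]
# Assumption: these files are needed only for face swapping.
SWAP_ONLY = [
    "facevid2vid/00000189-checkpoint.pth.tar",
    "facevid2vid/vox-256.yaml",
    "gpen/weights/GPEN-BFR-512.pth",
    "gpen/weights/ParseNet-latest.pth",
    "gpen/weights/realesrnet_x4.pth",
    "gpen/weights/RetinaFace-R50.pth",
]


def missing_checkpoints(root="pretrained_ckpts", face_swapping=True):
    """Return the expected files (relative paths) that do not exist under root."""
    root = Path(root)
    expected = REQUIRED + (SWAP_ONLY if face_swapping else [])
    return [rel for rel in expected if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_checkpoints()
    if missing:
        print("Missing checkpoints:")
        for rel in missing:
            print("  -", rel)
    else:
        print("All expected checkpoints are in place.")
```

Passing face_swapping=False restricts the check to the files assumed to be needed for face editing only.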