VCGAN
June 19, 2025
VCGAN is a hybrid colorization model that works for both image and video colorization tasks.
link: https://ieeexplore.ieee.org/document/9721653
1 Prerequisites
This project is implemented with Python 3.6, CUDA 8.0, and PyTorch 1.0.0 (the minimum supported version).
In addition, the cupy, opencv, and scikit-image libraries are required.
Please build an appropriate environment for PWC-Net to compute optical flow.
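A possible environment setup is sketched below, assuming conda is available. Only Python 3.6, CUDA 8.0, and PyTorch 1.0.0 are stated above; the pip package names and the environment name are assumptions, not taken from the repository.

```shell
# Hypothetical environment setup; versions other than Python 3.6 and
# PyTorch 1.0.0 are not specified by the project and are guesses.
conda create -n vcgan python=3.6 -y
conda activate vcgan
pip install torch==1.0.0 cupy opencv-python scikit-image
```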
2 Visual comparisons of VCGAN and other SOTA video colorization methods
Here we visualize two samples included in the main paper.
2.1 DAVIS dataset - 'horsejump-high'
The representative image is shown below (please compare the salient regions marked by red rectangles):
The gifs are shown as:
| Column 1 | Column 2 | Column 3 | Column 4 |
|---|---|---|---|
| CIC | CIC+BTC | LTBC | LTBC+BTC |
| SCGAN | SCGAN+BTC | ChromaGAN | ChromaGAN+BTC |
| 3DVC | FAVC | VCGAN | |
| Grayscale | Ground Truth | | |

2.2 Videvo dataset - 'SkateboarderTableJump'
The representative image is shown below (please compare the salient regions marked by red rectangles):
The gifs are shown as:
| Column 1 | Column 2 | Column 3 | Column 4 |
|---|---|---|---|
| CIC | CIC+BTC | LTBC | LTBC+BTC |
| SCGAN | SCGAN+BTC | ChromaGAN | ChromaGAN+BTC |
| 3DVC | FAVC | VCGAN | |
| Grayscale | Ground Truth | | |

3 Visual comparisons of VCGAN and other SOTA image colorization methods
We visualize Figure 13 of the main paper for a better view.
4 Download pre-trained models
4.1 Training
Please download the pre-trained ResNet50-Instance-Normalized model and other pre-trained models here:
link: https://pan.baidu.com/s/1UH_MyLwCKPXyQag3k57RZg?pwd=d2fn code: d2fn
The archive contains a folder named trained_models; put it under the train folder. The hyper-parameters follow the settings of the original paper, except for the normalization.
4.2 Testing
Please download the pre-trained VCGAN models here:
link: https://pan.baidu.com/s/1jp2aUUYFT4A81ArerIsjyQ?pwd=j3gy code: j3gy
The archive contains a folder named models; put it under the train folder. Three models are provided; model1_Second_Stage_epoch500_bs1_448p.pth is used by default.
Note that they were re-trained on a single GPU, which might lead to slightly different results from the original models.
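Once downloaded, a checkpoint can be loaded in the usual PyTorch way. A minimal sketch follows; the Conv2d layer is only a placeholder for VCGAN's actual generator class, which is defined in the repository:

```python
import torch
import torch.nn as nn

# Placeholder network standing in for VCGAN's generator; substitute the
# actual network class from the repository before loading a real checkpoint.
net = nn.Conv2d(1, 2, kernel_size=3, padding=1)

# Save and reload a state dict, mirroring how a downloaded .pth file is used.
torch.save(net.state_dict(), "demo_checkpoint.pth")
state = torch.load("demo_checkpoint.pth", map_location="cpu")
net.load_state_dict(state)
```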
5 Use the code
Enter the train folder:

```bash
cd train
```
5.1 Training code
Put the pre-trained ResNet50-Instance-Normalized model into the trained_models folder, change the settings as needed, and train VCGAN in the first stage:

```bash
python train.py   # or: sh first.sh  (256x256 image resolution)
```

After the first-stage model is trained, run the following for the second stage:

```bash
python train2.py  # or: sh second.sh (256p video resolution)
python train2.py  # or: sh third.sh  (480p video resolution)
```
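The video stages train at different resolutions (256p, then 480p). As an illustration only, frames can be rescaled to a target short side before being fed to the network; this sketch is not necessarily how the repository's data loader does it:

```python
import torch
import torch.nn.functional as F

def resize_batch(frames, short_side):
    """Resize a batch of frames (N, C, H, W) so the shorter side equals short_side."""
    n, c, h, w = frames.shape
    scale = short_side / min(h, w)
    new_h, new_w = round(h * scale), round(w * scale)
    # Bilinear resampling, preserving the aspect ratio of the input frames.
    return F.interpolate(frames, size=(new_h, new_w),
                         mode="bilinear", align_corners=False)

# Example: downscale a 720p batch to a 256p short side.
batch = torch.zeros(1, 3, 720, 1280)
small = resize_batch(batch, 256)
```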
5.2 Testing code
For testing, please run one of the following (note that you need to change the model paths):

```bash
python test_model_second_stage_by_txt.py    # for the DAVIS dataset
python test_model_second_stage_by_txt2.py   # for the Videvo dataset
python test_model_second_stage_by_folder.py # for a single folder
```
Network interpolation can also be used to blend different models:

```bash
python network_interp.py
```
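Network interpolation typically blends the weights of two checkpoints of the same architecture. A minimal sketch of the idea follows; the function name and linear blending scheme here are illustrative, not necessarily what network_interp.py implements:

```python
import torch

def interpolate_state_dicts(sd_a, sd_b, alpha=0.5):
    """Linearly blend two state dicts: alpha * A + (1 - alpha) * B.

    Both checkpoints must come from the same architecture so that
    parameter names and shapes match key for key.
    """
    return {k: alpha * sd_a[k] + (1.0 - alpha) * sd_b[k] for k in sd_a}

# Example with tiny dummy "checkpoints":
sd_a = {"conv.weight": torch.zeros(2, 2)}
sd_b = {"conv.weight": torch.ones(2, 2)}
blended = interpolate_state_dicts(sd_a, sd_b, alpha=0.25)
```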
6 Related Projects
- SCGAN: Saliency Map-guided Colorization with Generative Adversarial Network (IEEE TCSVT 2020): Project Paper GitHub
- ChromaGAN: Adversarial Picture Colorization with Semantic Class Distribution (WACV 2020): Paper GitHub
- FAVC: Fully Automatic Video Colorization With Self-Regularization and Diversity (CVPR 2019): Project Paper GitHub
- 3DVC: Automatic Video Colorization using 3D Conditional Generative Adversarial Networks (ISVC 2019): Paper
- BTC: Learning Blind Video Temporal Consistency (ECCV 2018): Project Paper GitHub
- LRAC: Learning Representations for Automatic Colorization (ECCV 2016): Project Paper GitHub
- CIC: Colorful Image Colorization (ECCV 2016): Project Paper GitHub
- LTBC: Let there be Color!: Joint End-to-end Learning of Global and Local Image Priors for Automatic Image Colorization with Simultaneous Classification (ACM TOG 2016): Project Paper GitHub
- Pix2Pix: Image-to-Image Translation with Conditional Adversarial Nets (CVPR 2017): Project Paper GitHub
- CycleGAN: Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks (ICCV 2017): Project Paper GitHub