ColorMNet: A Memory-based Deep Spatial-Temporal Feature Propagation Network for Video Colorization
April 20, 2026 · View on GitHub
Project Page | Paper (ArXiv) | Supplemental Material | Code (Github)
This repository is the official PyTorch implementation of our paper, ColorMNet: A Memory-based Deep Spatial-Temporal Feature Propagation Network for Video Colorization.
Yixin Yang,
Jiangxin Dong,
Jinhui Tang,
Jinshan Pan
Nanjing University of Science and Technology
🔥 News
- [2026-04-20] Thanks to @dan64 for developing the CMNET2 project, which extends ColorMNet with an improved three-tier memory architecture inspired by XMem++, enabling robust colorization of long videos with hundreds of reference frames. For more details, see issue #22.
- [2025-10-05] Integrated with 🤗 Hugging Face! Try out the online demo here ↗. Note: due to the HF Pro Zero-GPU quota, this Space currently has only 25 minutes of Zero-GPU runtime per day. Please consider running the demo locally (app.py) or on Colab if you need more time.
- [2025-10-05] Added a Gradio demo; see app.py.
- [2024-11-14] Added metrics evaluation code; see evaluation.py. Demo command: pip install lpips && python evaluation_matrics/evaluation.py
- [2024-09-09] Added training code; see train.py.
- [2024-09-09] A Colab demo for ColorMNet is available.
- [2024-09-07] Added inference code and pretrained weights; see test.py.
- [2024-04-13] Project page released at ColorMNet Project. Please stay tuned.
Requirements
- Python 3.10+
- A CUDA-capable NVIDIA GPU. This is a GPU-based project: the spatial_correlation_sampler dependency (installed via Pytorch-Correlation-extension) requires CUDA and will not run on CPU-only setups.
- PyTorch 1.11+ (see PyTorch for installation instructions)
- torchvision corresponding to the PyTorch version
- OpenCV (try pip install opencv-python)
- Others: pip install -r requirements.txt
:briefcase: Dependencies and Installation
# git clone this repository
conda create -n colormnet python=3.10 -y
conda activate colormnet
pip install torch==2.0.1+cu118 torchvision==0.15.2+cu118 --index-url https://download.pytorch.org/whl/cu118
# install py-thin-plate-spline
git clone https://github.com/cheind/py-thin-plate-spline.git
cd py-thin-plate-spline && pip install -e . && cd ..
# install Pytorch-Correlation-extension
git clone https://github.com/ClementPinard/Pytorch-Correlation-extension.git
cd Pytorch-Correlation-extension && python setup.py install && cd ..
pip install -r requirements.txt
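After installation, a quick sanity check can confirm that the GPU stack and the compiled correlation extension resolve. This is an illustrative sketch; the module names are assumed from the dependencies above:

```python
# Sanity check: confirm each required module can be located.
# Module names assumed from the installation steps above.
import importlib.util

for mod in ("torch", "torchvision", "cv2", "spatial_correlation_sampler"):
    found = importlib.util.find_spec(mod) is not None
    print(f"{mod}: {'found' if found else 'MISSING'}")
```

If spatial_correlation_sampler reports MISSING, re-run the Pytorch-Correlation-extension build step with the CUDA toolkit visible on your PATH.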
:gift: Checkpoints
Download the pretrained models manually and put them in ./saves (create the folder if it doesn't exist).
| Name | URL |
|---|---|
| ColorMNet | model |
:zap: Quick Inference
- Test on Images:
For Windows users, please follow RuntimeError to avoid a multiprocessing runtime error in the data loader. Thanks to @UPstud.
CUDA_VISIBLE_DEVICES=0 python test.py
# Add --FirstFrameIsNotExemplar if the reference frame is not exactly the first input image. Make sure the reference frame and the input frames are the same size.
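Since the reference frame and the input frames must share the same resolution, a header check can catch mismatches before a run. This is a stdlib-only sketch; `png_size` is a hypothetical helper, not part of this repo:

```python
import struct

def png_size(path):
    """Return (width, height) parsed from a PNG file's IHDR chunk."""
    with open(path, "rb") as f:
        header = f.read(24)
    if header[:8] != b"\x89PNG\r\n\x1a\n":
        raise ValueError(f"{path} is not a PNG file")
    # IHDR payload starts at byte 16: 4-byte big-endian width, then height.
    return struct.unpack(">II", header[16:24])

# Example (hypothetical paths): compare the reference against the first frame.
# ref, frame = png_size("ref/0000.png"), png_size("input/00000.png")
# assert ref == frame, f"size mismatch: {ref} vs {frame}"
```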
- Gradio Demo:
CUDA_VISIBLE_DEVICES=0 python app.py
Train
Dataset structure for both the training set and the validation set
# Specify --davis_root and --validation_root
data_root/
├── 001/
│   ├── 00000.png
│   ├── 00001.png
│   ├── 00002.png
│   └── ...
├── 002/
│   ├── 00000.png
│   ├── 00001.png
│   ├── 00002.png
│   └── ...
└── ...
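Following the layout above, a small sketch can verify that each clip folder holds contiguously numbered frames before training starts. `check_dataset` is a hypothetical helper (not part of this repo), and it assumes zero-padded five-digit PNG names as shown:

```python
import os

def check_dataset(data_root):
    """Return clip folders whose .png frames are not contiguously numbered."""
    problems = []
    for clip in sorted(os.listdir(data_root)):
        clip_dir = os.path.join(data_root, clip)
        if not os.path.isdir(clip_dir):
            continue
        frames = sorted(f for f in os.listdir(clip_dir) if f.endswith(".png"))
        # Expect 00000.png, 00001.png, ... with no gaps.
        expected = [f"{i:05d}.png" for i in range(len(frames))]
        if frames != expected:
            problems.append(clip)
    return problems  # an empty list means the layout matches

# print(check_dataset("/path/to/your/training/data"))
```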
Training script
CUDA_VISIBLE_DEVICES=0 python -m torch.distributed.run \
--master_port 25205 \
--nproc_per_node=1 \
train.py \
--exp_id DINOv2FeatureV6_LocalAtten_DAVISVidevo \
--davis_root /path/to/your/training/data \
--validation_root /path/to/your/validation/data \
--savepath ./wandb_save_dir
To Do
- [x] Release training code
- [x] Release testing code
- [x] Release pre-trained models
- [x] Release demo
Citation
If our work is useful for your research, please consider citing:
@inproceedings{yang2024colormnet,
author = {Yang, Yixin and Dong, Jiangxin and Tang, Jinhui and Pan, Jinshan},
title = {ColorMNet: A Memory-based Deep Spatial-Temporal Feature Propagation Network for Video Colorization},
booktitle = {ECCV},
year = {2024}
}
License
This project is licensed under CC BY-NC-SA 4.0, while some of the methods adopted in this project carry other licenses. Please refer to LICENSES.md for details. Redistribution and use should follow this license.
Acknowledgement
This project is based on XMem. Some code is borrowed from DINOv2. Thanks for their awesome work.
Contact
This repo is currently maintained by Yixin Yang (@yyang181) and is for academic research use only.