Generative Inbetweening through Frame-wise Conditions-Driven Video Generation
February 27, 2025
Tianyi Zhu, Dongwei Ren, Qilong Wang, Xiaohe Wu, Wangmeng Zuo
This repository is the official PyTorch implementation of "Generative Inbetweening through Frame-wise Conditions-Driven Video Generation".
Our paper has been accepted to CVPR 2025.
Results
[Results table; columns: Input starting frame | Input ending frame | Inbetweening results. The example images are not included in this text version.]
Run inference demo
1. Setup environment
git clone https://github.com/Tian-one/FCVG.git
cd FCVG
conda create -n FCVG python=3.10.14
conda activate FCVG
pip install -r requirements.txt
2. Download models
- Download the GlueStick weights and put them in './models/resources'.
wget https://github.com/cvg/GlueStick/releases/download/v0.1_arxiv/checkpoint_GlueStick_MD.tar -P models/resources/weights
- Download the DWPose pretrained weights dw-ll_ucoco_384.onnx and yolox_l.onnx here, then put them in './checkpoints/dwpose'.
- Download our FCVG model here and put it in './checkpoints'.
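Before running inference, it can help to confirm that the downloaded weights are where the steps above put them. The following is a minimal sketch, not part of the repository: the helper name `find_missing_files` is hypothetical, and the GlueStick path assumes the file sits in the directory the wget command writes to ('models/resources/weights'). The FCVG checkpoint is omitted because its file names are not listed above.

```python
from pathlib import Path

# Paths taken from the download steps above. The GlueStick entry follows the
# wget target directory (an assumption); adjust it if you store the file elsewhere.
EXPECTED_FILES = [
    "models/resources/weights/checkpoint_GlueStick_MD.tar",
    "checkpoints/dwpose/dw-ll_ucoco_384.onnx",
    "checkpoints/dwpose/yolox_l.onnx",
]


def find_missing_files(repo_root=".", expected=EXPECTED_FILES):
    """Return the expected files that are not present under repo_root."""
    root = Path(repo_root)
    return [rel for rel in expected if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = find_missing_files()
    if missing:
        print("Missing files:")
        for rel in missing:
            print("  " + rel)
    else:
        print("All listed weight files were found.")
```

Run it from the repository root; an empty result means the three listed files are in place.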
3. Run the inference script
Run inference with the default settings:
bash demo.sh
or run the script directly, passing the arguments listed below:
python demo_FCVG.py
--pretrained_model_name_or_path: path to the pretrained SVD model folder; our models are fine-tuned from SVD-XT1.1
--controlnext_path: path to the ControlNeXt model
--unet_path: path to the fine-tuned UNet model
--image1_path: path to the start frame
--image2_path: path to the end frame
--output_dir: folder in which to save the results
--control_weight: weight of the frame-wise condition control, default is 1.0
--num_inference_steps: number of diffusion denoising steps, default is 25
--height: height of the input frames, default is 576
--width: width of the input frames, default is 1024
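The flags above can be assembled into a full command programmatically, which is convenient when running many frame pairs. This is a sketch only: `build_demo_command` is a hypothetical helper, every path in the usage example is a placeholder, and the defaults are copied from the list above rather than read from the script.

```python
import shlex

# Defaults as documented in the flag list above.
DOCUMENTED_DEFAULTS = {
    "control_weight": 1.0,
    "num_inference_steps": 25,
    "height": 576,
    "width": 1024,
}


def build_demo_command(pretrained_model, controlnext, unet,
                       image1, image2, output_dir, **overrides):
    """Build the argument list for demo_FCVG.py from the documented flags."""
    unknown = set(overrides) - set(DOCUMENTED_DEFAULTS)
    if unknown:
        raise ValueError(f"Unsupported options: {sorted(unknown)}")
    options = {**DOCUMENTED_DEFAULTS, **overrides}
    cmd = [
        "python", "demo_FCVG.py",
        "--pretrained_model_name_or_path", str(pretrained_model),
        "--controlnext_path", str(controlnext),
        "--unet_path", str(unet),
        "--image1_path", str(image1),
        "--image2_path", str(image2),
        "--output_dir", str(output_dir),
    ]
    for name, value in options.items():
        cmd += [f"--{name}", str(value)]
    return cmd


if __name__ == "__main__":
    # All paths here are placeholders; substitute your own locations.
    cmd = build_demo_command(
        "path/to/svd_model", "path/to/controlnext", "path/to/unet",
        "path/to/start.png", "path/to/end.png", "results",
        num_inference_steps=50,
    )
    print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the repository root. Building a list instead of a single string avoids quoting problems with paths that contain spaces.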
Datasets
You can download our test dataset here.
Citation
@article{zhu2024generative,
title={Generative Inbetweening through Frame-wise Conditions-Driven Video Generation},
author={Zhu, Tianyi and Ren, Dongwei and Wang, Qilong and Wu, Xiaohe and Zuo, Wangmeng},
journal={arXiv preprint arXiv:2412.11755},
year={2024}
}
Acknowledgements
Thanks to the authors of ControlNeXt, svd_keyframe_interpolation, GlueStick, and DWPose. Our code is based on their implementations.