Generative Inbetweening through Frame-wise Conditions-Driven Video Generation

February 27, 2025

Tianyi Zhu, Dongwei Ren, Qilong Wang, Xiaohe Wu, Wangmeng Zuo

This repository is the official PyTorch implementation of "Generative Inbetweening through Frame-wise Conditions-Driven Video Generation".

arXiv Project Page

🎉 Our paper has been accepted to CVPR 2025

🖼️ Results

Input starting frame | Input ending frame | Inbetweening results

⚙️ Run inference demo

1. Setup environment

git clone https://github.com/Tian-one/FCVG.git
cd FCVG
conda create -n FCVG python=3.10.14
conda activate FCVG
pip install -r requirements.txt
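
After installation, it can help to confirm that the active interpreter matches the Python version pinned above. The following is a minimal sketch and not part of the repository: the helper names `version_matches` and `missing_packages` are hypothetical, and checking for `torch` assumes that requirements.txt installs PyTorch under that package name.

```python
import importlib.util
import sys


def version_matches(actual, required=(3, 10)):
    """Return True when the major.minor part of `actual` equals `required`."""
    return tuple(actual[:2]) == tuple(required)


def missing_packages(names):
    """Return the top-level package names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # The conda command above pins Python 3.10.14; only major.minor is compared.
    if not version_matches(sys.version_info):
        print("Warning: expected Python 3.10, found", sys.version.split()[0])
    # Assumption: PyTorch is importable as `torch` once requirements.txt is installed.
    print("Missing packages:", missing_packages(["torch"]))
```

Run it inside the activated FCVG environment; an empty "Missing packages" list suggests the basic setup succeeded.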

2. Download models

  1. Download the GlueStick weights and put them in './models/resources/weights'.

    wget https://github.com/cvg/GlueStick/releases/download/v0.1_arxiv/checkpoint_GlueStick_MD.tar -P models/resources/weights
    
  2. Download the DWPose pretrained weights dw-ll_ucoco_384.onnx and yolox_l.onnx here, then put them in './checkpoints/dwpose'.

  3. Download our FCVG model here and put it in './checkpoints'.
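
Misplaced weights are a common cause of failed runs, so a quick check of the download locations can save time. The sketch below is an illustration only: the helper `find_missing` is hypothetical, and the file list is assembled from the `wget` target directory and the DWPose file names given in the steps above. The layout of the FCVG model inside './checkpoints' is not specified here, so it is not checked.

```python
from pathlib import Path

# Paths relative to the repository root, taken from the download steps above.
EXPECTED_FILES = [
    "models/resources/weights/checkpoint_GlueStick_MD.tar",
    "checkpoints/dwpose/dw-ll_ucoco_384.onnx",
    "checkpoints/dwpose/yolox_l.onnx",
]


def find_missing(root, expected=EXPECTED_FILES):
    """Return the entries of `expected` that are not regular files under `root`."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).is_file()]


if __name__ == "__main__":
    # Run from the FCVG repository root.
    missing = find_missing(".")
    if missing:
        print("Missing files:")
        for rel in missing:
            print("  " + rel)
    else:
        print("All expected weight files were found.")
```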

3. Run the inference script

Run inference with the default settings:

bash demo.sh

or run the script directly, passing the options listed below:

python demo_FCVG.py 

--pretrained_model_name_or_path: path to the pretrained SVD model folder; our models are fine-tuned from SVD-XT1.1
--controlnext_path: path to the ControlNeXt model
--unet_path: path to the fine-tuned UNet model
--image1_path: path to the start frame
--image2_path: path to the end frame
--output_dir: folder in which to save the results
--control_weight: frame-wise condition control weight (default: 1.0)
--num_inference_steps: number of diffusion denoising steps (default: 25)
--height: height of the input frames (default: 576)
--width: width of the input frames (default: 1024)
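
To show how these options fit together, here is a minimal sketch that assembles a full command line. The helper `build_demo_command` is hypothetical, every path in the usage example is a placeholder to replace with your own, and the sketch assumes the script accepts each flag followed by its value as a separate argument. The default values mirror the ones listed above.

```python
import shlex


def build_demo_command(image1_path, image2_path, output_dir, *,
                       pretrained_model_name_or_path, controlnext_path, unet_path,
                       control_weight=1.0, num_inference_steps=25,
                       height=576, width=1024):
    """Return the argument list for one demo_FCVG.py run."""
    options = {
        "--pretrained_model_name_or_path": pretrained_model_name_or_path,
        "--controlnext_path": controlnext_path,
        "--unet_path": unet_path,
        "--image1_path": image1_path,
        "--image2_path": image2_path,
        "--output_dir": output_dir,
        "--control_weight": control_weight,
        "--num_inference_steps": num_inference_steps,
        "--height": height,
        "--width": width,
    }
    command = ["python", "demo_FCVG.py"]
    for flag, value in options.items():
        command += [flag, str(value)]
    return command


if __name__ == "__main__":
    # All paths below are placeholders, not files shipped with the repository.
    cmd = build_demo_command(
        "path/to/start.png", "path/to/end.png", "path/to/results",
        pretrained_model_name_or_path="path/to/svd_model",
        controlnext_path="path/to/controlnext_model",
        unet_path="path/to/unet_model",
    )
    # Print a shell-ready line; pass `cmd` to subprocess.run to execute it instead.
    print(shlex.join(cmd))
```

Building the arguments as a list rather than a single string avoids quoting problems when paths contain spaces.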

✨ Datasets

You can download our test dataset here.

🖊️ Citation

@article{zhu2024generative,
  title={Generative Inbetweening through Frame-wise Conditions-Driven Video Generation},
  author={Zhu, Tianyi and Ren, Dongwei and Wang, Qilong and Wu, Xiaohe and Zuo, Wangmeng},
  journal={arXiv preprint arXiv:2412.11755},
  year={2024}
}

💞 Acknowledgements

Thanks to ControlNeXt, svd_keyframe_interpolation, GlueStick, and DWPose for their work. Our code builds on their implementations.