ReVideo: Remake a Video with Motion and Content Control
September 26, 2024 · View on GitHub
Chong Mou, Mingdeng Cao, Xintao Wang, Zhaoyang Zhang, Ying Shan, Jian Zhang
Introduction
ReVideo tackles local video editing: within a user-specified region, both the visual content and the motion trajectory can be modified.
🔰 New Features/Updates
- [2024/09/25] ReVideo is accepted by NeurIPS 2024.
- [2024/06/26] We release the code of ReVideo.
- [2024/05/26] Long video editing plan: We are collaborating with the Open-Sora Plan team to replace SVD with the Sora framework, making ReVideo suitable for long-video editing. Here are some preliminary results. This initial combination is still limited in quality on long videos; we will continue the collaboration and release high-quality long-video editing models.
| Generated by Open-Sora | Editing Result |
⏳ Todo
- [x] Code released (June 2024)
🔥🔥🔥 Main Features
Change content & Customize motion trajectory
| Input Video | Editing Result |
Change content & Keep motion trajectory
| Input Video | Editing Result |
Keep content & Customize motion trajectory
| Input Video | Editing Result |
Multi-area Editing
| Input Video | Editing Result |
🔧 Dependencies and Installation
- Python >= 3.8 (Anaconda or Miniconda recommended)
- PyTorch >= 2.0.1
pip install -r requirements.txt
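A typical setup, sketched as a conda workflow (the environment name `revideo` and the Python 3.10 choice are illustrative, not prescribed by the repo):

```shell
# Create and activate an isolated environment (Python >= 3.8 required)
conda create -n revideo python=3.10 -y
conda activate revideo

# Install PyTorch >= 2.0.1 (pick the CUDA build matching your driver;
# see pytorch.org for the exact command for your platform)
pip install "torch>=2.0.1"

# Install the remaining dependencies
pip install -r requirements.txt
```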
⏬ Download Models
All models are downloaded automatically; you can also download them manually from this url.
Since ReVideo is trained on top of Stable Video Diffusion, usage of the model must follow Stable Video Diffusion's NC-COMMUNITY LICENSE.
💻 How to Test
You can download the testset from https://huggingface.co/Adapter/ReVideo.
Inference requires at least 20GB of GPU memory for editing a 768x1344 video.
bash configs/examples/constant_motion/head6.sh
Description of input parameters
--s_h # Vertical (row) coordinate of the top-left corner of the editing region
--e_h # Vertical (row) coordinate of the bottom-right corner of the editing region
--s_w # Horizontal (column) coordinate of the top-left corner of the editing region
--e_w # Horizontal (column) coordinate of the bottom-right corner of the editing region
--ps_h # Vertical (row) coordinate of the trajectory start point
--pe_h # Vertical (row) coordinate of the trajectory end point
--ps_w # Horizontal (column) coordinate of the trajectory start point
--pe_w # Horizontal (column) coordinate of the trajectory end point
--x_bias_all # Horizontal offset of reciprocating motion
--y_bias_all # Vertical offset of reciprocating motion
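To picture how these parameters fit together, here is a small sketch of a straight-line trajectory between the start and end points, checked against the rectangular editing region. The function names are illustrative only and are not part of the ReVideo code:

```python
def linear_trajectory(ps_h, ps_w, pe_h, pe_w, num_frames):
    """Interpolate a straight-line point trajectory across frames,
    mirroring how --ps_h/--ps_w and --pe_h/--pe_w give start/end points."""
    points = []
    for t in range(num_frames):
        a = t / (num_frames - 1)  # 0.0 at the first frame, 1.0 at the last
        points.append((round(ps_h + a * (pe_h - ps_h)),
                       round(ps_w + a * (pe_w - ps_w))))
    return points

def in_region(h, w, s_h, e_h, s_w, e_w):
    """Check that a point lies inside the editing region defined by
    --s_h/--e_h (rows) and --s_w/--e_w (columns)."""
    return s_h <= h <= e_h and s_w <= w <= e_w

# Example: a 14-frame trajectory from (100, 200) to (360, 520),
# which should stay inside a full 768x1344 frame.
traj = linear_trajectory(100, 200, 360, 520, 14)
print(traj[0], traj[-1])  # (100, 200) (360, 520)
assert all(in_region(h, w, 0, 768, 0, 1344) for h, w in traj)
```

In this picture, `x_bias_all`/`y_bias_all` would add a back-and-forth offset on top of such a base trajectory rather than changing its endpoints.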
Related Works
[2] DragNUWA: Fine-grained Control in Video Generation by Integrating Text, Image, and Trajectory
[3] DragAnything: Motion Control for Anything using Entity Representation
[4] AnyV2V: A Plug-and-Play Framework For Any Video-to-Video Editing Tasks
🤗 Acknowledgements
We thank the authors of Stable Video Diffusion for releasing their code.