RGVI

June 2, 2025

This is the official PyTorch implementation of our paper:

Elevating Flow-Guided Video Inpainting with Reference Generation, AAAI 2025
Suhwan Cho, Seoung Wug Oh, Sangyoun Lee, Joon-Young Lee
Link: [AAAI] [arXiv]

You can also explore other related works at awesome-video-inpainting.

Demo Video

https://github.com/user-attachments/assets/5654ae19-4c8b-458a-872c-9cdf6bedf699

Existing video inpainting (VI) approaches struggle with the inherent ambiguity between propagating known content and generating new content. To address this, we propose a robust VI framework that integrates a large generative model to decouple the two. To further improve pixel distribution across frames, we introduce an advanced pixel propagation protocol named one-shot pulling. We also present the HQVI benchmark, a dataset specifically designed to evaluate VI performance in diverse and realistic scenarios.

Setup

1. Download YouTube-VOS from the official website (necessary only for PFCNet training).

2. For convenience, a pre-processed version of YouTube-VOS (resized to 240p) is also provided.

3. Download HQVI and DAVIS to evaluate video object removal performance.

4. Download DAVI and YTVI to evaluate video restoration performance.

5. Download FCNet and I3D weights and place them in the weights/ directory.

Running

Training

Train PFCNet on the YouTube-VOS dataset using the conventional random masking strategy.

PFCNet does not significantly affect RGVI's stability, so you are free to substitute a custom network of your choice.
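The random masking mentioned above can be illustrated with a minimal sketch. This is not the masking code used in this repository: the function name `random_rect_masks`, the rectangle shape, the size range, and the per-frame drift are all assumptions chosen for illustration. It produces one binary mask per frame, where 1 marks pixels to be inpainted.

```python
import random


def random_rect_masks(num_frames, height, width, max_shift=4, seed=None):
    """Return one binary mask per frame; 1 marks pixels to be inpainted.

    Hypothetical helper: a rectangle of random size starts at a random
    position and drifts by up to `max_shift` pixels per frame, clamped so it
    stays inside the frame. Assumes height >= 4 and width >= 4.
    """
    rng = random.Random(seed)
    # Rectangle covers between 1/4 and 1/2 of each side (an arbitrary choice).
    rect_h = rng.randint(height // 4, height // 2)
    rect_w = rng.randint(width // 4, width // 2)
    top = rng.randint(0, height - rect_h)
    left = rng.randint(0, width - rect_w)

    masks = []
    for _ in range(num_frames):
        mask = [[0] * width for _ in range(height)]
        for y in range(top, top + rect_h):
            for x in range(left, left + rect_w):
                mask[y][x] = 1
        masks.append(mask)
        # Random walk, clamped so the rectangle never leaves the frame.
        top = min(max(top + rng.randint(-max_shift, max_shift), 0), height - rect_h)
        left = min(max(left + rng.randint(-max_shift, max_shift), 0), width - rect_w)
    return masks
```

Passing a fixed `seed` makes the mask sequence reproducible, which helps when comparing training runs.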

Testing

Run RGVI with:

python run.py

Verify the following before running:
✅ Testing dataset selection
✅ GPU availability and configuration
✅ Input resolution selection
✅ Text prompt for generation mode
✅ Pre-trained model path
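Part of the checklist above can be automated before launching a run. The sketch below is an assumption-laden illustration: this README does not show how run.py is configured, so the function `preflight` and all of its parameters are hypothetical names, not options of run.py.

```python
from pathlib import Path


def preflight(dataset_root, checkpoint, resolution, prompt=None, generation_mode=False):
    """Return a list of human-readable problems; an empty list means ready.

    All parameter names are hypothetical and do not mirror run.py.
    """
    problems = []
    # Testing dataset selection: the dataset directory must exist.
    if not Path(dataset_root).is_dir():
        problems.append(f"dataset root not found: {dataset_root}")
    # Pre-trained model path: the checkpoint file must exist.
    if not Path(checkpoint).is_file():
        problems.append(f"checkpoint not found: {checkpoint}")
    # Input resolution selection: expects a (height, width) pair of positive ints.
    height, width = resolution
    if height <= 0 or width <= 0:
        problems.append(f"invalid resolution: {resolution}")
    # Text prompt: only required when generation mode is enabled.
    if generation_mode and not (prompt and prompt.strip()):
        problems.append("generation mode needs a non-empty text prompt")
    return problems
```

GPU availability is not covered here; in a PyTorch environment it can be checked separately with `torch.cuda.is_available()`.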

Run the evaluation code with:

python post.py

Verify the following before running:
✅ Dataset root specification
✅ Input resolution selection

Attachments

Pre-trained model (PFCNet)
Pre-computed results (STTN)
Pre-computed results (FGVC)
Pre-computed results (FuseFormer)
Pre-computed results (E2FGVI)
Pre-computed results (ProPainter)
Pre-computed results (RGVI)

Contact

Code and models are only available for non-commercial research purposes.
For questions or inquiries, feel free to contact:

E-mail: suhwanx@gmail.com

License

The files evaluator.py, pfcnet.py, post.py, rgvi.py, and run.py, along with related materials such as model checkpoints and experimental data, are licensed under the CC BY-NC License.

The files fcnet.py and i3d.py are from ProPainter and are licensed under the NTU S-Lab License 1.0.