Towards Real-Time Text2Video via CLIP-Guided, Pixel-Level Optimization

October 25, 2022

Peter Schaldenbrand, Zhixuan Liu, and Jean Oh. The Robotics Institute, Carnegie Mellon University

Demo on Replicate
Open In Colab
Read our paper on ArXiv

This repository presents an approach to generating videos from a series of language descriptions of the video's content. Currently, the only implementation is the Colab notebook linked above.
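
As a rough illustration of the idea named in the title (optimizing pixel values directly against a text-image similarity score), here is a minimal, dependency-free sketch. It is not the notebook's code. `encode` is a hypothetical linear stand-in for CLIP's image encoder, the "prompt embeddings" are toy vectors, and starting each frame from the previous one is an assumption made only for this example.

```python
import math
import random


def encode(pixels, weights):
    """Hypothetical stand-in for an image encoder: a fixed linear map (not CLIP)."""
    return [sum(w * p for w, p in zip(row, pixels)) for row in weights]


def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def similarity_grad(pixels, weights, target):
    """Analytic gradient of cosine(encode(pixels), target) w.r.t. each pixel."""
    emb = encode(pixels, weights)
    norm_e = math.sqrt(sum(x * x for x in emb))
    norm_t = math.sqrt(sum(y * y for y in target))
    cos = cosine(emb, target)
    # d cos / d emb_i = target_i / (|e||t|) - cos * emb_i / |e|^2
    d_emb = [t / (norm_e * norm_t) - cos * e / (norm_e ** 2)
             for e, t in zip(emb, target)]
    # Chain rule through the linear encoder: d cos / d pixel_j = sum_i d_emb_i * W[i][j]
    return [sum(d_emb[i] * weights[i][j] for i in range(len(weights)))
            for j in range(len(pixels))]


def optimize_frame(pixels, weights, target, steps=50, lr=0.05):
    """Gradient ascent directly on pixel values, clamped to [0, 1]."""
    pixels = list(pixels)
    for _ in range(steps):
        grad = similarity_grad(pixels, weights, target)
        pixels = [min(1.0, max(0.0, p + lr * g)) for p, g in zip(pixels, grad)]
    return pixels


def generate_video(prompt_embeddings, weights, first_frame, frames_per_prompt=3):
    """Optimize frames prompt by prompt; each frame starts from the previous one
    (an assumption of this sketch)."""
    frame, video = list(first_frame), []
    for target in prompt_embeddings:
        for _ in range(frames_per_prompt):
            frame = optimize_frame(frame, weights, target)
            video.append(frame)
    return video


rng = random.Random(0)
N_PIXELS, EMBED_DIM = 16, 4
weights = [[rng.gauss(0, 1) for _ in range(N_PIXELS)] for _ in range(EMBED_DIM)]
# Toy "text embeddings": encodings of two random images, so each target is reachable.
prompts = [encode([rng.random() for _ in range(N_PIXELS)], weights) for _ in range(2)]
start = [rng.random() for _ in range(N_PIXELS)]
video = generate_video(prompts, weights, start)
print(len(video), round(cosine(encode(video[-1], weights), prompts[-1]), 3))
```

In a real setup the stand-in encoder would be replaced by a pretrained CLIP model, with automatic differentiation supplying the gradient. The Colab notebook remains the reference for the actual method.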

A ballerina frog dancing

Please message Peter at pschalde at andrew dot cmu dot edu with any questions, or open a GitHub issue. Thanks!