Towards Real-Time Text2Video via CLIP-Guided, Pixel-Level Optimization
October 25, 2022
Peter Schaldenbrand, Zhixuan Liu, and Jean Oh. The Robotics Institute, Carnegie Mellon University
An approach to generating videos from a series of language descriptions of the video's content. Currently, the only implementation is a Colab notebook, which is linked above.

Please message Peter at pschalde at andrew dot cmu dot edu with any questions, or open a GitHub issue. Thanks!
Read our paper on arXiv.