E3M: Zero-Shot Spatio-Temporal Video Grounding

July 16, 2024

[ECCV 2024] Zero-Shot Spatio-Temporal Video Grounding with Expectation-Maximization Multimodal Modulation