EMNLP-2023-Papers
January 1, 2024
Language Grounding to Vision, Robotics and Beyond
| Title | Repo | Paper | Video |
|---|---|---|---|
| Models See Hallucinations: Evaluating the Factuality in Video Captioning | :heavy_minus_sign: | | |
| Describe Me an Auklet: Generating Grounded Perceptual Category Descriptions | :heavy_minus_sign: | | |
| Reading Books is Great, but not if You are Driving! Visually Grounded Reasoning about Defeasible Commonsense Norms | :heavy_minus_sign: | | |
| Bridging the Digital Divide: Performance Variation across Socio-Economic Factors in Vision-Language Models | :heavy_minus_sign: | | |
| 3DRP-Net: 3D Relative Position-Aware Network for 3D Visual Grounding | :heavy_minus_sign: | :heavy_minus_sign: | |
| Localizing Active Objects from Egocentric Vision with Symbolic World Knowledge |