[CVPR 2026] Rethinking Token Reduction for Large Vision-Language Models
March 21, 2026 · View on GitHub
Overview
This repository hosts the official implementation of our CVPR 2026 paper "Rethinking Token Reduction for Large Vision-Language Models". We propose a novel learning-based, prompt-agnostic token compression method tailored to Large Vision-Language Models (LVLMs) in multi-turn Visual Question Answering (MT-VQA) scenarios.
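To illustrate the general idea of prompt-agnostic token reduction (this is a generic sketch, not the paper's method: the scorer, keep ratio, and token shapes below are all illustrative assumptions), one can score each visual token once, independently of any text prompt, and keep only the top-scoring fraction before the tokens enter the language model:

```python
import math
import random

def score_tokens(tokens, w):
    """Per-token importance from a linear scorer (a stand-in for a learned module)."""
    return [sum(t_i * w_i for t_i, w_i in zip(t, w)) for t in tokens]

def reduce_tokens(tokens, w, keep_ratio=0.25):
    """Keep the top-scoring fraction of visual tokens.

    The scores depend only on the tokens themselves, so the same reduced
    set is reused across every turn of a multi-turn dialogue (prompt-agnostic).
    """
    k = max(1, math.ceil(len(tokens) * keep_ratio))
    scores = score_tokens(tokens, w)
    keep = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)[:k]
    keep.sort()  # restore original spatial order of the kept tokens
    return [tokens[i] for i in keep]

# Illustrative setup: 576 patch tokens (a common ViT grid), 8-dim for brevity.
random.seed(0)
dim = 8
tokens = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(576)]
w = [random.gauss(0, 1) for _ in range(dim)]

kept = reduce_tokens(tokens, w, keep_ratio=0.25)
print(len(kept))  # 144 tokens remain at a 4x compression rate
```

Because the selection never looks at the question text, the compressed token set can be cached once per image rather than recomputed per turn, which is the practical appeal of prompt-agnostic compression in MT-VQA.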
Status
⌛️ Code Release Update: The code is currently being organized and will be released as soon as possible.