MSVP
July 26, 2025
**[ICCV 2025]** Official PyTorch implementation of "Incremental Few-Shot Semantic Segmentation via Multi-Level Switchable Visual Prompts"
If you are in mainland China, please visit our Official Website.
[Paper] | [Arxiv] | [Supplementary]
:fire: Highlight
- An innovative VLM-based and prompt-based IFSS framework: We propose the first prompt-based framework for incremental few-shot semantic segmentation (IFSS). It uses textual semantics and visual prompts to encode foreground and background classes separately, enabling incremental semantic segmentation.
- Multi-level switchable visual prompts: We propose multi-level switchable visual prompts that customize multi-granular knowledge for each input image, improving the model's ability to learn novel classes while retaining knowledge of old classes.
- New SOTA performance: Extensive experiments demonstrate the effectiveness of the proposed method. Under the 1-shot condition, it achieves 49.1% mIoU-N on VOC and 25.6% mIoU-N on COCO, setting a new state of the art.
:crown: Overview
Images are fed into the query function to obtain global query features and pixel-wise query features. Stage-specific prompts are generated through attention-like integration over the global query features. Region-unique prompts are generated by nearest-neighbor matching against the clustered pixel-wise query features.
Image tokens, concatenated with the selected prompts, are fed into the pre-trained models to produce predictions. At each incremental training stage, the model is extended by fine-tuning the newly added stage-specific prompts and region-unique prompts.
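
Until the official code is released, the prompt selection described above can be illustrated with a small, dependency-free sketch. This is not the authors' implementation. The function names, the use of learnable keys paired with each prompt, the scaled dot-product weighting, the squared Euclidean distance, and the choice to prepend prompts to the image tokens are all assumptions made for illustration. The cluster centroids of the pixel-wise query features are taken as given (for example, from k-means).

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def integrate_stage_prompts(global_query, keys, prompts):
    """Attention-like integration (assumed form): softmax-weighted sum of prompts."""
    scale = math.sqrt(len(global_query))
    weights = softmax([dot(global_query, k) / scale for k in keys])
    dim = len(prompts[0])
    return [sum(w * p[i] for w, p in zip(weights, prompts)) for i in range(dim)]

def match_region_prompts(centroids, keys, prompts):
    """Nearest-neighbor matching: each centroid selects the prompt with the closest key."""
    selected = []
    for c in centroids:
        dists = [sum((ci - ki) ** 2 for ci, ki in zip(c, k)) for k in keys]
        selected.append(prompts[dists.index(min(dists))])
    return selected

def build_input_sequence(image_tokens, stage_prompt, region_prompts):
    # Assumption: prompts are prepended; the source only says "concatenated".
    return [stage_prompt] + region_prompts + image_tokens

# Toy 2-D example with hypothetical values.
stage_prompt = integrate_stage_prompts(
    global_query=[1.0, 0.0],
    keys=[[10.0, 0.0], [0.0, 10.0]],
    prompts=[[1.0, 0.0], [0.0, 1.0]],
)
region_prompts = match_region_prompts(
    centroids=[[0.9, 0.1], [0.0, 1.2]],
    keys=[[1.0, 0.0], [0.0, 1.0]],
    prompts=[[5.0, 5.0], [7.0, 7.0]],
)
image_tokens = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
sequence = build_input_sequence(image_tokens, stage_prompt, region_prompts)
```

In a real model these operations would run on batched tensors, and the keys and prompts would be learnable parameters. Under this sketch, extending the model at a new stage would amount to appending new key and prompt pairs and training only those.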
:art: Code
Coming soon.