December 31, 2025
Test-Time Visual In-Context Tuning
We present VICT, a test-time visual in-context tuning method that can adapt visual in-context learning models on the fly with a single test sample. VICT can be applied to a wide range of unseen domains and tasks at test time.
📖 For more results, please refer to our paper.
📣 News
- [03/2025] 🔥 VICT is released on arXiv.
🌟 Method
VICT is a simple yet effective test-time training approach to adapt visual in-context learning (VICL) models on the fly. The motivation is that each test input offers a hint about the test distribution. Thus, we modify a VICL model at test time to make full use of this hint by setting up a one-sample learning problem.
Specifically, we swap the roles of the task prompt and the test sample: the test input and its predicted output serve as the new prompt, and a self-supervised cycle-consistency loss trains the model to reconstruct the original task prompt output. Our key insight is that a model that can successfully recover the original task prompt should be aware of the new test distribution.
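The role flip and the cycle-consistency loss described above can be illustrated with a toy sketch. This is not the released VICT code: `toy_model`, its single scalar parameter `theta`, the list-of-floats "images", the finite-difference gradient, and all hyperparameters are hypothetical stand-ins chosen so the example runs with the Python standard library only. It also assumes gradients flow through both the forward and the flipped pass.

```python
# Toy sketch of test-time tuning with a cycle-consistency loss.
# Everything here is a hypothetical stand-in, not the VICT implementation.

def toy_model(theta, prompt_in, prompt_out, query_in):
    """Hypothetical in-context model: infer a per-element gain from the
    prompt pair and apply it to the query, scaled by a learnable theta."""
    return [theta * (po / pi) * q
            for pi, po, q in zip(prompt_in, prompt_out, query_in)]

def cycle_loss(theta, prompt_in, prompt_out, test_in):
    # Forward pass: predict the test output from the task prompt.
    test_pred = toy_model(theta, prompt_in, prompt_out, test_in)
    # Flipped pass: the test pair now acts as the prompt, and the
    # original prompt input becomes the query.
    prompt_rec = toy_model(theta, test_in, test_pred, prompt_in)
    # Mean squared error between reconstructed and original prompt output.
    return sum((r - o) ** 2 for r, o in zip(prompt_rec, prompt_out)) / len(prompt_out)

def tune_at_test_time(theta, prompt_in, prompt_out, test_in,
                      steps=100, lr=0.05, eps=1e-5):
    """Gradient descent on the cycle loss for one test sample.
    A central finite difference replaces autograd (assumption for brevity)."""
    for _ in range(steps):
        grad = (cycle_loss(theta + eps, prompt_in, prompt_out, test_in)
                - cycle_loss(theta - eps, prompt_in, prompt_out, test_in)) / (2 * eps)
        theta -= lr * grad
    return theta

# Task prompt demonstrating "halve the input"; values must be non-zero
# because the toy model divides by the prompt input.
prompt_in = [1.0, 2.0, 3.0]
prompt_out = [0.5, 1.0, 1.5]
test_in = [4.0, 8.0, 2.0]

theta_init = 0.7  # deliberately miscalibrated starting parameter
loss_before = cycle_loss(theta_init, prompt_in, prompt_out, test_in)
tuned_theta = tune_at_test_time(theta_init, prompt_in, prompt_out, test_in)
loss_after = cycle_loss(tuned_theta, prompt_in, prompt_out, test_in)

# Final prediction for the test sample with the tuned parameter.
prediction = toy_model(tuned_theta, prompt_in, prompt_out, test_in)
print(loss_before, loss_after, tuned_theta, prediction)
```

In this toy setting, the cycle loss is minimized when the reconstruction matches the original prompt output, so tuning needs no ground truth for the test sample. A real VICL model would replace `toy_model`, and automatic differentiation would replace the finite-difference gradient.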
🤗 Qualitative Examples
Unseen Domains
Middle-/High-Level Tasks with Corruptions
Low-Level Tasks with Corruptions
Unseen Tasks
🛠️ Usage
Installation
See installation instructions.
Data
See data instructions.
Training
Evaluation
👨‍💻 Todo
- Release the arXiv version.
- Release the code.
📘 Citation
If you find this work useful for your research, please consider citing our paper:
@inproceedings{xie2025test,
  title     = {Test-Time Visual In-Context Tuning},
  author    = {Xie, Jiahao and Tonioni, Alessio and Rauschmayr, Nathalie and Tombari, Federico and Schiele, Bernt},
  booktitle = {CVPR},
  year      = {2025}
}
❤️ Acknowledgement
We acknowledge the use of the following public code in this project: Painter, MAE, BEiT, detectron2, Mask2Former, bts, mmcv, mmdetection, mmpose, MIRNet, MPRNet, and Uformer.