| Unsupervised Feature Representation Learning for Domain-Generalized Cross-Domain Image Retrieval |  |  | :heavy_minus_sign: |
| DEDRIFT: Robust Similarity Search under Content Drift | :heavy_minus_sign: |  | :heavy_minus_sign: |
| Global Features are All You Need for Image Retrieval and Reranking |  |  | :heavy_minus_sign: |
| HSE: Hybrid Species Embedding for Deep Metric Learning | :heavy_minus_sign: |  | :heavy_minus_sign: |
| Discrepant and Multi-Instance Proxies for Unsupervised Person Re-Identification | :heavy_minus_sign: |  | :heavy_minus_sign: |
| Towards Grand Unified Representation Learning for Unsupervised Visible-Infrared Person Re-Identification |  |  | :heavy_minus_sign: |
| EigenPlaces: Training Viewpoint Robust Models for Visual Place Recognition |  |  | :heavy_minus_sign: |
| Simple Baselines for Interactive Video Retrieval with Questions and Answers |  |  | :heavy_minus_sign: |
| Fan-Beam Binarization Difference Projection (FB-BDP): A Novel Local Object Descriptor for Fine-Grained Leaf Image Retrieval | :heavy_minus_sign: |  | :heavy_minus_sign: |
| Conditional Cross Attention Network for Multi-Space Embedding without Entanglement in Only a SINGLE Network | :heavy_minus_sign: |  | :heavy_minus_sign: |
| Learning Concordant Attention via Target-Aware Alignment for Visible-Infrared Person Re-Identification | :heavy_minus_sign: |  | :heavy_minus_sign: |
| Person Re-Identification without Identification via Event Anonymization |  |  | :heavy_minus_sign: |
| Divide&Classify: Fine-Grained Classification for City-Wide Visual Geo-Localization |  |  | :heavy_minus_sign: |
| Dark Side Augmentation: Generating Diverse Night Examples for Metric Learning |  |  |  |
| PIDRo: Parallel Isomeric Attention with Dynamic Routing for Text-Video Retrieval | :heavy_minus_sign: |  | :heavy_minus_sign: |
| Unified Pre-Training with Pseudo Texts for Text-to-Image Person Re-Identification |  |  | :heavy_minus_sign: |
| Modality Unifying Network for Visible-Infrared Person Re-Identification | :heavy_minus_sign: |  | :heavy_minus_sign: |
| DeepChange: A Long-Term Person Re-Identification Benchmark with Clothes Change |  |  | :heavy_minus_sign: |
| LexLIP: Lexicon-Bottlenecked Language-Image Pre-Training for Large-Scale Image-Text Sparse Retrieval |  |  | :heavy_minus_sign: |
| Dual Pseudo-Labels Interactive Self-Training for Semi-Supervised Visible-Infrared Person Re-Identification |  |  | :heavy_minus_sign: |
| BT2: Backward-Compatible Training with Basis Transformation |  |  | :heavy_minus_sign: |
| Prototypical Mixing and Retrieval-based Refinement for Label Noise-Resistant Image Retrieval | :heavy_minus_sign: |  | :heavy_minus_sign: |
| Learning Spatial-Context-Aware Global Visual Feature Representation for Instance Image Retrieval |  |  | :heavy_minus_sign: |
| Coarse-to-Fine: Learning Compact Discriminative Representation for Single-Stage Image Retrieval |  |  | :heavy_minus_sign: |
| Visible-Infrared Person Re-Identification via Semantic Alignment and Affinity Inference |  |  | :heavy_minus_sign: |
| Part-Aware Transformer for Generalizable Person Re-Identification |  |  | :heavy_minus_sign: |
| Towards Universal Image Embeddings: A Large-Scale Dataset and Challenge for Generic Image Representations |  |  | :heavy_minus_sign: |
| Dual Learning with Dynamic Knowledge Distillation for Partially Relevant Video Retrieval | :heavy_minus_sign: |  | :heavy_minus_sign: |
| Fine-Grained Unsupervised Domain Adaptation for Gait Recognition | :heavy_minus_sign: |  |  |
| FashionNTM: Multi-Turn Fashion Image Retrieval via Cascaded Memory |  |  | :heavy_minus_sign: |
| CrossLoc3D: Aerial-Ground Cross-Source 3D Place Recognition |  |  | :heavy_minus_sign: |