Semantic Correspondence: Unified Benchmarking and a Strong Baseline
December 11, 2025
Kaiyan Zhang1, Xinghui Li, Jingyi Lu1, Kai Han1
1Visual AI Lab, The University of Hong Kong
Paper List
We provide a paper list for all the semantic correspondence estimation methods discussed in the paper.
We also maintain a repository, Awesome-Semantic-Correspondence, that collects papers on semantic correspondence estimation as the literature in this field grows. PRs are welcome!
Environment
The environment can be installed with conda and pip. After downloading the code, run the following commands:
$ conda create -n sc_baseline python=3.10
$ conda activate sc_baseline
$ conda install pytorch==1.13.1 torchvision==0.14.1 torchaudio==0.13.1 pytorch-cuda=11.6 -c pytorch -c nvidia
$ conda install xformers -c xformers
$ pip install yacs pandas scipy einops matplotlib triton timm diffusers accelerate transformers datasets tensorboard pykeops scikit-learn
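After installation, a quick check that every package can be located may save a failed training run later. The following is a minimal sketch, not part of the repository: the helper name `find_missing` is hypothetical, and the module list simply mirrors the install commands above (note that scikit-learn is imported as `sklearn`).

```python
from importlib.util import find_spec

# Import names for the packages installed above. scikit-learn is imported
# as `sklearn`; the other import names match their package names.
REQUIRED = [
    "torch", "torchvision", "torchaudio", "xformers", "yacs", "pandas",
    "scipy", "einops", "matplotlib", "triton", "timm", "diffusers",
    "accelerate", "transformers", "datasets", "tensorboard", "pykeops",
    "sklearn",
]


def find_missing(modules):
    """Return the names in `modules` that cannot be located (hypothetical helper)."""
    # find_spec returns None for a top-level module that is not installed;
    # it locates the module without importing it.
    return [name for name in modules if find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED)
    print("All packages found." if not missing else f"Missing: {missing}")
```

This only confirms that the modules are discoverable in the active environment; it does not verify versions or CUDA availability.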
Data
Download the datasets you need into the 'asset' folder.
PF-Pascal
- Download the PF-Pascal dataset from link.
- Rename the outermost directory from PF-dataset-PASCAL to pf-pascal.
- Download lists for image pairs from link.
- Place the lists for image pairs under the pf-pascal directory.
PF-Willow
- Download the PF-Willow dataset from link.
- Rename the outermost directory from PF-dataset to pf-willow.
- Download lists for image pairs from link.
- Place the lists for image pairs under the pf-willow directory.
SPair-71k
Download the SPair-71k dataset from link. After extraction, no further action is required.
AP-10k
Follow the instructions of GeoAware-SC to prepare the AP-10k dataset.
The structure should be:
asset
├── ap-10k
│   ├── annotations
│   ├── ImageAnnotation
│   ├── JPEGImages
│   ├── PairAnnotation
├── pf-pascal
│   ├── PF-dataset-PASCAL
│   │   ├── test_pairs.csv
│   │   ├── trn_pairs.csv
│   │   └── val_pairs.csv
├── pf-willow
│   ├── PF-dataset
│   │   └── test_pairs.csv
└── SPair-71k
    ├── devkit
    ├── ImageAnnotation
    ├── JPEGImages
    ├── Layout
    ├── PairAnnotation
    ├── Segmentation
    └── Visualization
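The layout above can be checked before training. The following is a minimal sketch, not part of the repository: the `EXPECTED` table and the helper `missing_entries` are hypothetical names, and the paths are copied from the tree as shown.

```python
from pathlib import Path

# Entries expected under each dataset folder, copied from the tree above.
EXPECTED = {
    "ap-10k": ["annotations", "ImageAnnotation", "JPEGImages", "PairAnnotation"],
    "pf-pascal": [
        "PF-dataset-PASCAL/test_pairs.csv",
        "PF-dataset-PASCAL/trn_pairs.csv",
        "PF-dataset-PASCAL/val_pairs.csv",
    ],
    "pf-willow": ["PF-dataset/test_pairs.csv"],
    "SPair-71k": [
        "devkit", "ImageAnnotation", "JPEGImages", "Layout",
        "PairAnnotation", "Segmentation", "Visualization",
    ],
}


def missing_entries(asset_root, dataset):
    """Return the expected entries of `dataset` that do not exist under `asset_root`."""
    base = Path(asset_root) / dataset
    return [rel for rel in EXPECTED[dataset] if not (base / rel).exists()]


if __name__ == "__main__":
    # Only the datasets you actually downloaded need to pass this check.
    for name in EXPECTED:
        missing = missing_entries("asset", name)
        print(f"{name}: {'ok' if not missing else 'missing ' + ', '.join(missing)}")
```

Run it from the repository root so that the relative path `asset` resolves correctly.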
Training
The configuration file for training and testing is located at config/base.py. To train the model, run:
sh train.sh
Some important parameters here include:
- dataset: dataset name, choose from 'spair', 'ap10k', 'pfwillow' or 'pfpascal'.
- method: set to 'dino' to use DINOv2 as the backbone.
- pre_extract: pre-extract image features to speed up validation.
- train_sample and val_sample: only used for the AP-10k dataset.
- save_thre: threshold for saving the model within an epoch.
- eval_interval: iteration interval for validation.
- ckpt_dir: directory to save the model, train log and evaluation log.
- resume_dir: directory to resume training. If starting from scratch, set to 'None'.
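To illustrate how these parameters constrain one another, here is a small sanity check over a plain dictionary of options. This is a hypothetical sketch: the functions `check_options` and `starts_from_scratch` are not part of the repository, the dictionary is not the repository's config object, and the requirement that `eval_interval` be a positive integer is an assumption based on its description.

```python
# Dataset names listed in the parameter description above.
VALID_DATASETS = ("spair", "ap10k", "pfwillow", "pfpascal")


def check_options(opts):
    """Return a list of problems found in `opts` (illustrative dict, not the real config)."""
    problems = []
    if opts.get("dataset") not in VALID_DATASETS:
        problems.append(f"dataset must be one of {VALID_DATASETS}")
    interval = opts.get("eval_interval")
    # Assumption: an iteration interval should be a positive integer.
    if not isinstance(interval, int) or interval <= 0:
        problems.append("eval_interval must be a positive integer")
    if not opts.get("ckpt_dir"):
        problems.append("ckpt_dir must be set")
    return problems


def starts_from_scratch(opts):
    """True when resume_dir is 'None', which the README uses for a fresh run."""
    return opts.get("resume_dir") in (None, "None")


if __name__ == "__main__":
    # Example values below are placeholders, not recommended settings.
    example = {"dataset": "spair", "method": "dino", "eval_interval": 500,
               "ckpt_dir": "ckpt/example", "resume_dir": "None"}
    print(check_options(example), starts_from_scratch(example))
```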
Testing
python test.py --dataset ap10k --method dino --resolution 840 --batch_size 4 --ckpt_dir $directory_of_the_model$
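When evaluating several checkpoints, the command above can be assembled programmatically. The sketch below uses only the flags shown in that command; the helper `build_test_command` is a hypothetical name, and whether a resolution of 840 and a batch size of 4 suit datasets other than AP-10k is not stated here, so treat the defaults as placeholders.

```python
import shlex
import sys


def build_test_command(dataset, ckpt_dir, method="dino", resolution=840, batch_size=4):
    """Build the argument list for test.py using the flags shown above."""
    return [
        sys.executable, "test.py",
        "--dataset", dataset,
        "--method", method,
        "--resolution", str(resolution),
        "--batch_size", str(batch_size),
        "--ckpt_dir", ckpt_dir,
    ]


if __name__ == "__main__":
    # "path/to/checkpoints" is a placeholder for your own checkpoint directory.
    cmd = build_test_command("ap10k", "path/to/checkpoints")
    # Print the command; pass `cmd` to subprocess.run(cmd, check=True) to execute it.
    print(shlex.join(cmd))
```

Passing the arguments as a list avoids shell quoting problems when the checkpoint path contains spaces.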
We provide pretrained weights to reproduce the results in the paper; you can download them here.
| Method | SPair-71k | SPair-71k weights | AP-10k | AP-10k weights |
|---|---|---|---|---|
| Ours(DINOv2) | 85.1% | Google Drive | 87.4% | Google Drive |
Citation
@article{zhang2025semantic,
title={Semantic Correspondence: Unified Benchmarking and a Strong Baseline},
author={Kaiyan Zhang and Xinghui Li and Jingyi Lu and Kai Han},
journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
year={2025}
}