Semantic Correspondence: Unified Benchmarking and a Strong Baseline

December 11, 2025

TPAMI 2025

Kaiyan Zhang¹, Xinghui Li, Jingyi Lu¹, Kai Han¹

¹Visual AI Lab, The University of Hong Kong

Paper List

We provide a paper list for all the semantic correspondence estimation methods discussed in the paper.

Given the growing body of literature in the field, we also created a repo, Awesome-Semantic-Correspondence, to collect papers on semantic correspondence estimation. PRs are welcome!

Environment

The environment can be installed with conda and pip. After downloading the code, run the following commands:

$ conda create -n sc_baseline python=3.10
$ conda activate sc_baseline

$ conda install pytorch==1.13.1 torchvision==0.14.1 torchaudio==0.13.1 pytorch-cuda=11.6 -c pytorch -c nvidia
$ conda install xformers -c xformers
$ pip install yacs pandas scipy einops matplotlib triton timm diffusers accelerate transformers datasets tensorboard pykeops scikit-learn
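After installation, a quick check that the packages can be located may save a failed run later. The following is a minimal sketch, not part of the repository: the helper name missing_modules is hypothetical, and the list uses import names, which can differ from the pip names (scikit-learn is imported as sklearn).

```python
import importlib.util

# Import names for the packages installed above. The import name can
# differ from the pip name (scikit-learn is imported as sklearn).
REQUIRED_MODULES = [
    "torch", "torchvision", "torchaudio", "xformers",
    "yacs", "pandas", "scipy", "einops", "matplotlib", "triton", "timm",
    "diffusers", "accelerate", "transformers", "datasets", "tensorboard",
    "pykeops", "sklearn",
]


def missing_modules(names):
    """Return the top-level module names that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

Run it inside the activated sc_baseline environment. It only checks that each module can be found, not that the installed versions match the ones pinned above.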

Data

Download the datasets you need into the 'asset' folder.

PF-Pascal

Download the PF-Pascal dataset from link.
Rename the outermost directory from PF-dataset-PASCAL to pf-pascal.
Download lists for image pairs from link.
Place the lists for image pairs under pf-pascal directory.

PF-Willow

Download the PF-Willow dataset from link.
Rename the outermost directory from PF-dataset to pf-willow.
Download lists for image pairs from link.
Place the lists for image pairs under pf-willow directory.

SPair-71k

Download the SPair-71k dataset from link. After extraction, no further action is required.

AP-10k

Follow the instructions of GeoAware-SC to prepare the AP-10k dataset.

The resulting directory structure should be:

asset
├── ap-10k
│   ├── annotations
│   ├── ImageAnnotation
│   ├── JPEGImages
│   ├── PairAnnotation
├── pf-pascal
│   ├── PF-dataset-PASCAL
│   │   ├── test_pairs.csv
│   │   ├── trn_pairs.csv
│   │   └── val_pairs.csv
├── pf-willow
│   ├── PF-dataset
│   │   └── test_pairs.csv
└── SPair-71k
    ├── devkit
    ├── ImageAnnotation
    ├── JPEGImages
    ├── Layout
    ├── PairAnnotation
    ├── Segmentation
    └── Visualization
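The layout above can be verified before training. The sketch below is an illustration rather than repository code: the helper missing_paths and the EXPECTED_LAYOUT table are hypothetical names, the paths are copied from the tree above, and the dictionary keys reuse the dataset names listed in the Training section.

```python
from pathlib import Path

# Expected entries relative to the 'asset' folder, copied from the tree above.
EXPECTED_LAYOUT = {
    "ap10k": [
        "ap-10k/annotations",
        "ap-10k/ImageAnnotation",
        "ap-10k/JPEGImages",
        "ap-10k/PairAnnotation",
    ],
    "pfpascal": [
        "pf-pascal/PF-dataset-PASCAL/test_pairs.csv",
        "pf-pascal/PF-dataset-PASCAL/trn_pairs.csv",
        "pf-pascal/PF-dataset-PASCAL/val_pairs.csv",
    ],
    "pfwillow": [
        "pf-willow/PF-dataset/test_pairs.csv",
    ],
    "spair": [
        "SPair-71k/devkit",
        "SPair-71k/ImageAnnotation",
        "SPair-71k/JPEGImages",
        "SPair-71k/Layout",
        "SPair-71k/PairAnnotation",
        "SPair-71k/Segmentation",
        "SPair-71k/Visualization",
    ],
}


def missing_paths(asset_root, dataset):
    """Return the expected entries for `dataset` that do not exist under `asset_root`."""
    root = Path(asset_root)
    return [rel for rel in EXPECTED_LAYOUT[dataset] if not (root / rel).exists()]


if __name__ == "__main__":
    for name in EXPECTED_LAYOUT:
        missing = missing_paths("asset", name)
        status = "ok" if not missing else "missing: " + ", ".join(missing)
        print(f"{name}: {status}")
```

Only the datasets you actually downloaded need to pass; the check covers the entries shown in the tree and nothing beyond them.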

Training

The configuration file for training and testing can be found at config/base.py. To train the model, run:

sh train.sh

Some important parameters here include:

dataset: dataset name, choose from 'spair', 'ap10k', 'pfwillow' or 'pfpascal'.
method: set to 'dino' to use DINOv2 as the backbone.
pre_extract: pre-extract image features to speed up validation.
train_sample and val_sample: only used for the AP-10k dataset.
save_thre: threshold for saving the model within an epoch.
eval_interval: iteration interval for validation.
ckpt_dir: directory to save the model, train log and evaluation log.
resume_dir: directory to resume training. If starting from scratch, set to 'None'.
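To make the constraints in this list concrete, here is a small sketch that mirrors the parameter names in a dataclass and checks two of the rules stated above. It is not the repository's config/base.py: the class TrainOptions, the field types, and the example values are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Dataset names accepted by the 'dataset' parameter, as listed above.
VALID_DATASETS = ("spair", "ap10k", "pfwillow", "pfpascal")


@dataclass
class TrainOptions:
    # Field types are assumptions; the real definitions live in config/base.py.
    dataset: str
    method: str
    pre_extract: bool
    save_thre: float
    eval_interval: int
    ckpt_dir: str
    resume_dir: str = "None"            # 'None' means training from scratch
    train_sample: Optional[int] = None  # only used for AP-10k
    val_sample: Optional[int] = None    # only used for AP-10k

    def validate(self):
        """Return a list of human-readable problems (empty if none)."""
        problems = []
        if self.dataset not in VALID_DATASETS:
            problems.append(f"unknown dataset: {self.dataset!r}")
        uses_samples = self.train_sample is not None or self.val_sample is not None
        if uses_samples and self.dataset != "ap10k":
            problems.append("train_sample/val_sample are only used for ap10k")
        return problems


if __name__ == "__main__":
    # Example values are placeholders, not recommended settings.
    opts = TrainOptions(dataset="spair", method="dino", pre_extract=True,
                        save_thre=0.5, eval_interval=1000, ckpt_dir="ckpt")
    print(opts.validate())
```

A check like this catches a misspelled dataset name before any data is loaded; the actual values should still be set in config/base.py or train.sh.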

Testing

python test.py --dataset ap10k --method dino --resolution 840 --batch_size 4 --ckpt_dir $directory_of_the_model$
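When evaluating several checkpoints, it can help to assemble this command programmatically. The sketch below is a hypothetical wrapper, not part of the repository: build_test_command is an illustrative name, it uses only the flags shown in the command above, and its defaults are simply copied from that command, so they may not suit every dataset.

```python
import shlex
import subprocess
import sys


def build_test_command(dataset, ckpt_dir, method="dino", resolution=840, batch_size=4):
    """Assemble the argument list for test.py using the flags shown above."""
    return [
        sys.executable, "test.py",
        "--dataset", dataset,
        "--method", method,
        "--resolution", str(resolution),
        "--batch_size", str(batch_size),
        "--ckpt_dir", str(ckpt_dir),
    ]


def run_test(dataset, ckpt_dir, **kwargs):
    """Print and run the command; raises CalledProcessError if test.py fails."""
    cmd = build_test_command(dataset, ckpt_dir, **kwargs)
    print(shlex.join(cmd))
    subprocess.run(cmd, check=True)
```

For example, run_test("ap10k", "path/to/checkpoint") would launch the evaluation from the repository root, where "path/to/checkpoint" stands in for the directory of the model.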

We provide pretrained weights to reproduce the results in the paper; you can download them here.

Method          SPair-71k   Weights        AP-10k   Weights
Ours (DINOv2)   85.1%       Google Drive   87.4%    Google Drive

Citation

@article{zhang2025semantic,
      title={Semantic Correspondence: Unified Benchmarking and a Strong Baseline},
      author={Kaiyan Zhang and Xinghui Li and Jingyi Lu and Kai Han},
      journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
      year={2025}
}