Uncover Underlying Correspondence for Robust Multi-view Clustering

March 5, 2026

This repository provides the official implementation of our paper:

Haochen Zhou, Guofeng Ding, Mouxing Yang, Peng Hu, Yijie Lin, Xi Peng,
Uncover Underlying Correspondence for Robust Multi-view Clustering, ICLR 2026 (Oral). 👉 [Paper]

Introduction

  • Given that real-world multi-view data widely suffer from the Noisy Correspondence (NC) problem, this work explores a novel path for multi-view learning.
  • Unlike existing methods that rely heavily on pre-defined (and potentially noisy) cross-view pairs, we assume the underlying correspondences are unknown a priori and reformulate multi-view learning as a maximum likelihood estimation problem over them. We solve this objective with CorreGen, an EM-based algorithm that alternates between uncovering the latent soft correspondence distributions and robustly optimizing the representation learning network.
  • CorreGen theoretically unifies and generalizes the classic InfoNCE loss and achieves state-of-the-art performance across various complex noise scenarios, providing a fresh perspective on robust multi-view learning.
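
To make the alternation described above more concrete, here is a minimal pure-Python sketch of one EM-style round on toy embeddings. It is not the CorreGen implementation: the function names (`e_step`, `soft_contrastive_loss`, `info_nce`), the use of a row-wise softmax over cosine similarities as the soft correspondence estimate, and the temperature value are all assumptions made for illustration. The sketch only shows the general idea that a cross-entropy against a soft correspondence distribution reduces to the standard InfoNCE loss when that distribution is one-hot on the pre-defined pairs.

```python
import math


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def row_softmax(logits):
    """Numerically stable softmax applied to each row of a list of lists."""
    out = []
    for row in logits:
        m = max(row)
        exps = [math.exp(x - m) for x in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def similarity_logits(view_a, view_b, temperature=0.5):
    """Cosine similarity of every cross-view pair, scaled by the temperature."""
    a = [normalize(v) for v in view_a]
    b = [normalize(v) for v in view_b]
    return [[dot(u, v) / temperature for v in b] for u in a]


def e_step(view_a, view_b, temperature=0.5):
    """Assumed E-step: soft correspondence q[i][j] over view-b samples."""
    return row_softmax(similarity_logits(view_a, view_b, temperature))


def soft_contrastive_loss(view_a, view_b, q, temperature=0.5):
    """Assumed M-step objective: cross-entropy between a fixed q and model p."""
    p = row_softmax(similarity_logits(view_a, view_b, temperature))
    n = len(p)
    return -sum(
        q[i][j] * math.log(p[i][j]) for i in range(n) for j in range(n)
    ) / n


def info_nce(view_a, view_b, temperature=0.5):
    """Standard InfoNCE: only the pre-defined pair (i, i) counts as positive."""
    p = row_softmax(similarity_logits(view_a, view_b, temperature))
    return -sum(math.log(p[i][i]) for i in range(len(p))) / len(p)


# Toy embeddings for two views (hypothetical data).
view_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
view_b = [[0.9, 0.1], [0.1, 0.9], [1.0, 0.8]]

q_soft = e_step(view_a, view_b)  # soft correspondences
q_hard = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]

loss_soft = soft_contrastive_loss(view_a, view_b, q_soft)
loss_hard = soft_contrastive_loss(view_a, view_b, q_hard)  # equals info_nce
```

In a real training loop the soft correspondences would be held fixed while the network parameters are updated, and then re-estimated; this sketch omits the optimization step entirely.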

[Figure: framework]

Requirements

This repository depends on the following core libraries: PyTorch, NumPy, scikit-learn, SciPy, munkres. Recommended package versions are specified in requirements.txt. You can install them by running:

pip install -r requirements.txt

Configuration

The hyperparameters and training options for the four datasets used in the paper are provided as YAML configuration files in the /config directory.

Datasets

The /dataset directory provides the Scene15 and LandUse21 datasets. All datasets used in this project can be downloaded from Google Drive.

Usage

After cloning this repository, navigate to the project directory and run:

python main_train.py --config_file Scene15.yaml

By default, the Mismatch Ratio (MR) and Corruption Ratio (CR) are set to 0. You can specify different values via command-line arguments, for example:

python main_train.py --config_file Scene15.yaml --m_ratio 0.2 --c_ratio 0.2
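
For intuition about what these two ratios control, the sketch below simulates both kinds of noise on toy data. It is an illustration under stated assumptions, not the repository's data pipeline: it assumes that the Mismatch Ratio is the fraction of cross-view pairs whose second-view sample is swapped for another sample's, and that the Corruption Ratio is the fraction of samples perturbed with Gaussian noise. The function names `mismatch_permutation` and `corrupt_features` and the `noise_std` parameter are hypothetical.

```python
import random


def mismatch_permutation(n, m_ratio, seed=0):
    """Return perm, where perm[i] is the view-2 index paired with view-1 sample i.

    A fraction m_ratio of the pairs is deliberately misaligned (assumed meaning
    of the Mismatch Ratio).
    """
    rng = random.Random(seed)
    perm = list(range(n))
    k = int(round(m_ratio * n))
    if k < 2:  # a single sample cannot be mismatched with itself
        return perm
    chosen = rng.sample(range(n), k)
    # Rotate the chosen indices by one so every chosen pair becomes wrong
    # while perm stays a valid permutation.
    for src, dst in zip(chosen, chosen[1:] + chosen[:1]):
        perm[src] = dst
    return perm


def corrupt_features(features, c_ratio, noise_std=1.0, seed=0):
    """Add Gaussian noise to a fraction c_ratio of the samples (assumed meaning
    of the Corruption Ratio)."""
    rng = random.Random(seed)
    k = int(round(c_ratio * len(features)))
    corrupted = set(rng.sample(range(len(features)), k))
    return [
        [x + rng.gauss(0.0, noise_std) for x in row] if i in corrupted else list(row)
        for i, row in enumerate(features)
    ]


# With 10 samples and a ratio of 0.2, two pairs are misaligned
# and two samples are perturbed.
perm = mismatch_permutation(10, 0.2, seed=1)
noisy = corrupt_features([[0.0, 0.0] for _ in range(10)], 0.2, seed=1)
```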

Note that values defined in the file passed to --config_file override the corresponding command-line arguments.
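
The precedence rule above can be illustrated with a small sketch. This is not the code in main_train.py: `build_parser` and `merge_config` are hypothetical helpers, and a plain dictionary with a made-up `batch_size` key stands in for the parsed YAML file so that the example has no third-party dependencies. Only the flag names `--config_file`, `--m_ratio`, and `--c_ratio` come from the commands shown in this README.

```python
import argparse


def build_parser():
    """Hypothetical parser exposing the flags used in this README."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--config_file", type=str, default=None)
    parser.add_argument("--m_ratio", type=float, default=0.0)
    parser.add_argument("--c_ratio", type=float, default=0.0)
    return parser


def merge_config(cli_args, file_config):
    """Merge options so that values from the config file win on conflicts."""
    merged = vars(cli_args).copy()
    merged.update(file_config)  # file values overwrite command-line values
    return merged


args = build_parser().parse_args(
    ["--config_file", "Scene15.yaml", "--m_ratio", "0.2"]
)
# Stand-in for the parsed YAML content (hypothetical keys and values).
file_config = {"m_ratio": 0.5, "batch_size": 256}

merged = merge_config(args, file_config)
print(merged["m_ratio"])  # 0.5: the file value overrides the command line
print(merged["c_ratio"])  # 0.0: not in the file, so the parser default is kept
```

If the precedence works as stated, a flag such as --m_ratio would take effect only when the chosen YAML file does not define the same key.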

The training results are recorded in output/{training dataset}/log_train.txt.

Citation

If you find this repository useful in your research, please consider citing:

@inproceedings{zhou2026uncover,
  title={Uncover Underlying Correspondence for Robust Multi-view Clustering},
  author={Zhou, Haochen and Ding, Guofeng and Yang, Mouxing and Hu, Peng and Lin, Yijie and Peng, Xi},
  booktitle={The Fourteenth International Conference on Learning Representations},
  year={2026}
}

Acknowledgement

This implementation is based on DIVIDE.