March 17, 2026

Inference-Time Dynamic Modality Selection for Incomplete Multimodal Classification (ICLR 2026)

Siyi Du, Xinzhe Luo, Declan P. O'Regan, and Chen Qin

DyMo

(a-b) Evidence of the discarding-imputation dilemma: (a-1) vs. (a-2) recovery-free methods (e.g., ModDrop) learn less discriminative features because they ignore highly task-relevant missing modalities {M,T}; (b) recovery-based methods (e.g., MoPoE) generate unreliable reconstructions, such as low-fidelity (orange) or misaligned (yellow) ones. (c) DyMo addresses the dilemma by dynamically fusing task-relevant recovered modalities, improving accuracy by 1.61% on PolyMNIST, 1.68% on MST, and 3.88% on CelebA (Tab. 1).

This repository provides the official PyTorch implementation of Inference-Time Dynamic Modality Selection for Incomplete Multimodal Classification.

In addition to our DyMo, we include implementations of multiple baseline and comparison models, such as M3Care, MUSE, MTL, MAP, PDF, QMF, and DynMM. Please refer to the paper for detailed descriptions of these models.

Contact: s.du23@imperial.ac.uk (Siyi Du)

If you find this repository helpful, please consider giving it a :star:.

Updates

[2026-02-08] The arXiv paper and the code are released.

[2026-02-11] The model weights are uploaded to Hugging Face.

Our Multimodal Learning Research Line

This repository is part of our research line on multimodal learning.

TIP (ECCV 2024): An image-tabular pre-training framework for intra-modality missingness (siyi-wind/TIP)
STiL (CVPR 2025): A semi-supervised image-tabular framework for modality heterogeneity and limited labeled data (siyi-wind/STiL)
DyMo (ICLR 2026, this work): An inference-time dynamic modality selection framework for missing modalities (siyi-wind/DyMo)

1 Requirements
2 Preparation
3 Training & Testing
4 Checkpoints
5 Licence & Citation
6 Acknowledgements

1 Requirements

This codebase is implemented with Python 3.9.15, PyTorch 1.11.0, CUDA 11.3.1, and cuDNN 8.

cd DyMo/
conda env create --file environment.yaml
conda activate dymo
pip install numpy==1.23.5
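
After activating the environment, it can be useful to confirm that the installed packages match the versions listed above. The following is a minimal sketch, not part of the repository: the helper names `installed_version` and `check_environment` are hypothetical, and the expected versions are copied from this section.

```python
from importlib import metadata

# Versions stated in the Requirements section of this README.
EXPECTED = {"torch": "1.11.0", "numpy": "1.23.5"}


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_environment(expected=EXPECTED, lookup=installed_version):
    """Map each package name to a tuple (expected, found, matches)."""
    report = {}
    for name, want in expected.items():
        found = lookup(name)
        # PyTorch builds often carry a local suffix such as "+cu113";
        # only the part before "+" is compared.
        matches = found is not None and found.split("+")[0] == want
        report[name] = (want, found, matches)
    return report


if __name__ == "__main__":
    for name, (want, found, ok) in check_environment().items():
        status = "OK" if ok else "MISMATCH"
        print(f"{name}: expected {want}, found {found} [{status}]")
```

A mismatch does not necessarily mean the code will fail; it only flags a difference from the versions the codebase was developed with.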

2 Preparation

2.1 Data Downloading

We conduct experiments on five multimodal datasets with diverse modality combinations:

Dataset	Classification Task	#Modality	Modality Type	#Train	#Val	#Test	#Class
PolyMNIST	Digit	5	RGB image	60,000	3,000	7,000	10
MST	Digit	3	RGB image, Text	1,121,360	60,000	140,000	10
CelebA	Face attribute	2	RGB image, Text	162,770	19,962	19,867	2
DVM	Car model	2	RGB image, Table	70,565	17,642	88,207	283
UKBB	Coronary artery disease (CAD)	2	MR image, Table	3,482	6,510	3,617	2
UKBB	Myocardial infarction	2	MR image, Table	1,552	6,510	3,617	2

The preparation of PolyMNIST, MNIST-SVHN-TEXT (MST), and CelebA follows https://github.com/thomassutter/MoPoE.
Download the DVM dataset from here.
Apply for UKBB access here (Note that UKBB is semi-public and requires approval and access fees).

2.2 Data Pre-processing

For UKBB and DVM, we follow the same preprocessing pipeline as siyi-wind/TIP.

2.3 Modality Reconstructor Preparation

We evaluate DyMo with five modality recovery methods: MoPoE, MMVAE++, CMVAE, TIP, and Iterative Multivariate Imputer (IMI).

Dataset	Modality Recovery Model	Download Weights
PolyMNIST	MoPoE, MMVAE++, CMVAE	Folder
MST	MoPoE	Folder
CelebA	MoPoE	Folder
DVM	TIP, IMI	TIP
UKBB	TIP, IMI	TIP

TIP weights are from our ECCV 2024 paper (siyi-wind/TIP), while the other models were trained by us. For IMI, we directly provide the imputed DVM tables in datasets/IMI_imputed_data/DVM. Due to UKBB data policy, imputed UKBB tables are not released, but they can be generated using datasets/tabular_imputation_UKBB.ipynb once you have access to the UKBB dataset.

3 Training & Testing

Step 1: Train DynamicTransformer

cd DyMo/job_scripts
env CUDA_VISIBLE_DEVICES=0 bash CelebA_DynamicTransformer.sh

Trained model weights are provided in 4 Checkpoints.

Step 2: Generate Gaussian Parameters for the ICS Score

After training DynamicTransformer, generate and store the class-wise means and variances needed to calculate the ICS score (Eq. 8 in the paper). Use gaussian_parameter_generation/gaussian_COS.ipynb and gaussian_parameter_generation/gaussian_EU.ipynb to generate the Gaussian parameters for cosine distance and Euclidean distance, respectively.
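
To illustrate what "class-wise means and variances" refers to, here is a minimal standard-library sketch. It is not the notebooks' code: the function name `classwise_gaussian_parameters` and the list-based input format are assumptions, and the notebooks presumably operate on tensors extracted from the trained model. The ICS score itself (Eq. 8) is not reproduced here.

```python
from collections import defaultdict
from statistics import fmean, pvariance


def classwise_gaussian_parameters(features, labels):
    """Per-class, per-dimension mean and variance of feature vectors.

    features: list of equal-length lists of floats (one embedding per sample)
    labels:   list of class ids aligned with `features`
    Returns {class_id: {"mean": [...], "var": [...]}}.
    """
    # Group the feature vectors by their class label.
    grouped = defaultdict(list)
    for vector, label in zip(features, labels):
        grouped[label].append(vector)

    params = {}
    for label, vectors in grouped.items():
        # zip(*vectors) iterates over feature dimensions.
        columns = list(zip(*vectors))
        params[label] = {
            "mean": [fmean(col) for col in columns],
            # Population variance is an assumption; this README does not say
            # which variance estimator the notebooks use.
            "var": [pvariance(col) for col in columns],
        }
    return params


if __name__ == "__main__":
    toy_features = [[0.0, 2.0], [2.0, 4.0], [1.0, 1.0]]
    toy_labels = [0, 0, 1]
    print(classwise_gaussian_parameters(toy_features, toy_labels))
```

The statistics are computed once from training features and then stored, so that inference only needs to look them up.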

Precomputed parameters are also available in 4 Checkpoints.

Step 3: Run DyMo on the Test Dataset

DyMo is an inference-time method and is computationally efficient:

cd DyMo/job_scripts
env CUDA_VISIBLE_DEVICES=0 bash PolyMNIST_DyMo.sh

Other Models

To train or evaluate other baseline models, modify the model name in the shell scripts under job_scripts/. Available model configurations are listed in configs/model/.
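
To see which model names are available before editing a script, the configuration directory can be listed. The sketch below is a hypothetical helper, not part of the repository; it assumes the configuration files in configs/model/ use a `.yaml` or `.yml` extension and that it is run from the repository root.

```python
from pathlib import Path


def list_model_configs(config_dir="configs/model", suffixes=(".yaml", ".yml")):
    """Return the sorted file stems of config files found in `config_dir`."""
    root = Path(config_dir)
    if not root.is_dir():
        # Directory missing, e.g. when run from outside the repository root.
        return []
    return sorted(
        path.stem
        for path in root.iterdir()
        if path.is_file() and path.suffix in suffixes
    )


if __name__ == "__main__":
    for name in list_model_configs():
        print(name)
```

Each printed name is a candidate value for the model name in the shell scripts, provided the scripts refer to configurations by file name.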

4 Checkpoints

Dataset	Download Model Checkpoints and Gaussian Parameters
PolyMNIST	Download
MST	Download
CelebA	Download
DVM	Download
CAD	Download
Infarction	Download

5 Licence & Citation

This repository is licensed under the Apache License, Version 2.0.

If you find this work useful, please cite:

@inproceedings{du2026dymo,
  title={Inference-Time Dynamic Modality Selection for Incomplete Multimodal Classification},
  author={Du, Siyi and Luo, Xinzhe and O'Regan, Declan P. and Qin, Chen},
  booktitle={International Conference on Learning Representations (ICLR) 2026},
  year={2026}}

6 Acknowledgements

We would like to thank the following repositories for their valuable contributions:

MMCL
MoPoE