MoRE: MultiModal Contrastive Pretraining of X-ray, ECG, and Report
January 20, 2025
Please cite this work as:
@article{thapa2024more,
  title={MoRE: Multi-Modal Contrastive Pre-training with Transformers on X-Rays, ECGs, and Diagnostic Report},
  author={Thapa, Samrajya and Howlader, Koushik and Bhattacharjee, Subhankar and others},
  journal={arXiv preprint arXiv:2410.16239},
  year={2024}
}

MoRE is a pretraining framework that synergistically aligns the X-ray, ECG, and diagnostic report of the same patient using contrastive learning. The clinical reports (cardiology report and radiology report) are combined and act as an anchor that aligns the X-ray and ECG in a shared multimodal space. We demonstrate this through multimodal retrieval, retrieving X-ray and ECG data with a single text query (see Section 4.6.3, MultiModal Retrieval, in the paper). We also adapt TransLRP for multimodal attention visualization, which explains how the multimodal input contributes to a diagnosis (see Section 4.6.3, Gradient Based LRP attention visualization). MoRE outperforms the GLoRIA and MedKLIP baselines on the MIMIC-IV X-ray dataset for 4 labels (Atelectasis, Cardiomegaly, Edema, Effusion) and outperforms the baselines on the PTB-XL ECG dataset for superclass labels. It also outperforms its baselines in zero-shot classification, which indicates strong representation learning. Finally, MoRE uses the PEFT LoRA strategy to fine-tune the LLM during pretraining, training only 0.6% of the LLM's original parameters and significantly reducing training time.
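To make the text-as-anchor idea concrete, here is a minimal sketch of a CLIP-style symmetric InfoNCE loss in plain Python. It is an illustration under assumptions, not the loss used in the repository: the function names, the temperature value, and the choice to simply add the text-to-X-ray and text-to-ECG terms are all hypothetical, and the paper should be consulted for the actual objective.

```python
import math

def normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def info_nce(anchors, others, temperature=0.07):
    """Symmetric InfoNCE loss between two aligned batches of embeddings.

    Row i of `anchors` and row i of `others` belong to the same patient
    (positive pair); every other pairing in the batch is a negative.
    """
    a = [normalize(v) for v in anchors]
    b = [normalize(v) for v in others]
    n = len(a)
    # logits[i][j] = cosine(a_i, b_j) / temperature
    logits = [[sum(x * y for x, y in zip(a[i], b[j])) / temperature
               for j in range(n)] for i in range(n)]

    def diagonal_cross_entropy(rows):
        # Mean of -log softmax(row)[i], with the matching index i as the target.
        total = 0.0
        for i, row in enumerate(rows):
            m = max(row)
            log_z = m + math.log(sum(math.exp(x - m) for x in row))
            total += log_z - row[i]
        return total / len(rows)

    columns = [list(col) for col in zip(*logits)]
    return 0.5 * (diagonal_cross_entropy(logits) + diagonal_cross_entropy(columns))

def text_anchored_loss(text, xray, ecg, temperature=0.07):
    """Assumed form: the report embedding anchors both other modalities."""
    return info_nce(text, xray, temperature) + info_nce(text, ecg, temperature)

# Toy batch of three patients with 3-dimensional embeddings.
text = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
aligned = text_anchored_loss(text, text, text)
# Shuffling the X-ray rows breaks the patient pairing and should raise the loss.
shuffled = text_anchored_loss(text, [text[1], text[2], text[0]], text)
```

Because X-ray and ECG are both pulled toward the same report embedding, they end up close to each other without needing a direct X-ray-to-ECG term, which is what makes retrieval of both from a single text query possible.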
Setting up the Environment
- Create a virtual environment:
  python -m venv myenv
- Install the required dependencies:
  pip install -r requirements.txt
Pre-Train MoRE
- Download the required datasets from PhysioNet (the datasets are not included because they require credentialed access and a signed data use agreement).
- Add the datasets to the appropriate folder.
- Preprocess the data (preprocessing code is included).
- Run the pretraining script:
  python pretrain_multimodel.py
  Add arguments as needed; default settings are provided.
Fine-tune on MIMIC/CheXpert
- Ensure that the pre-trained model is saved.
- Run the fine-tuning script:
  python multimodal_infer.py
  Make sure to change the data paths and model paths as needed.
Zero-Shot Classification
- Run the zero-shot classification script:
  python zero_shot_xray/ecg_more.py
  Update data paths or parameters as necessary.
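For readers new to zero-shot classification with a contrastively trained model, the sketch below shows the general idea: each label is turned into a text prompt, the prompt is embedded by the text encoder, and the sample is assigned to the label whose prompt embedding is most similar. The embeddings here are hand-written toy vectors and `zero_shot_classify` is a hypothetical helper, not a function from this repository.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v)))

def zero_shot_classify(sample_embedding, prompt_embeddings, temperature=0.07):
    """Return (best_label, probabilities) for one X-ray or ECG embedding.

    `prompt_embeddings` maps each label to the embedding of its text prompt.
    """
    labels = list(prompt_embeddings)
    logits = [cosine(sample_embedding, prompt_embeddings[label]) / temperature
              for label in labels]
    # Numerically stable softmax over the label prompts.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    probs = {label: e / z for label, e in zip(labels, exps)}
    return max(probs, key=probs.get), probs

# Toy stand-ins for text-encoder outputs (assumed values, for illustration only).
prompts = {
    "Cardiomegaly": [0.9, 0.1, 0.0],
    "Edema": [0.1, 0.9, 0.0],
    "Effusion": [0.0, 0.1, 0.9],
}
sample = [0.8, 0.2, 0.1]  # stand-in for an X-ray encoder output
label, probs = zero_shot_classify(sample, prompts)
```

No label-specific training is involved: adding a new class only requires a new prompt embedding.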
Retrieval Tasks
- Check the xray_ecg_retrieval.ipynb notebook for an example of multimodal retrieval.
- Run the X-ray retrieval script:
  python xray_retrieval.py
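As a rough picture of what multimodal retrieval does, the following sketch ranks two separate galleries (X-ray and ECG) against one text query embedding. The gallery ids, the vectors, and the `retrieve` helper are hypothetical and only illustrate the ranking step; the notebook and script above contain the actual pipeline.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v)))

def retrieve(query, gallery, k=1):
    """Return the ids of the k gallery items most similar to the query embedding."""
    ranked = sorted(gallery, key=lambda item_id: cosine(query, gallery[item_id]),
                    reverse=True)
    return ranked[:k]

# Toy embeddings in a shared 2-D space (assumed values, for illustration only).
text_query = [1.0, 0.0]
xray_gallery = {"xray_a": [0.9, 0.1], "xray_b": [0.1, 0.9]}
ecg_gallery = {"ecg_a": [0.2, 0.8], "ecg_b": [0.8, 0.3]}

# The same text query is used against both modalities.
top_xray = retrieve(text_query, xray_gallery, k=1)
top_ecg = retrieve(text_query, ecg_gallery, k=1)
```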
t-SNE Plot
- Check the tnse_plot.ipynb notebook for an example of a t-SNE plot of features.
Model Weights
Link: https://drive.google.com/file/d/1BB9dT6iYihqJarD5qX0bdnfYhiwhBgmH/view?usp=share_link
If the checkpoint does not match your model definition, rename layers and drop any extra weights as needed in PyTorch.
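One way to rename layers and drop extra weights is to rewrite the checkpoint's key names before loading it. The helper below is a sketch that works on any dict-like state dict; the prefixes and key names in the example are made up for illustration and do not describe the actual layout of the released checkpoint.

```python
def adapt_state_dict(state_dict, rename_prefixes=None, drop_prefixes=()):
    """Return a new mapping with key prefixes renamed and unwanted entries dropped."""
    rename_prefixes = rename_prefixes or {}
    adapted = {}
    for key, value in state_dict.items():
        if any(key.startswith(prefix) for prefix in drop_prefixes):
            continue  # extra weights the target model has no slot for
        for old, new in rename_prefixes.items():
            if key.startswith(old):
                key = new + key[len(old):]
                break  # apply at most one rename per key
        adapted[key] = value
    return adapted

# Hypothetical checkpoint keys; the values would normally be tensors.
checkpoint = {
    "module.xray_encoder.weight": 1,
    "module.ecg_encoder.weight": 2,
    "module.proj_head.bias": 3,
}
adapted = adapt_state_dict(
    checkpoint,
    rename_prefixes={"module.": ""},      # strip a wrapper prefix
    drop_prefixes=("module.proj_head.",)  # discard a head that is not needed
)
```

With PyTorch, `checkpoint` would typically come from `torch.load(path, map_location="cpu")`, and the adapted dict can be passed to `model.load_state_dict(adapted, strict=False)`, whose return value lists any missing and unexpected keys.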