UNI

March 26, 2025

Towards a General-Purpose Foundation Model for Computational Pathology

Nature Medicine

Journal Link | Open Access Read Link | Download Models | Download Pre-extracted Embeddings | Cite

Updates

Unfamiliar with UNI? Please refer to the original README (here) for more details, or to the accompanying Nature Medicine study (here).

Model weights

| Model Name | Release Date | Model Architecture | Download Link |
|---|---|---|---|
| UNI2-h | 01/2025 | ViT-h/14-reg8 | HF Link |
| UNI | 03/2024 | ViT-l/16 | HF Link |

Research Applications using UNI & CONCH

Last Updated 3/20/2025
| Paper Name | Year | Publication |
|---|---|---|
| A self-supervised framework for learning whole slide representations | 2024 | arXiv:2402.06188 |
| Honeybee: a scalable modular framework for creating multimodal oncology datasets with foundational embedding models | 2024 | arXiv:2405.07460 |
| Combining graph neural network and mamba to capture local and global tissue spatial relationships in whole slide images | 2024 | arXiv:2406.04377 |
| STimage-1K4M: A histopathology image-gene expression dataset for spatial transcriptomics | 2024 | arXiv:2406.06393 |
| Embedding-based multimodal learning on pan-squamous cell carcinomas for improved survival outcomes | 2024 | arXiv:2406.08521 |
| A clinical benchmark of public self-supervised pathology foundation models | 2024 | arXiv:2407.06508v1 |
| Path-SAM2: Transfer SAM2 for digital pathology semantic segmentation | 2024 | arXiv:2408.03651 |
| Benchmarking foundation models as feature extractors for weakly-supervised computational pathology | 2024 | arXiv:2408.15823 |
| Pediatric brain tumor classification using digital histopathology and deep learning: evaluation of SOTA methods on a multi-center Swedish cohort | 2024 | arXiv:2409.01330 |
| Evaluating Pre-trained Convolutional Neural Networks and Foundation Models as Feature Extractors for Content-based Medical Image Retrieval | 2024 | arXiv:2409.09430 |
| Evaluating Deep Regression Models for WSI-Based Gene-Expression Prediction | 2024 | arXiv:2410.00945 |
| Deep Learning for Fetal Inflammatory Response Diagnosis in the Umbilical Cord | 2024 | arXiv:2411.09767 |
| Diagnostic Text-guided Representation Learning in Hierarchical Classification for Pathological Whole Slide Image | 2024 | arXiv:2411.10709 |
| Leveraging Computational Pathology AI for Noninvasive Optical Imaging Analysis Without Retraining | 2024 | arXiv:2411.11613 |
| FOCUS: Knowledge-enhanced Adaptive Visual Compression for Few-shot Whole Slide Image Classification | 2024 | arXiv:2411.14743 |
| RankByGene: Gene-Guided Histopathology Representation Learning Through Cross-Modal Ranking Consistency | 2024 | arXiv:2411.15076 |
| ST-Align: A Multimodal Foundation Model for Image-Gene Alignment in Spatial Transcriptomics | 2024 | arXiv:2411.16793 |
| Multimodal Outer Arithmetic Block Dual Fusion of Whole Slide Images and Omics Data for Precision Oncology | 2024 | arXiv:2411.17418 |
| Multimodal whole slide foundation model for pathology | 2024 | arXiv:2411.19666 |
| GCUNet: A GNN-Based Contextual Learning Network for Tertiary Lymphoid Structure Semantic Segmentation in Whole Slide Image | 2024 | arXiv:2412.06129 |
| A multimodal ensemble approach for clear cell renal cell carcinoma treatment outcome prediction | 2024 | arXiv:2412.07136 |
| From Histopathology Images to Cell Clouds: Learning Slide Representations with Hierarchical Cell Transformer | 2024 | arXiv:2412.16715 |
| Vision-language models do not understand negation | 2025 | arXiv:2501.09425 |
| Prior Knowledge Injection into Deep Learning Models Predicting Gene Expression from Whole Slide Images | 2025 | arXiv:2501.14056 |
| Molecular-driven Foundation Model for Oncologic Pathology | 2025 | arXiv:2501.16652 |
| Dynamic Hypergraph Representation for Bone Metastasis Cancer Analysis | 2025 | arXiv:2501.16787 |
| Pathology Report Generation and Multimodal Representation Learning for Cutaneous Melanocytic Lesions | 2025 | arXiv:2502.19293 |
| DELST: Dual Entailment Learning for Hyperbolic Image-Gene Pretraining in Spatial Transcriptomics | 2025 | arXiv:2503.00804 |
| Explainable Classifier for Malignant Lymphoma Subtyping via Cell Graph and Image Fusion | 2025 | arXiv:2503.00925 |
| CrossFusion: A Multi-Scale Cross-Attention Convolutional Fusion Model for Cancer Survival Prediction | 2025 | arXiv:2503.02064 |
| Adaptive Prototype Learning for Multimodal Cancer Survival Analysis | 2025 | arXiv:2503.04643 |
| ecPath detects ecDNA in tumors from histopathology images | 2024 | bioRxiv:2024.11.13.623494v1 |
| Contrastive Learning for Omics-guided Whole-slide Visual Embedding Representation | 2025 | bioRxiv:2025.01.12.632280 |
| Multi-modal Disentanglement of Spatial Transcriptomics and Histopathology Imaging | 2025 | bioRxiv:2025.02.19.638201v1 |
| High-Parameter Spatial Multi-Omics through Histology-Anchored Integration | 2025 | bioRxiv:2025.02.23.639721v1 |
| Weakly-supervised deep learning models enable HER2-low prediction from H&E stained slides | 2024 | Breast Cancer Research |
| 2DMamba: Efficient State Space Model for Image Representation with Applications on Giga-Pixel Whole Slide Image | 2025 | Computer Vision & Pattern Recognition (CVPR) |
| Transcriptomics-guided slide representation learning in computational pathology | 2024 | Computer Vision & Pattern Recognition (CVPR) |
| Morphological prototyping for unsupervised slide representation learning in computational pathology | 2024 | Computer Vision & Pattern Recognition (CVPR) |
| Development and validation of novel deep learning-based models for cancer histopathology image | 2024 | Doctoral dissertation (Karolinska Institutet) |
| Multistain pretraining for slide representation learning in pathology | 2024 | European Conference on Computer Vision (ECCV) |
| Interpretable Vision-Language Survival Analysis with Ordinal Inductive Bias for Computational Pathology | 2025 | International Conference on Learning Representations (ICLR) |
| Multimodal prototyping for cancer survival prediction | 2024 | International Conference on Machine Learning (ICML) |
| High-resolution spatial transcriptomics from histology images using HisToSGE | 2024 | International Conference on Bioinformatics and Biomedicine (BIBM) |
| Multi-resolution histopathology patch graphs for ovarian cancer subtyping | 2024 | International Workshop on Graphs in Biomedical Image Analysis |
| Bridging Classification and Segmentation in Osteosarcoma Assessment via Foundation and Discrete Diffusion Models | 2025 | International Symposium on Biomedical Imaging (ISBI) |
| 1250 H&E-based cell prediction multi-classification models to capture morphologically distinct subpopulations of CD8+ T cells | 2024 | Journal for ImmunoTherapy of Cancer |
| Liver fibrosis classification on trichrome histology slides using weakly supervised learning in children and young adults | 2025 | Journal of Pathology Informatics |
| Winners of the 2024 Tuberculosis Detection Competition | 2024 | LinkedIn post |
| Model-based cleaning of the QUILT-1M pathology dataset for text-conditional image synthesis | 2024 | Medical Imaging with Deep Learning |
| Generating highly accurate pathology reports from gigapixel whole slide images with HistoGPT | 2024 | medRxiv:2024.03.15.24304211v2 |
| HIBRID: Histology and ct-DNA based Risk-stratification with Deep Learning | 2024 | medRxiv:2024.07.23.24310822 |
| SurvivMIL: A Multimodal, Multiple Instance Learning Pipeline for Survival Outcome of Neuroblastoma Patients | 2024 | MICCAI Workshop on Computational Pathology with Multimodal Data (COMPAYL) |
| Early Fusion of H&E and IHC Histology Images for Pediatric Brain Tumor Classification | 2024 | MICCAI Workshop on Computational Pathology with Multimodal Data (COMPAYL) |
| Fluoroformer: Scaling multiple instance learning to multiplexed images via attention-based channel fusion | 2024 | ML4H symposium |
| Harnessing transcriptional regulation of alternative end-joining to predict cancer treatment | 2025 | NAR Cancer |
| A multimodal generative AI copilot for human pathology | 2024 | Nature |
| Digital profiling of gene expression from histology images with linearized attention | 2024 | Nature Communications |
| Demographic bias in misdiagnosis by computational pathology models | 2024 | Nature Medicine |
| HEST-1k: A dataset for spatial transcriptomics and histology image analysis | 2024 | Advances in Neural Information Processing Systems |
| Rethinking Transformer for Long Contextual Histopathology Whole Slide Image Analysis | 2024 | Advances in Neural Information Processing Systems |
| Leveraging tumor heterogeneity: Heterogeneous graph representation learning for cancer survival prediction in whole slide images | 2024 | Advances in Neural Information Processing Systems |
| Going Beyond H&E and Oncology: How Do Histopathology Foundation Models Perform for Multi-stain IHC and Immunology? | 2024 | NeurIPS Workshop on Advancements In Medical Foundation Models |
| Histopathology and proteomics are synergistic for high-grade serous ovarian cancer platinum response prediction | 2025 | npj Precision Oncology |
| Deep learning for predicting prognostic consensus molecular subtypes in cervical cancer from histology images | 2025 | npj Precision Oncology |
| Integrated multicenter deep learning system for prognostic prediction in bladder cancer | 2024 | npj Precision Oncology |
| Predicting the tumor microenvironment composition and immunotherapy response in non-small cell lung cancer from digital histopathology images | 2024 | npj Precision Oncology |
| Artificial intelligence-based morphologic classification and molecular characterization of neuroblastic tumors from digital histopathology | 2024 | npj Precision Oncology |
| Deep Learning-Enabled Integration of Histology and Transcriptomics for Tissue Spatial Profile Analysis | 2025 | spj Research |
| Validation of histopathology foundation models through whole slide image retrieval | 2025 | Scientific Reports |
| Deep Learning Framework for Classifying Whole-slide Multiplex Immunofluorescence Images to Predict Immunotherapy Response in Melanoma Patients | 2024 | TechRxiv:10.36227/techrxiv.173496563.35713571 |
| Deep learning-based lymph node metastasis status predicts prognosis from muscle-invasive bladder cancer histopathology | 2025 | World Journal of Urology |

Pre-extracted Embeddings

To facilitate downstream tasks, we provide pre-extracted embeddings from the UNI 2 model (UNI2-h) for TCGA, CPTAC, and PANDA, which can be downloaded here.
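
As a quick illustration, these embedding files can be loaded with plain torch. The file name below is a hypothetical placeholder, not the actual naming scheme; consult the download page for the real file organization:

import torch

# hypothetical file name -- see the download page for the actual layout
feats = torch.load("tcga_uni2h_embeddings/slide_0001.pt", map_location="cpu")
print(feats.shape)  # one 1536-dim UNI2-h feature vector per extracted patch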

Benchmarking UNI 2

ROI Benchmarks

| Model name | Pretraining | Model size | HEST (Regression, Public) | CRC-100K-Raw (9 classes, Public) | TCGA Uniform Tumor (32 classes, Public) | C17-WILDS (2 classes, Public) | Kather MSI (2 classes, Public) |
|---|---|---|---|---|---|---|---|
| UNI | Vision | ViT-l/16 | 0.386 | 0.925 | 0.595 | 0.972 | 0.679 |
| UNI2-h | Vision | ViT-h/14 | 0.414 | 0.957 | 0.675 | 0.977 | 0.722 |
| Virchow 2 | Vision | ViT-h/14 | 0.398 | 0.952 | 0.620 | 0.975 | 0.713 |
| Virchow | Vision | ViT-h/14 | 0.398 | 0.919 | 0.544 | 0.977 | 0.670 |
| UNI2-g-preview | Vision | ViT-g/14 | 0.416 | 0.949 | 0.690 | 0.985 | 0.725 |
| h-optimus | Vision | ViT-g/14 | 0.415 | 0.930 | 0.647 | 0.970 | 0.707 |
| Prov-GigaPath | Vision | ViT-g/14 | 0.385 | 0.929 | 0.593 | 0.961 | 0.693 |
| CONCH | Vision-language | ViT-b/16 | 0.371 | 0.941 | 0.556 | 0.967 | 0.685 |
| MUSK | Vision-language | ViT-l/16 | 0.346 | 0.913 | 0.464 | 0.954 | 0.666 |

Slide Benchmarks

| Model name | Pretraining | Model size | EBRAINS (30 classes, Public) | PANDA (5 classes, Public) | IHC ER / PR Assess. (6 classes, Internal) |
|---|---|---|---|---|---|
| UNI | Vision | ViT-l/16 | 0.682 | 0.944 | 0.776 |
| UNI2-h | Vision | ViT-h/14 | 0.711 | 0.946 | 0.794 |
| Virchow 2 | Vision | ViT-h/14 | 0.691 | 0.931 | 0.808 |
| Virchow | Vision | ViT-h/14 | 0.681 | 0.946 | 0.756 |
| UNI2-g-preview | Vision | ViT-g/14 | 0.746 | 0.953 | 0.795 |
| h-optimus | Vision | ViT-g/14 | 0.726 | 0.953 | 0.761 |
| Prov-GigaPath | Vision | ViT-g/14 | 0.687 | 0.944 | 0.775 |
| CONCH | Vision-language | ViT-b/16 | 0.689 | 0.934 | 0.794 |
| MUSK | Vision-language | ViT-l/16 | 0.660 | 0.923 | 0.764 |

In each task, for each model, we sweep over three learning rates (1e-5, 5e-5, 1e-4) and report the test performance of the run that performs best on the validation set.

For all assessments, models are evaluated using their global representation (e.g., the CLS token) without test-time augmentation.
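
For reference, this sweep can be reproduced on frozen embeddings with a few lines of torch. The snippet below is a minimal sketch of the protocol described above, not the exact evaluation harness; the AdamW optimizer, iteration count, and full-batch training are our assumptions:

import torch
import torch.nn as nn

def linear_probe_sweep(train_x, train_y, val_x, val_y, test_x, test_y,
                       num_classes, lrs=(1e-5, 5e-5, 1e-4), epochs=100):
    """Fit one linear probe per learning rate on frozen embeddings;
    select on validation accuracy and report the matching test accuracy."""
    best_val, best_test = -1.0, None
    for lr in lrs:
        torch.manual_seed(0)  # identical initialization for each learning rate
        probe = nn.Linear(train_x.shape[1], num_classes)
        opt = torch.optim.AdamW(probe.parameters(), lr=lr)
        loss_fn = nn.CrossEntropyLoss()
        for _ in range(epochs):
            opt.zero_grad()
            loss_fn(probe(train_x), train_y).backward()
            opt.step()
        with torch.inference_mode():
            val_acc = (probe(val_x).argmax(1) == val_y).float().mean().item()
            test_acc = (probe(test_x).argmax(1) == test_y).float().mean().item()
        if val_acc > best_val:
            best_val, best_test = val_acc, test_acc
    return best_test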

Installation

First clone the repo and cd into the directory:

git clone https://github.com/mahmoodlab/UNI.git
cd UNI

Then create a conda env and install the dependencies:

conda create -n UNI python=3.10 -y
conda activate UNI
pip install -e .

1. Getting access

Request access to the model weights from the Hugging Face model page using the links provided in the Model Weights section. You will need to log in to Hugging Face to download the model weights.

2. Downloading weights + Creating model

Following authentication (via huggingface_hub), the pretrained checkpoints and image transforms for UNI can be loaded directly through the timm library. This method downloads the model weights to the huggingface_hub cache in your home directory, where timm will automatically find them when you run the commands below:

import timm
import torch
from timm.data import resolve_data_config
from timm.data.transforms_factory import create_transform
from huggingface_hub import login

login()  # login with your User Access Token, found at https://huggingface.co/settings/tokens

# pretrained=True needed to load UNI weights (and download weights for the first time)
# using UNI2-h as example
timm_kwargs = {
    'img_size': 224,
    'patch_size': 14,
    'depth': 24,
    'num_heads': 24,
    'init_values': 1e-5,
    'embed_dim': 1536,
    'mlp_ratio': 2.66667 * 2,
    'num_classes': 0,
    'no_embed_class': True,
    'mlp_layer': timm.layers.SwiGLUPacked,
    'act_layer': torch.nn.SiLU,
    'reg_tokens': 8,
    'dynamic_img_size': True,
}
model = timm.create_model("hf-hub:MahmoodLab/UNI2-h", pretrained=True, **timm_kwargs)
transform = create_transform(**resolve_data_config(model.pretrained_cfg, model=model))
model.eval()

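A quick sanity check (our suggestion, not part of the original instructions) is to push a dummy batch through the model and confirm the embedding dimension:

# UNI2-h should map a 224x224 RGB batch to a 1536-dim embedding
with torch.inference_mode():
    dummy = torch.randn(1, 3, 224, 224)
    print(model(dummy).shape)  # expected: torch.Size([1, 1536])
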
You can also download the model weights to a specified checkpoint location in your local directory. The timm library is still used to define the model architecture (e.g., the custom ViT-h/14), while the pretrained weights and image transforms for UNI are loaded and defined manually.

import os
import torch
from torchvision import transforms
import timm
from huggingface_hub import login, hf_hub_download

login()  # login with your User Access Token, found at https://huggingface.co/settings/tokens

local_dir = "../assets/ckpts/uni2-h/"
os.makedirs(local_dir, exist_ok=True)  # create directory if it does not exist
hf_hub_download("MahmoodLab/UNI2-h", filename="pytorch_model.bin", local_dir=local_dir, force_download=True)
timm_kwargs = {
    'model_name': 'vit_giant_patch14_224',
    'img_size': 224,
    'patch_size': 14,
    'depth': 24,
    'num_heads': 24,
    'init_values': 1e-5,
    'embed_dim': 1536,
    'mlp_ratio': 2.66667 * 2,
    'num_classes': 0,
    'no_embed_class': True,
    'mlp_layer': timm.layers.SwiGLUPacked,
    'act_layer': torch.nn.SiLU,
    'reg_tokens': 8,
    'dynamic_img_size': True,
}
model = timm.create_model(**timm_kwargs)
model.load_state_dict(torch.load(os.path.join(local_dir, "pytorch_model.bin"), map_location="cpu"), strict=True)
transform = transforms.Compose([
    transforms.Resize(224),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)),
])
model.eval()

The function get_encoder performs the commands above, downloading the checkpoint to the ./assets/ckpts/ relative path of this GitHub repository.

import torch
from uni import get_encoder

device = "cuda" if torch.cuda.is_available() else "cpu"
model, transform = get_encoder(enc_name='uni2-h', device=device)

3. Running Inference

You can use the UNI pretrained encoder to extract features from histopathology ROIs, as follows:

from PIL import Image

image = Image.open("uni.jpg")
# image (torch.Tensor) of shape [1, 3, 224, 224] after resizing and ImageNet normalization
image = transform(image).unsqueeze(dim=0)
with torch.inference_mode():
    feature_emb = model(image)  # extracted features (torch.Tensor) of shape [1, 1536]

These pre-extracted features can then be used for ROI classification (via linear probing), slide classification (via multiple instance learning), and other machine learning settings.
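
For downstream experiments you will typically want features for a whole directory of ROIs rather than a single image. A minimal batched loop is sketched below; the class-subfolder layout expected by ImageFolder, the directory path, and the batch size are assumptions, and model and transform come from the loading steps above:

import torch
from torch.utils.data import DataLoader
from torchvision.datasets import ImageFolder

# hypothetical directory of ROI crops organized into one subfolder per class
dataset = ImageFolder("path/to/roi_dataset", transform=transform)
loader = DataLoader(dataset, batch_size=64, shuffle=False, num_workers=4)

all_feats, all_labels = [], []
with torch.inference_mode():
    for images, labels in loader:
        all_feats.append(model(images))  # [B, 1536] per batch
        all_labels.append(labels)
features = torch.cat(all_feats)  # [N, 1536], ready for linear probing or MIL
labels = torch.cat(all_labels)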

Overview of specific usages

We provide high-level functions for loading the model and using it for inference. For model loading, the function get_encoder performs the commands above in Step 2, downloading the checkpoint to the ./assets/ckpts/ relative path of this GitHub repository.

import torch
from uni import get_encoder

device = "cuda" if torch.cuda.is_available() else "cpu"
model, transform = get_encoder(enc_name='uni2-h', device=device)

For inference:

from uni.downstream.extract_patch_features import extract_patch_features_from_dataloader
from uni.downstream.eval_patch_features.linear_probe import eval_linear_probe
from uni.downstream.eval_patch_features.fewshot import eval_knn, eval_fewshot
from uni.downstream.eval_patch_features.protonet import ProtoNet, prototype_topk_vote

Refer to the notebooks below for detailed examples.
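
If you prefer not to depend on the repo helpers, a stand-in for kNN evaluation (a minimal sketch of our own, not the repo's eval_knn) takes only a few lines of torch:

import torch
import torch.nn.functional as F

def knn_classify(train_feats, train_labels, test_feats, k=20):
    """Majority-vote kNN on L2-normalized embeddings (cosine similarity)."""
    train = F.normalize(train_feats, dim=-1)
    test = F.normalize(test_feats, dim=-1)
    sims = test @ train.T                # [N_test, N_train] similarity matrix
    topk = sims.topk(k, dim=-1).indices  # indices of the k nearest neighbors
    votes = train_labels[topk]           # [N_test, k] neighbor labels
    return votes.mode(dim=-1).values     # predicted label per test sample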

More detailed starter code for loading / using the model:

See ./notebooks/uni_walkthrough.ipynb to get started with loading and using the model to create embeddings, and example code for extracting ROI features and performing ROI classification / retrieval.

License and Terms of Use

ⓒ Mahmood Lab. The models and associated code are released under the CC-BY-NC-ND 4.0 license and may only be used for non-commercial, academic research purposes with proper attribution. Any commercial use, sale, or other monetization of the UNI models and their derivatives, which include models trained on outputs from the UNI models or datasets created from the UNI models, is prohibited and requires prior approval. Downloading the model requires prior registration on Hugging Face and agreeing to the terms of use. By downloading the models, you agree not to distribute, publish or reproduce a copy of the models. If another user within your organization wishes to use the UNI models, they must register as an individual user and agree to comply with the terms of use. Users may not attempt to re-identify the deidentified data used to develop the underlying models. If you are a commercial entity, please contact the corresponding author or Mass General Brigham Innovation Office.

Acknowledgements

The project was built on top of amazing repositories such as ViT, DINOv2, LGSSL, and timm (ViT model implementation). We thank the authors and developers for their contributions.

Reference

If you find our work useful in your research or if you use parts of this code please consider citing our paper:

Chen, R.J., Ding, T., Lu, M.Y., Williamson, D.F.K., et al. Towards a general-purpose foundation model for computational pathology. Nat Med (2024). https://doi.org/10.1038/s41591-024-02857-3

@article{chen2024uni,
  title={Towards a General-Purpose Foundation Model for Computational Pathology},
  author={Chen, Richard J and Ding, Tong and Lu, Ming Y and Williamson, Drew FK and Jaume, Guillaume and Chen, Bowen and Zhang, Andrew and Shao, Daniel and Song, Andrew H and Shaban, Muhammad and others},
  journal={Nature Medicine},
  publisher={Nature Publishing Group},
  year={2024}
}

<img src=".github/joint_logo.jpg">