PrePATH: A Toolkit for Preprocessing Whole Slide Images
March 26, 2026
Tip
🚀 Contribute Your Foundation Model! We welcome submissions of new pathology foundation models to our benchmark. 👉 Submit Your Model Here — Help advance the field by adding your model to PrePATH!
PrePATH is a comprehensive preprocessing toolkit for whole slide images (WSI), built upon CLAM and ASlide.
TODO
- H0-mini
- OpenMidnight
- GenBio-PathFM
- TITAN (Slide level)
Installation
Prerequisites
- Anaconda or Miniconda
- openslide-tools (system dependency)
Setup Instructions
The following instructions demonstrate installation for the GPFM model. For other foundation models, please refer to their respective repositories for environment-specific requirements.
git clone https://github.com/birkhoffkiki/PrePATH.git
cd PrePATH
conda create --name gpfm python=3.10
conda activate gpfm
pip install -r requirements/gpfm.txt
cd models/ckpts/
wget https://github.com/birkhoffkiki/GPFM/releases/download/ckpt/GPFM.pth
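Before running feature extraction, it can be worth confirming that the checkpoint actually landed on disk. The following is a minimal sketch, assuming the file was saved as models/ckpts/GPFM.pth (the name implied by the download URL). The helper `check_ckpt` is hypothetical and not part of PrePATH, and it only checks that the file exists and is non-empty, not that the download is complete or uncorrupted.

```shell
# check_ckpt is a hypothetical helper, not part of PrePATH.
# It succeeds only if the given file exists and is non-empty.
check_ckpt() {
    if [ -s "$1" ]; then
        echo "OK: $1"
    else
        echo "Missing or empty: $1" >&2
        return 1
    fi
}

# Usage (from the repository root, after the download step):
#   check_ckpt models/ckpts/GPFM.pth
```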
Notes:
- ASlide should be installed as a Python package from GitHub and is included in requirements/gpfm.txt.
- Environment configurations for other foundation models should be referenced from their respective repositories.
Usage
⚡Using PrePATH to extract Patch-Level Features
Step 1: Coordinate Extraction
Extract coordinates of foreground patches from whole slide images:
# Configure variables in the script before execution
bash scripts/get_coors/example.sh
Step 2: Feature Extraction
Extract patch-level features using the selected foundation model:
# Refer to the script for detailed configuration options
bash scripts/extract_feature/one_gpu_example.sh
If you have multiple GPUs, you can use the exe.sh script for parallel processing:
bash scripts/extract_feature/exe.sh
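One common way to parallelize across GPUs is to split the slide list into one shard per GPU and start a separate worker for each shard. The sketch below illustrates only the sharding step. It is an assumption about how such a split could be done, not a description of what exe.sh does; `shard_slides` and `slides.txt` are hypothetical names.

```shell
# shard_slides is a hypothetical helper, not part of PrePATH.
# Prints every line of a list file whose zero-based index modulo
# NUM_SHARDS equals SHARD_ID (a round-robin split).
#   $1 = list file (one slide path per line)
#   $2 = number of shards
#   $3 = shard id (0-based)
shard_slides() {
    awk -v n="$2" -v i="$3" '(NR - 1) % n == i' "$1"
}

# Usage: give each of 4 GPUs its own shard of slides.txt
#   for gpu in 0 1 2 3; do
#       shard_slides slides.txt 4 "$gpu" > "shard_${gpu}.txt"
#   done
```

A round-robin split keeps shard sizes within one slide of each other, although it does not balance by slide size.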
Step 3: (Optional) Extract patches and pack them into HDF5 files
This is useful for pretraining, or if you encounter the Corrupt JPEG data error during feature extraction.
The error can occur with kfb or sdpc images because multiprocessing support for these formats is limited.
# Refer to the script for detailed configuration options
bash scripts/crop_image/example_packed2h5.sh
Step 4: (Optional) Extract features from HDF5 packed patches
If you have packed patches into HDF5 files in Step 3, you can extract features from them directly:
# Refer to the script for detailed configuration options
bash scripts/extract_feature/one_gpu_from_h5_example.sh
⚡Extract patches directly without feature extraction (e.g., for pretraining)
Step 1: Coordinate Extraction
Extract coordinates of foreground patches from whole slide images:
# Configure variables in the script before execution
bash scripts/get_coors/example.sh
Step 2: Patch Extraction
Extract patches based on the coordinates. We strongly recommend packing all patches with the HDF5 method for efficient storage and retrieval:
# Refer to the script for detailed configuration options
bash scripts/crop_image/example_packed2h5.sh
Supported Foundation Models
Note: Each foundation model requires its corresponding Python environment to be properly configured.
| Model | Identifier | Reference |
|---|---|---|
| ResNet50 | resnet50 | Standard ImageNet pretrained model |
| GPFM | gpfm | GitHub |
| CTransPath | ctranspath | GitHub |
| PLIP | plip | GitHub |
| CONCH | conch | HuggingFace |
| CONCH-1.5 | conch15 | HuggingFace |
| UNI | uni | HuggingFace |
| UNI-2 | uni2 | HuggingFace |
| mSTAR | mstar | GitHub |
| Phikon | phikon | HuggingFace |
| Phikon2 | phikon2 | HuggingFace |
| Virchow-2 | virchow2 | HuggingFace |
| Prov-GigaPath | gigapath | HuggingFace |
| CHIEF | chief | GitHub |
| H-Optimus-0 | h-optimus-0 | HuggingFace |
| H0-mini | h0-mini | HuggingFace |
| H-Optimus-1 | h-optimus-1 | HuggingFace |
| OpenMidnight | openmidnight | HuggingFace |
| GenBio-PathFM | genbio-pathfm | HuggingFace |
| Lunit | lunit | GitHub |
| Hibou-L | hibou-l | GitHub |
| MUSK | musk | HuggingFace |
| OmiCLIP | omiclip | GitHub |
| PathoCLIP | pathoclip | GitHub |
Supported WSI Formats
PrePATH supports the following whole slide image formats:
- KFB (.kfb)
- SDPC (.sdpc)
- TRON (.tron)
- All formats supported by OpenSlide (including .svs, .tiff, .ndpi, .vms, .vmu, .scn, .mrxs, .tif, .bif, and others)
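To see which files in a directory fall under the extensions named above, a small filter can help. This is a minimal sketch: `list_wsi` is a hypothetical helper, not part of PrePATH, and it covers only the extensions listed explicitly here, not the "others" that OpenSlide may also read. It relies on the `-iname` option of `find`, which GNU and BSD find provide.

```shell
# list_wsi is a hypothetical helper, not part of PrePATH.
# Recursively lists files under a directory whose extension matches,
# case-insensitively, one of the formats named in the list above.
#   $1 = directory to search
list_wsi() {
    find "$1" -type f \( \
        -iname '*.kfb'  -o -iname '*.sdpc' -o -iname '*.tron' -o \
        -iname '*.svs'  -o -iname '*.tif'  -o -iname '*.tiff' -o \
        -iname '*.ndpi' -o -iname '*.vms'  -o -iname '*.vmu'  -o \
        -iname '*.scn'  -o -iname '*.mrxs' -o -iname '*.bif' \
    \) | sort
}

# Usage:
#   list_wsi /path/to/slides > slides.txt
```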