RetCCL: Clustering-guided contrastive learning for whole-slide image retrieval (Medical Image Analysis)
November 16, 2023
Please open new threads or address all questions to xiyue.wang.scu@gmail.com
We built a stronger pre-trained model for a range of histopathological image applications. It outperforms ImageNet pre-trained features by a large margin. We release our best model and invite researchers to test it on their own computational pathology tasks.
Hardware
- 128GB of RAM
- 32 × NVIDIA V100 32 GB GPUs
Preparations
1. Download all 32,000 TCGA WSIs.
2. Download all 2,457 PAIP WSIs.
Together, these yield about 15,000,000 images (~100 TB). It cost us $400,000 to advance the progress of digital pathology.
Pre-trained models for histopathological image tasks
The pre-trained model is available here.
1. Classification through search
This is the most direct way to evaluate the discriminative power of the provided features.
| TissueNet | Acc@1 (%) | Acc@3 (%) | Acc@5 (%) | mMV@5 (%) |
|---|---|---|---|---|
| ImageNet | 50.35 | 77.65 | 87.68 | 46.15 |
| CCL (ours) | 67.09 | 87.81 | 93.4 | 70.1 |

| UniToPatho | Acc@1 (%) | Acc@3 (%) | Acc@5 (%) | mMV@5 (%) |
|---|---|---|---|---|
| ImageNet | 58.17 | 82.89 | 89.45 | 59.01 |
| CCL (ours) | 66.55 | 84.32 | 90.31 | 68.35 |
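As a rough illustration of how search-based metrics like those in the tables can be computed from extracted features, here is a minimal sketch in plain Python. It rests on assumptions not stated in this README: similarity is cosine similarity, Acc@k counts a query as correct when any of its k nearest database items shares its label, and the majority-vote score counts a query as correct when the most common label among its top 5 matches. The function names are hypothetical, and the exact averaging behind mMV@5 may differ from this sketch.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_labels(query, db_feats, db_labels, k):
    """Labels of the k database items most similar to the query."""
    ranked = sorted(range(len(db_feats)),
                    key=lambda i: cosine(query, db_feats[i]),
                    reverse=True)
    return [db_labels[i] for i in ranked[:k]]


def search_metrics(q_feats, q_labels, db_feats, db_labels, ks=(1, 3, 5), mv_k=5):
    """Return ({k: Acc@k}, MV@mv_k), each as a fraction in [0, 1]."""
    hits = {k: 0 for k in ks}
    mv_hits = 0
    max_k = max(max(ks), mv_k)
    for feat, label in zip(q_feats, q_labels):
        retrieved = top_k_labels(feat, db_feats, db_labels, max_k)
        for k in ks:
            # Assumed definition: a hit if any of the top k shares the label.
            hits[k] += label in retrieved[:k]
        # Assumed definition: majority label among the top mv_k results.
        voted = Counter(retrieved[:mv_k]).most_common(1)[0][0]
        mv_hits += voted == label
    n = len(q_labels)
    return {k: hits[k] / n for k in ks}, mv_hits / n
```

For realistic database sizes, the same logic would normally be vectorised with NumPy or PyTorch; the brute-force loop here is only meant to make the definitions explicit.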
2. Multiple Instance Learning for Whole Slide Image Classification
Methods for this task currently rely on ImageNet pre-trained features, so it offers another way to verify the advantage of our feature extractor.
| TCGA-NSCLC | Accuracy | AUC |
|---|---|---|
| ABMIL | 0.7719 | 0.8656 |
| MIL-RNN | 0.8619 | 0.9107 |
| DSMIL | 0.8058 | 0.8925 |
| TransMIL | 0.8835 | 0.9603 |
| CLAM | 0.8422 | 0.9377 |
| CLAM+CCL (ours) | 0.911 | 0.967 |
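To make concrete where a feature extractor plugs into multiple instance learning, the sketch below shows attention-based pooling in the style of ABMIL: each patch feature receives a score, the scores are softmax-normalised, and the slide-level feature is the weighted sum of patch features. This is an illustrative sketch only, not the code used for the results above. The parameters `V` and `w` would be learned in practice, and the function name is hypothetical.

```python
import math


def attention_pool(instances, V, w):
    """Aggregate patch features into one slide-level feature.

    instances: list of N patch feature vectors, each of length D
    V: hidden projection, a list of H rows of length D (learned in practice)
    w: attention vector of length H (learned in practice)
    Returns (slide_feature, attention_weights).
    """
    scores = []
    for h in instances:
        # score = w^T tanh(V h)
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        scores.append(sum(wi * zi for wi, zi in zip(w, hidden)))
    # Numerically stable softmax over the N instance scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(instances[0])
    pooled = [sum(a * h[d] for a, h in zip(weights, instances))
              for d in range(dim)]
    return pooled, weights
```

Swapping the feature extractor changes only the `instances` passed in; the pooling and the downstream classifier stay the same.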
3. Classification based on features using SVM
This task follows KimiaNet.
| Colorectal cancer dataset | Accuracy (%) |
|---|---|
| Combined features | 87.40 |
| Fine-tuned VGG-19 | 86.19 |
| Ensemble of CNNs | 92.83 |
| KimiaNet | 96.80 |
| CCL (ours) | 98.40 |
To compute the features, run:
python get_feature.py
We recommend extracting features at 1.0 mpp (microns per pixel) first, then trying other magnifications.
To fine-tune the model, run:
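When a slide was scanned at a finer resolution than 1.0 mpp, each patch has to be read from a larger region at full resolution and then resized. The small helper below sketches that arithmetic. The function name and the 256-pixel patch size are assumptions for illustration, and the slide's native mpp would come from its metadata.

```python
def level0_patch_size(target_mpp, native_mpp, patch_size):
    """Side length, in full-resolution pixels, of the region that covers
    `patch_size` pixels once resampled to `target_mpp`."""
    if native_mpp <= 0 or target_mpp <= 0:
        raise ValueError("mpp values must be positive")
    scale = target_mpp / native_mpp  # how many native pixels per target pixel
    return round(patch_size * scale)


# Assumed example values: 256-pixel patches at 1.0 mpp.
print(level0_patch_size(1.0, 0.25, 256))  # 1024 (slide scanned at 0.25 mpp)
print(level0_patch_size(1.0, 0.5, 256))   # 512  (slide scanned at 0.5 mpp)
```

The region read at full resolution is then resized to `patch_size` before being passed to the feature extractor.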
python resnet_lincls.py
Whole-slide image retrieval
You can refer to the third-party reproduction paper and code.
For the retrieval pipeline, please refer to FISH: use our features for clustering and searching, then remove the tree structure and search directly.
License
RetCCL is released under the GPLv3 License and is available for non-commercial academic purposes.
Citation
If you find our work useful in your research, please cite this paper using the entry below.
@article{WANG2023102645,
title = {RetCCL: Clustering-guided contrastive learning for whole-slide image retrieval},
author = {Xiyue Wang and Yuexi Du and Sen Yang and Jun Zhang and Minghui Wang and Jing Zhang and Wei Yang and Junzhou Huang and Xiao Han},
journal = {Medical Image Analysis},
volume = {83},
pages = {102645},
year = {2023},
issn = {1361-8415}
}