Histopathology Datasets for Machine Learning

March 6, 2026

This is a list of histopathology datasets made public for classification, segmentation, regression and/or registration tasks.

I would be happy to receive help updating or improving this document. I think it helps to have an overview of all the datasets available in the field.

I hope this list will help some of you.

Overview

Resources

The table below gives links and information for publicly available histopathology datasets.

| Dataset name | Organs | Staining | Link | Size | Data | Task | WSI/Patch | Other (Magnification, Scanner) | Year |
|---|---|---|---|---|---|---|---|---|---|
| ACDC-LungHP [1a], [1b] | Lung | H&E | data, paper | Train: 150, Test: 50 | images + xml | seg + classi | wsi | | 2019 |
| ACROBAT 2022 [66] | Breast | Multiple (IHC, H&E) | data, paper | Train: 750; Valid: 100; Test: 300 | images (1 H&E matched to 1-4 IHC) + landmarks | registration | wsi | 40x - Hamamatsu | 2022 |
| Adipocyte [94] | skin | H&E | data, paper | 200 patches | images + mask | cell detection | patch (120x150) | 40x | 2017 |
| ADP [2] | multiple | multiple (mostly H&E) | data, github, paper | Train: 14.134, Valid: 1767, Test: 1767 (100 wsi) | images + 57 hierarchical HTTs (histological tissue types) | multi-label (3) classification (hierarchy) | patch (1088x1088) | 40x - Huron TissueScope LE1.2 WSI | 2019 |
| AGGC | prostate | H&E | data, paper | Subset 1: train 105, test 45; Subset 2: train 37, test 16; Subset 3: train 144, test 67 | images + binary masks | seg + Gleason grading | wsi | 20x - Subset 1 and Subset 2: Akoya Biosciences scanner; Subset 3: each specimen is scanned by multiple scanners | 2022 |
| AML-Cytomorphology_LMU [67] | Blood | Wright's stain | data, paper | 18.365 images from 200 patients | | classi | patch (cells) | 100x - M8 digital microscope/scanner | 2019 |
| ANHIR [3] | multiple (Lung, Kidney, Colon, Gastric, Breast) | multiple | data, paper | 50+ sets | image + landmarks | registration | patch (15k x 15k to 50k x 50k) | 40x, 20x, 10x, different scanners | 2019 |
| ARCH [4] | multiple | multiple | data, paper | 4270 | images + caption | learn representation from text + image | patch | multiple | 2020 |
| BACH - ICIAR2018 [5] | Breast | H&E | data, paper | 400 | images (4 classes: normal: 100, benign: 100, in situ carcinoma: 100, invasive carcinoma: 100) + 20 unlabeled + 10 labeled WSI (10 patients) | classi + seg | Patch (classi, 2048x1536) + WSI (seg) | Leica SCN400 | 2018 |
| BCNB [6] | Breast | H&E | data, paper | 1058 (train 0.6, valid 0.2, test 0.2) | images + roi annotated + patient record | binary or multiple classi | wsi | | 2021 |
| BCSS [7] | Breast | H&E | data, paper | 151 wsi, 20.000 patches | patch + segmentation mask | semantic seg | patch | (TCGA) | 2019 |
| Bone-Marrow-Cytomorphology [68] | Marrow | May-Grünwald-Giemsa/Pappenheim | data, paper | 171.375 cells from 945 patients | images + label | classi (21) | patch (250x250 - single cell) | 40x | 2021 |
| BRACS [62] | Breast | H&E | data, paper | 547 wsi, 4539 ROIs, 189 patients | images + label (6 tumor subtypes + normal) | classi (7) | wsi + patch | 40x - Aperio AT2 | 2021 |
| BreakHis [8] | Breast | H&E | data, paper | 7.909 (2480 benign, 5429 malignant) | images + binary label + tumor type (8) (multiple magnifications: 40x, 100x, 200x, 400x) | classi | Patch (700x460) | 40x, 100x, 200x, 400x | 2016 |
| BRCA-M2C [95] | breast | H&E | data, paper | train: 80, valid: 10, test: 30 patches | images + point annotation | multi-class cell detection | patch (around 500x500) | 20x | 2021 |
| BreCaHAD [9] | Breast | H&E | data, paper | 162 | images + centroid with label | classi (6: mitosis, apoptosis, tumor nuclei, non-tumor nuclei, tubule, non-tubule) | patch (1360x1024) | 40x - Zeiss | 2019 |
| CAMEL [63] | Colon | | data, paper | 177 wsi (156 with adenoma) | image + label (binary) | classi | patch (1280x1280) | | 2019 |
| CAMELYON16 [10] | Lymph node | H&E | data, paper | Train: 270 (160 normal, 110 with metastases); Test: 130 | images + binary masks | classi + seg | WSI | slide level analysis | 2016 |
| CAMELYON17 [11] | Lymph node | H&E | data, paper | Train: 500 (100 patients, 5 slides each); Test: 500 | images + binary masks | classi + seg | WSI | patient level analysis | 2017 |
| CAMELYON [12] | Breast (Lymph node) | H&E | paper | 1399 wsi | | | wsi | | 2017 |
| CATCH [88] | Skin (Canine) | H&E | data, paper | 350 wsi, 12.424 polygon annotations (13 classes) | images + contours (JSON) | seg + classi | wsi | 40x Aperio ScanScope CS2 (Leica) | 2022 |
| Cellseg [13] | multiple | multiple | data, paper, github | | images + limited labeled patches | instance (cell) segmentation | wsi | | 2022 |
| Chaoyang [57] | Colon | H&E | data, github, paper | Train: 111 normal, 842 serrated, 1404 adenocarcinoma, 664 adenoma; Test: 705 normal, 321 serrated, 840 adenocarcinoma, 273 adenoma samples | images + label | classi | patch (512×512) | | 2021 |
| CoCaHis [61] | Colon | H&E | data, paper | 82 (19 patients) | images + masks from different annotators | seg | patch | | 2021 |
| CoNIC 2022 [14] | Colon | H&E | data, github, paper | 4981 patches with 431.913 nuclei of 6 types | image + instance seg mask + classi mask | seg + classi + reg | patch (256x256) | 20x | 2022 |
| CoNSeP - HoVer-Net [15] | Colorectal adenocarcinoma | H&E | data, paper | Train: 27 images, Test: 14 images, 24.319 nuclei | images + nuclei (location + class) | instance seg + classi (7: other, inflammatory, healthy epithelial, dysplastic/malignant epithelial, fibroblast, muscle, endothelial) | patch (1000x1000) | 40x (UHCW) | 2019 |
| CPM-15 [16] | brain | H&E | data | 15 (2905 nuclei) | images + nuclei seg + label | seg + classi | patch (400x400, 600x1000) | 20x, 40x (TCGA) | |
| CPM-17 [17] | brain | H&E | data, paper | Train: 32, test: 32 (7570 nuclei) | images + nuclei seg + label | seg + classi | patch (500x500 to 600x600) | 20x, 40x (TCGA) | 2019 |
| CPTAC-AML | Marrow, Blood | | data | 120 images from 88 patients | | | | 40x | 2020 |
| CPTAC-BRCA | Breast | | data | 642 images from 134 patients | | | | 40x | 2021 |
| CPTAC-CCRCC [99] | Clear Cell Renal Cell Carcinoma | H&E | data | 783 tissue slides and 128 non-CCRCC tissue slides | images + label | classi (2) | wsi | - | 2024 |
| CPTAC-COAD | Colon | | data | 373 images from 106 patients | | | | 40x | 2021 |
| CPTAC-OV | Ovary | | data | 222 images from 102 patients | | | | 40x | 2021 |
| CRAG - MILD-Net [18] | Colon | H&E | data, paper | Train: 173, Valid: 40 | image + segmentation | instance seg | patch (around 1500x1500) | 20x | 2019 |
| CRCHisto [19] | Colon | H&E | data, paper | 100 images, 29.756 nuclei (10 wsi, 9 patients) | images + point nuclei class label | seg + classi (epithelial, inflammatory, fibroblast, miscellaneous) | patch (500x500) | 20x - Omnyx VL120 (UHCW) | 2016 |
| CRC-TP [20] | CRC | H&E | data, paper | 280k patches (from 20 wsi) | images + tissue phenotypes | classi | patch | | 2020 |
| CryoNuSeg [21] | multiple (10: adrenal gland, larynx, lymph nodes, mediastinum, pancreas, pleura, skin, testes, thymus, and thyroid gland) | H&E | data, github, paper | 8000 nuclei from 30 patches (from 30 wsi) | images + segmentation masks + binary labels | nuclei segmentation | patch (512x512) | 40x (from TCGA) | 2021 |
| DHMC-Kidney [85] | Renal Cell Carcinoma | H&E | data, paper | 563 wsi | images + label | classi | wsi | 20x - Aperio AT2 | 2021 |
| DHMC-Lung [86] | Lung Adenocarcinoma | H&E | data, paper | 143 wsi | images + label | classi | wsi | 20x or 40x - Aperio AT2 | 2019 |
| DiagSet [58] | Prostate | H&E | data, paper | >2.6M patches (from 430 scans) | 430 fully annotated scans, 4675 scans with binary diagnosis, and 46 scans with diagnosis given independently by a group of 9 histopathologists | classi | patch (256×256) | 5x, 10x, 20x, 40x - Hamamatsu C12000-22 | 2021 |
| DigestPath2019 - signet ring cell [22] | multiple (Gastric, Intestine) | H&E | data, paper | Train: 460, Test: 226 | images + cell bounding boxes | cell detection | patch (avg 2kx2k) | 40x | 2019 |
| DigestPath2019 - colonoscopy tissue segment [23] | Colon | H&E | data, paper | Train: 660, Test: 212 | images + lesion annotation | seg + classi (benign vs malignant) | patch (avg 5kx5k) | 20x | 2019 |
| DLBCL-morphology [69] | Lymph Node | Multiple (H&E, IHC) | data, paper | 52.194 patches - 246 images from 209 patients | images + ROIs | | wsi - patch (240x240) | 40x - Aperio AT2 | 2022 |
| ENDO-AID [] | Endometrial Carcinoma | H&E | data, info | Test: 91 wsi | images + assessments from 15 pathologists | grading score | wsi | 0.5um/px - 3DHistech P1000 | 2022 |
| Follicle counting [98] | Ovary | H&E | data, paper, code | 643 cut slices (from 92 mice) | images + label | Object Detection (3 classes) | ROIs | Panoramic 250 Flash, Slide Scanner (3DHISTECH Ltd. HUNGARY) | 2024 |
| Gelasca et al. [26] | Breast | H&E | data | 50 | images (malignant/benign, 1.895 nuclei) + masks | classi + seg | Patch (896x768; 768x512) | | |
| GlaS [24] | Colorectal (Gland) | H&E | data, paper | 165 | Train: 85 (37 benign, 48 malignant); Test: 80 (37 benign, 43 malignant) | classi + seg | Patch (diff sizes - few hundred px) | 20x - Zeiss MIRAX MIDI | 2015 |
| Gleason_CNN [25] | Prostate | H&E | data, github, paper | 5 tissue microarrays (200-300 spots) | images + patch and pixel annotation | classi | patch (3100x3100) | 40x - NanoZoomer-XR Digital slide scanner, Hamamatsu | 2018 |
| GTEx Portal [77] | Multiple | H&E | data, paper | 948 patients (multiple slides per patient) | images + genes + metadata | | | | |
| HER2 Contest [60] | Breast | Multiple (H&E, IHC) | data, paper | 172 wsi from 86 patients | image + label (scoring) | classi (4 classes: 0, 1+, 2+, 3+) | wsi | 4x-40x - Hamamatsu NanoZoomer C9600 | 2016 |
| HEROHE - ECDP2020 [27a], [27b] | Breast | H&E | data, paper | Train: 359 (positive: 144, negative: 215), Test: 150 (positive: 60, negative: 90) | images + binary label | classi | wsi | 20x - 3D Histech Pannoramic 1000 | 2020 |
| HER2 tumor ROIs [70] | Breast | H&E | data, paper | 273 | images + ROIs + label | classi (binary) | patch (512x512) | 20x - Aperio ScanScope | 2022 |
| HISTAI [104] | multiple | H&E, IHC, Special | data, paper | 112,801 WSI | images + pathology reports | | wsi | 20x, 40x, Leica Aperio | 2025 |
| HunCRC [71] | Colon | H&E | data, github, github, paper | 101,389 patches - 200 wsi (from 200 patients) | images + label | classi (10) | wsi - patch (512x512) | 40x - 3DHistech Pannoramic 1000 | 2022 |
| IMP-CRS 2024 [81a], [81b], [81c] | Colorectal | H&E | data, paper | Train: 4433 wsi, Test: 900 wsi | images + label | classi (3) | wsi | 40x - Leica GT450 | 2024 |
| Ivy-GAP [103] | Brain | H&E, ISH | data, paper | 41 cases, 42 tumors, 947 blocks, 30304 sections | images + RNA-seq + clinical data | | wsi | 20x | 2018 |
| Janowczyk et al. [28] | Breast | H&E | data, github | 143 | images (12.000 nuclei) + masks | semantic seg | Patch (2000x2000) | 40x | 2015 |
| Kather et al. [29] | Colon | H&E | data, github, paper | Train: 100k (86 wsi), Valid: 7180 (25 wsi) | image + label (9 tissue types) | classi | patch (224x224) | | 2018 |
| Kather et al. [30] | Colon | H&E | data, data, data, github, paper | | | seg (tumor detection) + classi (MSI detection) | | | 2019 |
| KIMIA Path24C [65] | multiple | multiple (IHC, H&E, Masson's trichrome) | data, paper | Train: 22.591, Valid: 1.325 from 24 wsi | | | patch (1000x1000) | 20x - TissueScope LE 1.0 | 2021 |
| Komura et al. [64] | multiple (32) | H&E | data, paper | 271.700 | images + cancer type | classi | patch (256x256) | 6 magnifications (from TCGA) | 2021 |
| Kumar [31] | multiple (8) | H&E | data, paper | Train: 16 (13.372 nuclei), test same organ (4.130 nuclei): 8, test diff organ (4.121 nuclei): 6 | images + nuclei seg + label | seg + classi | patch (1000x1000) | 40x (TCGA) | 2017 |
| LC25000 [54] | multiple (lung, colon) | H&E | data, paper | 25.000 (5 classes) | images + label | classi | patch (768x768) | 60x | 2019 |
| Lizard [32] | Colon | H&E | data, paper | 495.179 nuclei | images + instance seg mask | seg | patch | 20x (DigestPath + CRAG + GlaS + PanNuke + CoNSeP + TCGA) | 2021 |
| LubLung [97] | Lung | H&E | data, paper | 23,199 patches (9 classes) | images + labels | classi | patch (87x87) | | 2021 |
| LYON19 [33] | Multiple (Breast, Colon, Prostate) | IHC | data, paper | Test: 441 ROIs - 171.166 cells | images + coordinates of cells | cell detection | patch | Pannoramic 250 Flash II scanner | 2019 |
| MBM [94] | bone | H&E | data, paper | 44 patches | images + mask | cell detection | patch (600x600) | 40x | 2017 |
| MHIST [79] | colorectal polyps | H&E | data, paper | 3,152 patches (train: 2,175; test: 977) | images + annotations + annotator agreement | classi (2) | patch (224x224) | 40x - Aperio AT2 | 2021 |
| MIDOG 2021 [34] | Breast | H&E | data, paper | 200 wsi: 50 wsi per scanner, 4 scanners | images + roi | detection of mitotic figures | wsi | | 2021 |
| MIDOG 2022 [35] | multiple (6 for train, 10 for test) | H&E | data | Train: 405 cases, 9501 mitotic annotations | images + seg | seg | Patch | | 2022 |
| MIDOG++ [93] | multiple | H&E | data, paper | 503 ROIs + 12k mitotic figures | images + object centers | detection of mitotic figures | ROIs | | 2023 |
| MITOS_WSI_CCMCT [89] | Skin (Canine) | H&E | data, paper | 32 wsi | images + mitotic figures (45k) / hard negatives (28k) | detection of mitotic figures | wsi | 40x Aperio ScanScope CS2 (Leica) | 2019 |
| MITOS_WSI_CMC [90] | Breast (Canine) | H&E | data, paper | 21 wsi | images + mitotic figures (14k) / hard negatives (35k) | detection of mitotic figures | wsi | 40x Aperio ScanScope CS2 (Leica) | 2020 |
| MoNuSAC 2020 [36] | multiple (Lung, Prostate, Kidney, Breast) | H&E | data, paper | 31.411 nuclei from 209 images | images + mask | instance seg + classi | patch (81x113 to 1422x2162) | 40x (TCGA) | 2020 |
| MoNuSeg [37a], [37b] | multiple (7) | H&E | data, github, paper | Train: 30, Test: 14 | images (Train: 22.000 nuclei, Test: 7000) + masks | instance seg | Patch (1000x1000) | 40x (from TCGA) | 2018 |
| Multi-Scanner SCC [92] | Skin (Canine) | H&E | data, paper | 44 samples × 5 scanners (220 wsi) | images + contours (JSON) | registration + segmentation | wsi | 5 scanners | 2023 |
| NADT-Prostate [72] | Prostate | Multiple (H&E, IHC) | data, paper | 1401 images from 37 patients | | | | 20x | 2021 |
| Naylor et al. [38] | Breast | H&E | data, paper | 50 | images (4.022 nuclei, 11 patients) + masks | seg | Patch (512x512) | 40x | 2018 |
| NuClick [59] | Lymphocyte | IHC | data, paper | Train: 671, Valid: 200 | images + mask | seg | patch (256x256) | | 2020 |
| NuCLS [39] | Breast | H&E | data, paper | 220.000 nuclei from 3.944 roi from 125 patients | roi + bounding box + classification | nuclear detection + classi + seg | patch | (TCGA) | 2021 |
| OCELOT [78] | Multiple (Bladder, Endometrium, Head-and-neck, Kidney, Prostate, Stomach) | H&E | data, paper, website | 304 Whole Slide Images (WSIs) (tr:val:te 6:2:2) | images + cell annotation + tissue annotation | cell and tissue detection (multitask learning) | patch (1024x1024) | (TCGA) | 2023 |
| Osteosarcoma-Tumor-Assessment | Bone | H&E | data | 1144 images from 4 | | classi (3: non-tumor, viable tumor, necrosis) | patch (1024x1024) | 10x | 2019 |
| Ovarian Bevacizumab Response [73a], [73b] | Ovary | H&E | data, paper, paper | 288 (78 patients) | images + clinical information | classi (treatment effectiveness) | wsi (avg 54342x41048) | 20x - Leica AT2 | 2021 |
| PAIP2019 [40] | Liver | H&E | data, paper | Train: 50, Valid: 10, Test: 40 | images + binary mask | cancer seg | wsi | 20x - Aperio AT2 | 2019 |
| PAIP2020 [41] | Colon | H&E | data, github | Train: 47, Valid: 31, Test: 40 | images + binary mask | cancer seg | wsi | 40x - Aperio AT2 | 2020 |
| PAIP2021 [42] | Multiple (Colon, Prostate, Pancreas) | H&E | data, paper | Train: 150, Valid: 30, Test: 60 | wsi + xml gt | semantic seg | wsi | 20x - Aperio AT2 | 2021 |
| PAIP2023 | multiple organs | H&E | data | | | | | | 2023 |
| The PANDA challenge [43] | Prostate | H&E | data, paper | Train: 10.616, Valid: 393, Internal test: 545, External test: 1071 | images + label | classi | wsi | slide level analysis | 2020 |
| Pan-tumor T-lymphocyte dataset [91] | Multiple | IHC (CD3) | data, paper | 92 ROIs | images + cell annotations | detection + classification | wsi | 40x NanoZoomer 2.0-HT (Hamamatsu) | 2023 |
| SegPath [87] | multiple | H&E | data, paper | 158,687 patches | images + label + mask | semantic seg | patch | 20x - Zeiss MIRAX MIDI | 2023 |
| PanNuke [44a], [44b] | multiple (19) | H&E | data, github, paper, paper | 189.744 nuclei (from >20k wsi) | images + nuclei (position + classi: neoplastic, connective, non-neoplastic epithelial, dead, inflammatory) | instance seg + classi | patch | 40x | 2019 |
| PatchCamelyon [45a], [45b] | Lymph node | H&E | data, github, paper | 327.680 | images + binary label | classi | Patch (96x96) | 10x | 2018 |
| PATHVQA [80] | Multiple | Multiple | data, paper, github | 32,799 open-ended questions from 4,998 images | image + question + answer | VQA | patch/image | | 2020 |
| Post-NAT-BRCA [74] | Breast | H&E | data, paper | 96 images from 54 patients | images + clinical info + annotation of tumor cellularity and cell labels | | wsi | 20x - Aperio | 2021 |
| Prostate Fused-MRI-Pathology [83] | Prostate | H&E | data | 114 images from 16 patients | images + tumor annotations + mpMRI | | wsi | 20x - Aperio | 2016 |
| PUMA [102] | Melanoma | H&E | data, paper | 206 ROIs (primary: 103, metastatic: 103) | images + nuclei and tissue annotations + context image | nuclei and tissue segmentation | patch (1024x1024), context (5120x5120) | 40x - Nanozoomer XR C12000–21/–22 | 2024 |
| RINGS [96] | prostate | H&E | data, paper | train: 1000, test: 500 with 18'851 glands | images + mask | gland segmentation and tumor segmentation | patch (1500x1500) | 40x | 2021 |
| SegLungTCGA [97] | Lung | H&E | data, paper | 454 images + file mapping info | images | segmentation (9 classes) | segmented (87x87 patches) wsi | (from TCGA) | 2021 |
| SegPC-2021 [46a], [46b], [46c], [46d] | Blood | Jenner-Giemsa | data, github, report | 775 images, Train: 298, Valid: 200, Test: 277 | images + nucleus and cytoplasm | plasma cell segmentation | | | 2021 |
| SICAPv2 [55] | Prostate | H&E | data, paper | 155 (from 95 patients) | images + global Gleason scores and patch-level Gleason grades | classi | wsi | 40x - Ventana iScan Coreo | 2020 |
| SLN-Breast [75] | Breast | H&E | data, paper | 130 wsi from 78 patients | images + binary label | classi (binary - cancer/no cancer) | wsi | 20x - Leica Aperio AT2 | 2021 |
| SPIDER [101] | Multiple (Skin, Colorectal, Thorax) | H&E | data, paper | Skin: 159,854 patches, Colorectal: 77,182 patches, Thorax: 78,307 patches | images + labels | classification | patch (1120x1120) | 20x magnification | 2025 |
| SPIE-AAPM_NCI BreastPathQ [47] | Breast | H&E | data, paper | 2579 patches from 96 wsi (64 patients) | images + score | regression | patches | 20x | 2019 |
| TCGA [48] | Multiple | H&E | data, data | > 11k | | | WSI | | |
| TCGA-TIL-WSI [76] | Multiple (13) | H&E | data, github, paper | 5200 | | | | (from TCGA) | 2019 |
| The Digital Brain Tumour Atlas [100] | Brain | H&E | data, paper | 3,115 histological slides of 2,880 patients | images + label | classi | wsi | - | 2022 |
| TIGER [49] | Breast | H&E | data, paper, github, github | WSIROIS: 195 wsi, WSIBULK: 93, WSITILS: 82 | images + rois + label (7) | detection + segmentation + TILs scoring | wsi | (from TCGA, RUMC, JB) | 2022 |
| TissueNet | Uterine cervix | H&E | data, github | 1,016 WSIs; 5,926 patches (1200x1200 px) | images + annotation + metadata + labels | classi (4) | wsi + patches | MIRAX, Aperio, Hamamatsu | 2020 |
| TNBC [50] | Breast | H&E | data, data, paper | 50 images, 4022 cells (11 patients) | images + nuclei seg + label | seg + classi | patch (512x512) | 40x - Philips Ultra Fast Scanner (Curie Inst.) | 2019 |
| Tolkach Y. et al. [84] | oesophageal adenocarcinomas | H&E | data, paper | UKK1: 34,704 patches from 22 wsi (20 patients); WNS: 121,642 patches from 62 wsi (15 patients); CHA: 32,796 patches from 214 wsi (69 patients); TCGA: 178,187 patches from 22 wsi (22 patients) | images + label | classi (11) | patch (256x256) | 40x - Nanozoomer S360 | 2023 |
| TUPAC16 [51] | Breast | H&E | data, paper | 500 | images + label | classi (wsi level) | WSI | 40x (from TCGA) | 2019 |
| TUPAC16 - aux [52] | Breast - mitoses | H&E | data | 73 | images + locations | seg | patch | 40x (from TCGA) Leica SCN400 | 2019 |
| UniToPatho [56] | Colon | H&E | data, paper | 9.536 from 292 wsi | images + label (6 classes) | classi | patch | 20x - Hamamatsu Nanozoomer S210 | 2021 |
| UPENN-GBM [82] | glioblastoma | H&E | data, paper | 71 wsi from 34 patients | images + clinical data + mpMRI | | WSI | 40x | 2022 |
| VisioMel | Melanoma | H&E | data, code | train: 1342 wsi, test: 600, valid: 1200, 16 WSIs annotated | images + annotation + clinical metadata + label | classi (2) | | | 2023 |
| WSSS4LUAD [53] | Lung | H&E | data, paper | 87 (Train: 53, valid: 12, Test: 12) | Train: 10.091 patches, Valid: 40 patches, Test: 80 patches; image level for train, pixel level for test/valid | tissue semantic seg | wsi | (67 GDPH, 20 TCGA) | 2021 |
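
A list like this is often easier to use once it can be filtered by task, staining or image level. The snippet below is only a minimal sketch of that idea: it copies four rows from the table by hand into Python records and filters them by substring. The `Dataset` class, its field names and the `find` helper are hypothetical names chosen for illustration; they are not part of this repository.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    """One table row (hypothetical structure; only a few columns are kept)."""
    name: str
    organs: str
    staining: str
    task: str
    level: str  # content of the WSI/Patch column
    year: int


# Four rows copied by hand from the table above.
CATALOG = [
    Dataset("CAMELYON16", "Lymph node", "H&E", "classi + seg", "WSI", 2016),
    Dataset("PAIP2019", "Liver", "H&E", "cancer seg", "wsi", 2019),
    Dataset("MHIST", "colorectal polyps", "H&E", "classi (2)", "patch (224x224)", 2021),
    Dataset("NuClick", "Lymphocyte", "IHC", "seg", "patch (256x256)", 2020),
]


def find(catalog, *, task=None, level=None, staining=None):
    """Return rows whose fields contain the given substrings (case-insensitive)."""
    def matches(value, wanted):
        return wanted is None or wanted.lower() in value.lower()

    return [
        d for d in catalog
        if matches(d.task, task)
        and matches(d.level, level)
        and matches(d.staining, staining)
    ]


# Segmentation datasets distributed as whole-slide images.
seg_wsi = [d.name for d in find(CATALOG, task="seg", level="wsi")]
# Datasets with IHC staining.
ihc = [d.name for d in find(CATALOG, staining="IHC")]
print(seg_wsi, ihc)
```

Substring matching is used because the table abbreviates tasks freely ("seg", "classi + seg", "cancer seg"); a stricter vocabulary would need the rows to be normalized first.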

References

[1a] Li, Zhang, et al. "Computer-aided diagnosis of lung carcinoma using deep learning-a pilot study." arXiv preprint arXiv:1803.05471 (2018).

[1b] Li, Zhang, et al. "Deep learning methods for lung cancer segmentation in whole-slide histopathology images—the acdc@ lunghp challenge 2019." IEEE Journal of Biomedical and Health Informatics 25.2 (2020): 429-440.

[2] Hosseini, Mahdi S., et al. "Atlas of digital pathology: A generalized hierarchical histological tissue type-annotated database for deep learning." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019.

[3] Borovec, Jiří, et al. "ANHIR: automatic non-rigid histological image registration challenge." IEEE transactions on medical imaging 39.10 (2020): 3042-3052.

[4] Gamper, Jevgenij, and Nasir Rajpoot. "Multiple instance captioning: Learning representations from histopathology textbooks and articles." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2021.

[5] Aresta, Guilherme, et al. "Bach: Grand challenge on breast cancer histology images." Medical image analysis 56 (2019): 122-139.

[6] Xu, Feng, et al. "Predicting axillary lymph node metastasis in early breast cancer using deep learning on primary tumor biopsy slides." Frontiers in oncology 11 (2021): 759007.

[7] Amgad, Mohamed, et al. "Structured crowdsourcing enables convolutional segmentation of histology images." Bioinformatics 35.18 (2019): 3461-3467.

[8] Spanhol, Fabio A., et al. "A dataset for breast cancer histopathological image classification." IEEE transactions on biomedical engineering 63.7 (2015): 1455-1462.

[9] Aksac, Alper, et al. "BreCaHAD: a dataset for breast cancer histopathological annotation and diagnosis." BMC research notes 12.1 (2019): 1-3.

[10] Bejnordi, Babak Ehteshami, et al. "Diagnostic assessment of deep learning algorithms for detection of lymph node metastases in women with breast cancer." JAMA 318.22 (2017): 2199-2210.

[11] Bandi, Peter, et al. "From detection of individual metastases to classification of lymph node status at the patient level: the camelyon17 challenge." IEEE transactions on medical imaging 38.2 (2018): 550-560.

[12] Litjens, Geert, et al. "1399 H&E-stained sentinel lymph node sections of breast cancer patients: the CAMELYON dataset." GigaScience 7.6 (2018): giy065.

[13] Kwanyoung Lee, Hyungjo Byun, Hyunjung Shim Proceedings of The Cell Segmentation Challenge in Multi-modality High-Resolution Microscopy Images, PMLR 212:1-11, 2023.

[14] Graham, Simon, et al. "Conic: Colon nuclei identification and counting challenge 2022." arXiv preprint arXiv:2111.14485 (2021).

[15] Graham, Simon, et al. "Hover-net: Simultaneous segmentation and classification of nuclei in multi-tissue histology images." Medical Image Analysis 58 (2019): 101563.

[16] Vu, Quoc Dang, et al. "Methods for segmentation and classification of digital microscopy tissue images." Frontiers in bioengineering and biotechnology (2019): 53.

[17] Vu, Quoc Dang, et al. "Methods for segmentation and classification of digital microscopy tissue images." Frontiers in bioengineering and biotechnology (2019): 53.

[18] Graham, Simon, et al. "MILD-Net: Minimal information loss dilated network for gland instance segmentation in colon histology images." Medical image analysis 52 (2019): 199-211.

[19] Sirinukunwattana, Korsuk, et al. "Locality sensitive deep learning for detection and classification of nuclei in routine colon cancer histology images." IEEE transactions on medical imaging 35.5 (2016): 1196-1206.

[20] Javed, Sajid, et al. "Cellular community detection for tissue phenotyping in colorectal cancer histology images." Medical image analysis 63 (2020): 101696.

[21] Mahbod, Amirreza, et al. "CryoNuSeg: A dataset for nuclei instance segmentation of cryosectioned H&E-stained histological images." Computers in biology and medicine 132 (2021): 104349.

[22] Li, Jiahui, et al. "Signet ring cell detection with a semi-supervised learning framework." International conference on information processing in medical imaging. Springer, Cham, 2019.

[23] Li, Jiahui, et al. "Signet ring cell detection with a semi-supervised learning framework." International conference on information processing in medical imaging. Springer, Cham, 2019.

[24] Sirinukunwattana, Korsuk, et al. "Gland segmentation in colon histology images: The glas challenge contest." Medical image analysis 35 (2017): 489-502.

[25] Arvaniti, Eirini, et al. "Automated Gleason grading of prostate cancer tissue microarrays via deep learning." Scientific reports 8.1 (2018): 1-11.

[26]

[27a] Conde-Sousa, Eduardo, et al. "HEROHE Challenge: assessing HER2 status in breast cancer without immunohistochemistry or in situ hybridization." arXiv preprint arXiv:2111.04738 (2021).

[27b] La Barbera, David, et al. "Detection of her2 from haematoxylin-eosin slides through a cascade of deep learning classifiers via multi-instance learning." Journal of Imaging 6.9 (2020): 82.

[28]

[29] Kather, Jakob Nikolas, Halama, Niels, & Marx, Alexander. (2018). 100,000 histological images of human colorectal cancer and healthy tissue (v0.1) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.1214456

[30] Kather, Jakob Nikolas, et al. "Deep learning can predict microsatellite instability directly from histology in gastrointestinal cancer." Nature medicine 25.7 (2019): 1054-1056.

[31] Kumar, Neeraj, et al. "A dataset and a technique for generalized nuclear segmentation for computational pathology." IEEE transactions on medical imaging 36.7 (2017): 1550-1560.

[32] Graham, Simon, et al. "Lizard: a large-scale dataset for colonic nuclear instance segmentation and classification." Proceedings of the IEEE/CVF International Conference on Computer Vision. 2021.

[33] Swiderska-Chadaj, Zaneta, et al. "Learning to detect lymphocytes in immunohistochemistry with deep learning." Medical image analysis 58 (2019): 101547.

[34] Aubreville, Marc, et al. "Mitosis domain generalization in histopathology images--The MIDOG challenge." arXiv preprint arXiv:2204.03742 (2022).

[35] Aubreville, Marc, et al. "Mitosis domain generalization challenge (2021)." 25th International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI 2022). 2022.

[36] Verma, Ruchika, et al. "MoNuSAC2020: A multi-organ nuclei segmentation and classification challenge." IEEE Transactions on Medical Imaging 40.12 (2021): 3413-3423.

[37a] Kumar, Neeraj, et al. "A multi-organ nucleus segmentation challenge." IEEE transactions on medical imaging 39.5 (2019): 1380-1391.

[37b] Kumar, Neeraj, et al. "A dataset and a technique for generalized nuclear segmentation for computational pathology." IEEE transactions on medical imaging 36.7 (2017): 1550-1560.

[38] Naylor, Peter, et al. "Segmentation of nuclei in histopathology images by deep regression of the distance map." IEEE transactions on medical imaging 38.2 (2018): 448-459.

[39] Amgad, Mohamed, et al. "Nucls: A scalable crowdsourcing, deep learning approach and dataset for nucleus classification, localization and segmentation." arXiv preprint arXiv:2102.09099 (2021).

[40] Kim, Yoo Jung, et al. "PAIP 2019: Liver cancer segmentation challenge." Medical Image Analysis 67 (2021): 101854.

[41]

[42] Nateghi, Ramin, and Fattaneh Pourakpour. "Perineural Invasion Detection in Multiple Organ Cancer Based on Deep Convolutional Neural Network." arXiv preprint arXiv:2110.12283 (2021).

[43] Bulten, Wouter, et al. "Artificial intelligence for diagnosis and Gleason grading of prostate cancer: the PANDA challenge." Nature medicine 28.1 (2022): 154-163.

[44a] Gamper, Jevgenij, et al. "Pannuke: an open pan-cancer histology dataset for nuclei instance segmentation and classification." European congress on digital pathology. Springer, Cham, 2019.

[44b] Gamper, Jevgenij, et al. "Pannuke dataset extension, insights and baselines." arXiv preprint arXiv:2003.10778 (2020).

[45a] Veeling, Bastiaan S., et al. "Rotation equivariant CNNs for digital pathology." International Conference on Medical image computing and computer-assisted intervention. Springer, Cham, 2018.

[45b] Bejnordi, Babak Ehteshami, et al. "Diagnostic assessment of deep learning algorithms for detection of lymph node metastases in women with breast cancer." JAMA 318.22 (2017): 2199-2210.

[46a] Gupta, Anubha, et al. "Segpc-2021: Segmentation of multiple myeloma plasma cells in microscopic images." IEEE Dataport 1.1 (2021): 1.

[46b] Gupta, Anubha, et al. "GCTI-SN: Geometry-inspired chemical and tissue invariant stain normalization of microscopic medical images." Medical Image Analysis 65 (2020): 101788.

[46c] Gehlot, Shiv, Anubha Gupta, and Ritu Gupta. "Ednfc-net: Convolutional neural network with nested feature concatenation for nuclei-instance segmentation." ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2020.

[46d] Gupta, Anubha, et al. "PCSeg: Color model driven probabilistic multiphase level set based tool for plasma cell segmentation in multiple myeloma." PloS one 13.12 (2018): e0207908.

[47] Petrick, Nicholas A., et al. "SPIE-AAPM-NCI BreastPathQ Challenge: an image analysis challenge for quantitative tumor cellularity assessment in breast cancer histology images following neoadjuvant treatment." Journal of Medical Imaging 8.3 (2021): 034501.

[48] R. L. Grossman, A. P. Heath, V. Ferretti, H. E. Varmus, D. R. Lowy, W. A. Kibbe, and L. M. Staudt. Toward a shared vision for cancer genomic data. New England Journal of Medicine, 375(12):1109–1112, 2016.

[49] Shephard, Adam, et al. "TIAger: Tumor-Infiltrating Lymphocyte Scoring in Breast Cancer for the TiGER Challenge." arXiv preprint arXiv:2206.11943 (2022).

[50] P. Naylor, M. Laé, F. Reyal and T. Walter, "Segmentation of Nuclei in Histopathology Images by Deep Regression of the Distance Map," in IEEE Transactions on Medical Imaging, vol. 38, no. 2, pp. 448-459, Feb. 2019, doi: 10.1109/TMI.2018.2865709

[51] Veta, Mitko, et al. "Predicting breast tumor proliferation from whole-slide images: the TUPAC16 challenge." Medical image analysis 54 (2019): 111-121.

[52] Veta, Mitko, et al. "Predicting breast tumor proliferation from whole-slide images: the TUPAC16 challenge." Medical image analysis 54 (2019): 111-121.

[53] Han, Chu, et al. "WSSS4LUAD: Grand Challenge on Weakly-supervised Tissue Semantic Segmentation for Lung Adenocarcinoma." arXiv preprint arXiv:2204.06455 (2022).

[54] Borkowski, Andrew A., et al. "Lung and colon cancer histopathological image dataset (lc25000)." arXiv preprint arXiv:1912.12142 (2019).

[55] Silva-Rodríguez, Julio, et al. "Going deeper through the Gleason scoring scale: An automatic end-to-end system for histology prostate grading and cribriform pattern detection." Computer Methods and Programs in Biomedicine 195 (2020): 105637.

[56] Barbano, Carlo Alberto, et al. "UniToPatho, a labeled histopathological dataset for colorectal polyps classification and adenoma dysplasia grading." 2021 IEEE International Conference on Image Processing (ICIP). IEEE, 2021.

[57] Zhu, Chuang, et al. "Hard Sample Aware Noise Robust Learning for Histopathology Image Classification." IEEE Transactions on Medical Imaging 41.4 (2021): 881-894.

[58] Koziarski, Michał, et al. "DiagSet: a dataset for prostate cancer histopathological image classification." arXiv preprint arXiv:2105.04014 (2021).

[59] Koohbanani, Navid Alemi, et al. "NuClick: a deep learning framework for interactive segmentation of microscopic images." Medical Image Analysis 65 (2020): 101771.

[60] Qaiser, Talha, et al. "Her 2 challenge contest: a detailed assessment of automated her 2 scoring algorithms in whole slide images of breast cancer tissues." Histopathology 72.2 (2018): 227-238.

[61] Sitnik, Dario, et al. "A dataset and a methodology for intraoperative computer-aided diagnosis of a metastatic colon cancer in a liver." Biomedical Signal Processing and Control 66 (2021): 102402.

[62] Brancati, Nadia, et al. "Bracs: A dataset for breast carcinoma subtyping in h&e histology images." arXiv preprint arXiv:2111.04740 (2021).

[63] Xu, Gang, et al. "Camel: A weakly supervised learning framework for histopathology image segmentation." Proceedings of the IEEE/CVF International Conference on Computer Vision. 2019.

[64] Komura, Daisuke, et al. "Universal encoding of pan-cancer histology by deep texture representations." Cell Reports 38.9 (2022): 110424.

[65] Shafiei, Sobhan, et al. "Colored Kimia Path24 Dataset: Configurations and Benchmarks with Deep Embeddings." arXiv preprint arXiv:2102.07611 (2021).

[66] Weitz, Philippe, et al. "ACROBAT-Automatic Registration of Breast Cancer Tissue."

[67] Matek, Christian, et al. "Human-level recognition of blast cells in acute myeloid leukaemia with convolutional neural networks." Nature Machine Intelligence 1.11 (2019): 538-544.

[68] Matek, Christian, et al. "Highly accurate differentiation of bone marrow cell morphologies using deep neural networks on a large image data set." Blood, The Journal of the American Society of Hematology 138.20 (2021): 1917-1927.

[69] Vrabac, Damir, et al. "DLBCL-Morph: Morphological features computed using deep learning for an annotated digital DLBCL image set." Scientific Data 8.1 (2021): 1-8.

[70] Farahmand, Saman, et al. "Deep learning trained on hematoxylin and eosin tumor region of Interest predicts HER2 status and trastuzumab treatment response in HER2+ breast cancer." Modern Pathology 35.1 (2022): 44-51.

[71] Pataki, Bálint Ármin, et al. "HunCRC: annotated pathological slides to enhance deep learning applications in colorectal cancer screening." Scientific Data 9.1 (2022): 1-7.

[72] Wilkinson, Scott, et al. "Nascent prostate cancer heterogeneity drives evolution and resistance to intense hormonal therapy." European urology 80.6 (2021): 746-757.

[73a] Wang, Ching-Wei, et al. "Histopathological whole slide image dataset for classification of treatment effectiveness to ovarian cancer." Scientific Data 9.1 (2022): 1-5.

[73b] Wang, Ching-Wei, et al. "Weakly supervised deep learning for prediction of treatment effectiveness on ovarian cancer from histopathology images." Computerized Medical Imaging and Graphics 99 (2022): 102093.

[74] Peikari, Mohammad, et al. "Automatic cellularity assessment from post‐treated breast surgical specimens." Cytometry Part A 91.11 (2017): 1078-1087.

[75] Campanella, Gabriele, et al. "Clinical-grade computational pathology using weakly supervised deep learning on whole slide images." Nature medicine 25.8 (2019): 1301-1309.

[76] Saltz, Joel, et al. "Spatial organization and molecular correlation of tumor-infiltrating lymphocytes using deep learning on pathology images." Cell reports 23.1 (2018): 181-193.

[77] Lonsdale, John, et al. "The genotype-tissue expression (GTEx) project." Nature genetics 45.6 (2013): 580-585.

[78] Ryu, Jeongun, et al. "OCELOT: Overlapped Cell on Tissue Dataset for Histopathology." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2023.

[79] Jerry Wei, Arief Suriawinata, Bing Ren, Xiaoying Liu, Mikhail Lisovsky, Louis Vaickus, Charles Brown, Michael Baker, Naofumi Tomita, Lorenzo Torresani, Jason Wei, Saeed Hassanpour, “A Petri Dish for Histopathology Image Analysis”, International Conference on Artificial Intelligence in Medicine (AIME), 12721:11-24, 2021.

[80] Do, Tuong, et al. "Multiple meta-model quantifying for medical visual question answering." Medical Image Computing and Computer Assisted Intervention–MICCAI 2021: 24th International Conference, Strasbourg, France, September 27–October 1, 2021, Proceedings, Part V 24. Springer International Publishing, 2021.

[81a] Oliveira, S.P., Neto, P.C., Fraga, J., Montezuma, D., Monteiro, A., Monteiro, J., Ribeiro, L., Gonçalves, S., Pinto, I.M. and Cardoso, J.S., 2021. CAD systems for colorectal cancer from WSI are still not ready for clinical acceptance. Scientific Reports, 11(1), pp.1-15. https://doi.org/10.1038/s41598-021-93746-z

[81b] Neto, P.C., Oliveira, S.P., Montezuma, D., Fraga, J., Monteiro, A., Ribeiro, L., Gonçalves, S., Pinto, I.M. and Cardoso, J.S., 2022. iMIL4PATH: A semi-supervised interpretable approach for colorectal whole-slide images. Cancers, 14(10). https://doi.org/10.3390/cancers14102489

[81c] Neto, P.C., Montezuma, D., Oliveira, S.P., Oliveira, D., Fraga, J., Monteiro, A., Monteiro, J., Ribeiro, L., Gonçalves, S., Reinhard, S., Zlobec, I., Pinto, I.M. and Cardoso, J.S., 2024. An interpretable machine learning system for colorectal cancer diagnosis from pathology slides. npj Precision Oncology. https://doi.org/10.1038/s41698-024-00539-4

[82] Bakas, Spyridon, et al. "The University of Pennsylvania glioblastoma (UPenn-GBM) cohort: advanced MRI, clinical, genomics, & radiomics." Scientific data 9.1 (2022): 453.

[83] Madabhushi, A., & Feldman, M. (2016). Fused Radiology-Pathology Prostate Dataset (Prostate Fused-MRI-Pathology). The Cancer Imaging Archive. https://doi.org/10.7937/k9/TCIA.2016.tlpmr1am

[84] Tolkach, Yuri, et al. "Artificial intelligence for tumour tissue detection and histological regression grading in oesophageal adenocarcinomas: a retrospective algorithm development and validation study." The Lancet Digital Health 5.5 (2023): e265-e275.

[85] Mengdan Zhu, Bing Ren, Ryland Richards, Matthew Suriawinata, Naofumi Tomita, Saeed Hassanpour, "Development and Evaluation of a Deep Neural Network for Histologic Classification of Renal Cell Carcinoma on Biopsy and Surgical Resection Slides", Scientific Reports;11:7080 (2021).

[86] Jason Wei, Laura Tafe, Yevgeniy Linnik, Louis Vaickus, Naofumi Tomita, Saeed Hassanpour, "Pathologist-level Classification of Histologic Patterns on Resected Lung Adenocarcinoma Slides with Deep Neural Networks", Scientific Reports;9:3358 (2019).

[87] Daisuke Komura, Takumi Onoyama, Koki Shinbo, Hiroto Odaka, Minako Hayakawa, Mieko Ochi, Ranny Rahaningrum Herdiantoputri, Haruya Endo, Hiroto Katoh, Tohru Ikeda, Tetsuo Ushiku, Shumpei Ishikawa, Restaining-based annotation for cancer histology segmentation to overcome annotation-related limitations among pathologists, Patterns, Volume 4, Issue 2, 2023, 100688, https://doi.org/10.1016/j.patter.2023.100688.

[88] Wilm, Frauke, Fragoso, Marco, Marzahl, Christian, Qiu, Jingna, Puget, Chloé, Diehl, Laura, Bertram, Christof A., Klopfleisch, Robert, Maier, Andreas, Breininger, Katharina and Aubreville, Marc. "Pan-tumor CAnine cuTaneous Cancer Histology (CATCH) dataset". Sci Data 9, 588 (2022). https://doi.org/10.1038/s41597-022-01692-w

[89] Bertram, Christof A., Aubreville, Marc, Marzahl, Christian, Maier, Andreas and Klopfleisch, Robert. "A large-scale dataset for mitotic figure assessment on whole slide images of canine cutaneous mast cell tumor". Sci Data 6, 274 (2019). https://doi.org/10.1038/s41597-019-0290-4

[90] Aubreville, Marc, Bertram, Christof A., Donovan, Taryn A., Marzahl, Christian, Maier, Andreas and Klopfleisch, Robert. "A completely annotated whole slide image dataset of canine breast cancer to aid human breast cancer research". Sci Data 7, 417 (2020). https://doi.org/10.1038/s41597-020-00756-z

[91] Wilm, Frauke et al. "Pan-tumor T-lymphocyte detection using deep neural networks: Recommendations for transfer learning in immunohistochemistry". Journal of Pathology Informatics. Vol. 14, 100301 (2023).

[92] Wilm, Frauke, Fragoso, Marco, Bertram, Christof A., Stathonikos, Nikolas, Öttl, Mathias, Qiu, Jingna, Klopfleisch, Robert, Maier, Andreas, Breininger, Katharina and Aubreville, Marc. Proceedings of the German Workshop on Medical Image Processing (BVM), pp 206–211 (2023).

[93] Aubreville, Marc, Wilm, Frauke, et al. "A comprehensive multi-domain dataset for mitotic figure detection". Sci Data 10, 484 (2023). https://doi.org/10.1038/s41597-023-02327-4

[94] Paul Cohen, Joseph, et al. "Count-ception: Counting by fully convolutional redundant counting." Proceedings of the IEEE International conference on computer vision workshops. 2017. https://doi.org/10.1109/ICCVW.2017.9

[95] Huang, Junjia, et al. "Affine-Consistent Transformer for Multi-Class Cell Nuclei Detection." Proceedings of the IEEE/CVF International Conference on Computer Vision. 2023.

[96] Salvi, Massimo, et al. "A hybrid deep learning approach for gland segmentation in prostate histopathological images." Artificial Intelligence in Medicine 115 (2021): 102076.

[97] Rączkowska, A., Paśnik, I., Kukiełka, M. et al. "Deep learning-based tumor microenvironment segmentation is predictive of tumor mutations and patient survival in non-small-cell lung cancer." BMC Cancer 22, 1001 (2022).

[98] Blot, V. et al. "Efficient Precision Control in Object Detection Models for Enhanced and Reliable Ovarian Follicle Counting". In: Sudre, C.H., Mehta, R., Ouyang, C., Qin, C., Rakic, M., Wells, W.M. (eds) Uncertainty for Safe Utilization of Machine Learning in Medical Imaging. UNSURE 2024. Lecture Notes in Computer Science, vol 15167. Springer, Cham. https://doi.org/10.1007/978-3-031-73158-7_17

[99] National Cancer Institute Clinical Proteomic Tumor Analysis Consortium (CPTAC). (2018). The Clinical Proteomic Tumor Analysis Consortium Clear Cell Renal Cell Carcinoma Collection (CPTAC-CCRCC) (Version 13) [Data set]. The Cancer Imaging Archive. https://doi.org/10.7937/k9/tcia.2018.oblamn27

[100] Roetzer-Pejrimovsky, T., Moser, A. C., Atli, B., Vogel, C. C., Mercea, P. A., Prihoda, R., Gelpi, E., Haberler, C., Höftberger, R., Hainfellner, J. A., Baumann, B., Langs, G., & Woehrer, A. (2022). The Digital Brain Tumour Atlas, an open histopathology resource [Data set]. EBRAINS. https://doi.org/10.25493/WQ48-ZGX

[101] Dmitry Nechaev, Alexey Pchelnikov, and Ekaterina Ivanova. Spider: A comprehensive multi-organ supervised pathology dataset and baseline models, 2025. https://doi.org/10.48550/arXiv.2503.02876.

[102] Mark Schuiveling, Hong Liu, Daniel Eek, Gerben E Breimer, Karijn P M Suijkerbuijk, Willeke A M Blokx, Mitko Veta, A novel dataset for nuclei and tissue segmentation in melanoma with baseline nuclei segmentation and tissue segmentation benchmarks, GigaScience, Volume 14, 2025, giaf011, https://doi.org/10.1093/gigascience/giaf011.

[103] Puchalski, Ralph B., et al. "An anatomic transcriptional atlas of human glioblastoma." Science 360.6389 (2018): 660-663.

[104] Dmitry Nechaev, Alexey Pchelnikov, and Ekaterina Ivanova. HISTAI: An Open-Source, Large-Scale Whole Slide Image Dataset for Computational Pathology (2025)

Author

Marie (Duc) Stettler