From Pixels to Cells: Rethinking Cell Detection in Histopathology with Transformers

June 20, 2025

Description

Accurate and efficient cell nuclei detection and classification in histopathological Whole Slide Images (WSIs) are essential for enabling large-scale, quantitative pathology workflows. While segmentation-based methods are commonly used for this task, they introduce substantial computational overhead and produce detailed masks that are often unnecessary for clinical interpretation. In this work, we propose a paradigm shift from segmentation to direct detection, introducing CellNuc-DETR, a transformer-based model that localizes and classifies nuclei without relying on segmentation masks or expensive post-processing. By focusing on clinically relevant outputs—nuclear location and type—CellNuc-DETR achieves significant gains in both accuracy and inference speed. To scale to full-slide inference, we develop a novel strategy that partitions feature maps, rather than images, enabling efficient processing of large tiles with improved context aggregation. We evaluate CellNuc-DETR on PanNuke, CoNSeP, and MoNuSeg, demonstrating state-of-the-art performance, strong generalization across tissue and stain variations, and up to 10x faster inference than segmentation-based methods. Our approach bridges the gap between generic detection frameworks and the practical demands of digital pathology, offering a scalable, accurate, and clinically viable solution for cell-level analysis in WSIs.

Installation

  1. Create a virtual environment
python3 -m venv myenv
  2. Activate the environment
source myenv/bin/activate
  3. Install torch>=2.1.1 and torchvision>=0.16.1. We used CUDA 11.8. You can find more information on the official PyTorch website.
pip3 install torch torchvision --index-url https://download.pytorch.org/whl/cu118
  4. Install the other requirements
pip3 install -r requirements.txt
  5. Build the MultiScaleDeformAttn module (you will need a GPU; follow the authors' instructions)
cd celldetr/models/deformable_detr/ops
./make.sh
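
Before building the MultiScaleDeformAttn ops, it can help to confirm that the installed packages meet the minimum versions stated above (torch>=2.1.1, torchvision>=0.16.1). The following is a minimal sketch that uses only the standard library; the helper names parse_version and check_installed are illustrative and are not part of the repository.

```python
import re
from importlib import metadata

# Minimum versions taken from the installation steps above.
MINIMUMS = {"torch": "2.1.1", "torchvision": "0.16.1"}


def parse_version(text):
    """Turn a string such as '2.1.1+cu118' into the tuple (2, 1, 1)."""
    core = text.split("+", 1)[0]  # drop local build tags like '+cu118'
    numbers = []
    for piece in core.split(".")[:3]:
        match = re.match(r"\d+", piece)  # keep only the leading digits
        numbers.append(int(match.group()) if match else 0)
    while len(numbers) < 3:  # pad '0.16' to (0, 16, 0)
        numbers.append(0)
    return tuple(numbers)


def check_installed(minimums=MINIMUMS):
    """Map each package name to (installed_version_or_None, meets_minimum)."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = (None, False)
            continue
        ok = parse_version(installed) >= parse_version(minimum)
        report[name] = (installed, ok)
    return report


if __name__ == "__main__":
    for name, (installed, ok) in check_installed().items():
        status = "ok" if ok else "missing or too old"
        print(f"{name}: {installed} ({status})")
```

This only inspects package metadata. It does not verify that a GPU is visible, which the build step also requires.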

Usage

Project structure

celldetr/
|-- tools/                   # Contains scripts for training, evaluation, and inference
|   |-- train.py             # Training on COCO format dataset
|   |-- eval.py              # Evaluation on COCO format dataset
|   |-- finetune.py          # Fine-tuning on COCO dataset, can be split into stages.
|   |-- infer.py             # Inference on WSIs
|-- eval/                    # Evaluation module for evaluating model performance (COCO and Cell detection)
|-- util/                    # Utility functions and modules used throughout the project
|-- data/                    # Datasets, transforms and augmentations for cell detection
|-- models/                  # Deep learning models used in CellDetr
|   |-- deformable_detr/     # Implementation of Deformable DETR model
|   |-- window/              # Implementation of window-based DETR model
|   |-- backbone/            # Backbone networks used in the models
|-- engine.py                # Main engine script for coordinating the training and evaluation process

Configuration files

The entire codebase is driven by YAML configuration files. You can find an explanation of how they work in config.md.

  • In the configs/base/ folder you can find default configurations for the datasets, loss functions, model architectures, and others.
  • In the configs/experiments/ folder you can find the configuration files for the experiments reported in the paper.

If you want to use your own data, models, or other components, consider writing the reusable, independent configuration pieces (such as datasets) in configs/base/ and the configuration for your experiments in configs/experiments/.

Pre-trained models

The pre-trained models we provide for inference are trained on 80% of PanNuke, with samples drawn from each of the folds. The reported detection F1-score is computed on the remaining 20% of the data.
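
For reference, the F1-score is the harmonic mean of precision and recall, which reduces to 2TP / (2TP + FP + FN). The sketch below computes it from match counts. How predictions are matched to ground-truth nuclei (for example, the matching distance) is not described in this section, so the counts are assumed to be given, and the function name detection_f1 is illustrative.

```python
def detection_f1(true_positives, false_positives, false_negatives):
    """Return the F1-score in [0, 1] from detection match counts.

    Uses F1 = 2*TP / (2*TP + FP + FN), which is algebraically the same
    as the harmonic mean of precision and recall.
    """
    denominator = 2 * true_positives + false_positives + false_negatives
    if denominator == 0:
        # No predictions and no ground truth: report 0.0 by convention here.
        return 0.0
    return 2 * true_positives / denominator


# Example with made-up counts (not results from the paper).
score = detection_f1(true_positives=80, false_positives=20, false_negatives=20)
print(f"F1 = {100 * score:.2f}")
```

The values in the table are on a 0 to 100 scale, so the function output is multiplied by 100 when printed.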

Resolution  Backbone  Backbone levels  DETR levels  Layers/Queries  F1-det  config  weights
0.25 mpp    Swin-T    3                4            3/600           82.67   config  weights
0.50 mpp    Swin-T    4                4            3/600           81.77   config  weights
0.25 mpp    Swin-L    4                4            6/900           83.06   config  weights
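
The resolution column is given in microns per pixel (mpp), so a slide scanned at a different resolution has to be rescaled to match the chosen model. The sketch below shows the arithmetic only. Whether the repository's inference loader performs this rescaling automatically is not stated here, and the function names are hypothetical.

```python
def rescale_factor(slide_mpp, model_mpp):
    """Factor to multiply slide pixel dimensions by to reach the model's mpp.

    A factor below 1 means downsampling, above 1 means upsampling.
    """
    if slide_mpp <= 0 or model_mpp <= 0:
        raise ValueError("mpp values must be positive")
    return slide_mpp / model_mpp


def rescaled_size(width, height, slide_mpp, model_mpp):
    """Pixel size of a tile after resampling it to the model's resolution."""
    factor = rescale_factor(slide_mpp, model_mpp)
    return round(width * factor), round(height * factor)


# A 1024x1024 tile at 0.25 mpp covers 256 x 256 microns. At 0.50 mpp the
# same physical area is 512x512 pixels.
print(rescaled_size(1024, 1024, slide_mpp=0.25, model_mpp=0.50))
```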

Training and testing

For training and testing, you must create a configuration file that specifies the basic configuration of the experiment, the dataset, the model, the loss, and the loaders. You can find an example in this config file. To run training, we recommend creating your own configuration file, in which you'll have to modify the data directory and model checkpoints, and then running:

python3 -m torch.distributed.launch --use-env --nproc-per-node=NUM_GPUs tools/train.py \
                        --config-file /path/to/config/file

Alternatively, you could also modify the paths from the command line:

python3 -m torch.distributed.launch --use-env --nproc-per-node=NUM_GPUs tools/train.py \
                        --config-file /path/to/config/file \
                        --opts dataset.train.root=/path/to/dataset/ \
                               dataset.val.root=/path/to/dataset/ \
                               dataset.test.root=/path/to/dataset/ \
                               model.checkpoint=/path/to/checkpoint \
                               model.backbone.checkpoint=/path/to/backbone/checkpoint
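
To illustrate what dotted overrides such as dataset.train.root=/path/to/dataset/ mean, here is a minimal sketch that applies key=value strings to a nested dictionary. It is not the repository's parser: the function name apply_overrides is an assumption, and values are kept as plain strings rather than being converted to numbers or booleans.

```python
import copy


def apply_overrides(config, overrides):
    """Return a copy of `config` with 'a.b.c=value' overrides applied."""
    result = copy.deepcopy(config)  # leave the caller's config untouched
    for item in overrides:
        dotted_key, separator, value = item.partition("=")
        if not separator or not dotted_key:
            raise ValueError(f"expected key=value, got {item!r}")
        *parents, leaf = dotted_key.split(".")
        node = result
        for key in parents:
            # Create intermediate sections that do not exist yet, and
            # replace non-dict values that sit where a section is needed.
            if not isinstance(node.get(key), dict):
                node[key] = {}
            node = node[key]
        node[leaf] = value
    return result


base = {"dataset": {"train": {"root": "/old/path"}}, "model": {}}
updated = apply_overrides(
    base,
    ["dataset.train.root=/path/to/dataset/", "model.checkpoint=/path/to/checkpoint"],
)
print(updated)
```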

Testing uses the same configuration file as training, but calls the tools/eval.py script instead of the training one. Note that evaluation requires a previous training run that completed successfully. In this case, the checkpoints specified in the model and backbone configurations are ignored, and the output checkpoint from training is used to initialize the model.
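
Since training and evaluation differ only in the script being called, the command can be assembled by a small helper. This is a sketch that reuses only the flags shown in the training commands above. It assumes that tools/eval.py accepts the same --config-file and --opts arguments as tools/train.py, which the text implies but does not state. The name launch_command is illustrative.

```python
import shlex


def launch_command(script, config_file, num_gpus, opts=()):
    """Build the argument list for a distributed run of `script`."""
    command = [
        "python3", "-m", "torch.distributed.launch",
        "--use-env", f"--nproc-per-node={num_gpus}",
        script,
        "--config-file", config_file,
    ]
    if opts:
        # Optional dotted key=value overrides, as in the training example.
        command += ["--opts", *opts]
    return command


train = launch_command("tools/train.py", "/path/to/config/file", num_gpus=2)
evaluate = launch_command("tools/eval.py", "/path/to/config/file", num_gpus=2)

# Print shell-ready strings. Pass the lists to subprocess.run to execute.
print(shlex.join(train))
print(shlex.join(evaluate))
```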

Inference on WSIs

Inference on WSIs is straightforward. You just need to create a configuration file that uses __base__ to extend an existing configuration file provided with the model weights (see the Pre-trained models table above), together with the model window parameters (for window detection) and the infer loader configuration. See docs/inference.md.

Then, running is as easy as:

python3 -m torch.distributed.launch --use-env --nproc-per-node=NUM_GPUs tools/infer.py \
                        --config-file /path/to/config/file \
                        --experiment.output_dir /path/model/weights/folder \
                        --experiment.output_name model_weights_name.pth
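
To make the __base__ extension mentioned above more concrete, the sketch below resolves base references by recursively merging dictionaries, with later entries overriding earlier ones. It is only one plausible reading of the mechanism; the repository's actual behavior is documented in config.md and docs/inference.md. Plain dictionaries stand in for YAML files, and every key and config name in the example is hypothetical.

```python
import copy


def deep_merge(base, override):
    """Merge `override` into a copy of `base`, recursing into nested dicts."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def resolve(name, configs):
    """Resolve `__base__` references for the config called `name`.

    `configs` maps names to dicts. `__base__` may be one name or a list
    of names. Cyclic references are not detected in this sketch.
    """
    config = dict(configs[name])  # shallow copy so pop() does not mutate input
    bases = config.pop("__base__", [])
    if isinstance(bases, str):
        bases = [bases]
    resolved = {}
    for base_name in bases:
        resolved = deep_merge(resolved, resolve(base_name, configs))
    return deep_merge(resolved, config)


# Hypothetical configs: one shipped with the weights, one with window
# parameters, and a user config that extends both and sets the infer loader.
example_configs = {
    "pretrained": {"model": {"num_queries": 600}, "loader": {"infer": {"batch_size": 1}}},
    "window": {"model": {"window": {"size": 256}}},
    "my_infer": {"__base__": ["pretrained", "window"], "loader": {"infer": {"batch_size": 8}}},
}
resolved = resolve("my_infer", example_configs)
print(resolved)
```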

Citation

If you find this work helpful in your research, please consider citing us:

@misc{pina2025cellnucleidetectionclassification,
      title={Cell Nuclei Detection and Classification in Whole Slide Images with Transformers}, 
      author={Oscar Pina and Eduard Dorca and Verónica Vilaplana},
      year={2025},
      eprint={2502.06307},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2502.06307}, 
}
@inproceedings{pina2024celldetr,
      title={Cell-{DETR}: Efficient cell detection and classification in {WSI}s with transformers},
      author={Oscar Pina and Eduard Dorca and Verónica Vilaplana},
      booktitle={Submitted to Medical Imaging with Deep Learning},
      year={2024},
      url={https://openreview.net/forum?id=H4KbJlAHuq},
      note={under review}
}

License

This project is licensed under the MIT License.