BoxCell: Leveraging SAM for Cell Segmentation with Box Supervision
November 4, 2025
Official implementation of BoxCell: Leveraging SAM for Cell Segmentation with Box Supervision, published in Nature Scientific Reports.
Abstract
Cell segmentation in histopathological images is vital for the diagnosis and treatment of several diseases. Annotating data is tedious and requires medical expertise, making it difficult to employ supervised learning. Instead, we study a weakly supervised setting, where only bounding box supervision is available, and present the use of Segment Anything (SAM) for this without any finetuning, i.e., directly utilizing the pre-trained model. We propose BoxCell, a cell segmentation framework that utilizes SAM's capability to interpret bounding boxes as prompts, both at train and test times. At train time, gold bounding boxes given to SAM produce (pseudo-)masks, which are used to train a standalone segmenter. At test time, BoxCell generates two segmentation masks: (1) one produced by this standalone segmenter, and (2) one produced by SAM when prompted with bounding boxes from a trained object detector. Recognizing their complementary strengths, we reconcile the two segmentation masks using a novel integer programming formulation with intensity and spatial constraints. We experiment on three publicly available cell segmentation datasets, namely CoNSeP, MoNuSeg, and TNBC, and find that BoxCell significantly outperforms existing box-supervised image segmentation models, obtaining 6-10 point Dice gains.
Key Features
- SAM Integration: Utilizes the powerful Segment Anything Model for generating high-quality segmentation proposals
- Box Supervision: Leverages bounding box annotations for guided segmentation
- ILP Optimization: Employs Integer Linear Programming to select optimal mask combinations
- Multiple Solver Support: Compatible with both commercial (Gurobi) and open-source (CBC, alpha-expansion, OR-Tools) solvers
- Comprehensive Evaluation: Tested on multiple biomedical datasets (MoNuSeg, CoNSeP, TNBC)
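The SAM integration and box supervision listed above can be sketched as follows. This is a minimal illustration, not the code in `sam-ilp_*.py`: the function name `boxes_to_instance_map`, the rule for resolving overlapping masks, and the stand-in `BoxFillPredictor` class are all assumptions made for the example. The only SAM calls it relies on are `set_image` and `predict(box=..., multimask_output=...)` from `segment_anything.SamPredictor`.

```python
import numpy as np


def boxes_to_instance_map(predictor, image_rgb, boxes_xyxy):
    """Prompt a SAM-style predictor with one box at a time and merge the masks.

    predictor  : object exposing set_image(image) and
                 predict(box=..., multimask_output=...),
                 e.g. segment_anything.SamPredictor
    image_rgb  : (H, W, 3) uint8 RGB image
    boxes_xyxy : iterable of (x0, y0, x1, y1) boxes in pixel coordinates
    Returns an (H, W) int32 map: 0 = background, k = mask from the k-th box.
    """
    predictor.set_image(image_rgb)  # the image embedding is computed once
    instance_map = np.zeros(image_rgb.shape[:2], dtype=np.int32)
    for k, box in enumerate(boxes_xyxy, start=1):
        masks, _scores, _ = predictor.predict(
            box=np.asarray(box, dtype=np.float32),
            multimask_output=False,  # one mask per box prompt
        )
        mask = masks[0].astype(bool)
        # Assumption: pixels already claimed by an earlier box are kept.
        instance_map[mask & (instance_map == 0)] = k
    return instance_map


class BoxFillPredictor:
    """Hypothetical stand-in for SamPredictor: it simply fills the box.

    It exists only so the sketch runs without downloading SAM weights.
    """

    def set_image(self, image):
        self.shape = image.shape[:2]

    def predict(self, box=None, multimask_output=True):
        x0, y0, x1, y1 = (int(v) for v in box)
        mask = np.zeros(self.shape, dtype=bool)
        mask[y0:y1, x0:x1] = True
        return mask[None], np.array([1.0]), None


image = np.zeros((8, 8, 3), dtype=np.uint8)
pseudo = boxes_to_instance_map(
    BoxFillPredictor(), image, [(0, 0, 4, 4), (2, 2, 6, 6)]
)
```

With real weights, the stand-in would be replaced by `SamPredictor(sam_model_registry["vit_h"](checkpoint="/path/to/sam/weights.pth"))`; the `vit_h` model type is an assumption and must match the downloaded checkpoint.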
Repository Structure
BoxCell/
├── README.md
├── data-preprocessing.py    # Dataset preprocessing and format conversion
├── training-yolo.py         # YOLO model training for box detection
├── infering-yolo.py         # YOLO inference for generating bounding boxes
├── sam-ilp_opensource.py    # Main BoxCell algorithm with open-source solvers
├── sam-ilp_original.py      # Original implementation with Gurobi
├── eval-masks.py            # Evaluation metrics calculation
├── run.sh                   # Example workflow script
├── run_solver.sh            # Solver comparison script
└── segment_anything/        # SAM model implementation
    ├── modeling/            # SAM architecture components
    ├── utils/               # Utility functions
    └── ...
Quick Start
Prerequisites
- Python 3.8+
- PyTorch 1.9+
- CUDA-compatible GPU
Installation
- Clone the repository:
git clone https://github.com/Aayushktyagi/BoxCell.git
- Install dependencies:
pip install torch torchvision torchaudio
pip install ultralytics # for YOLO
pip install segment-anything-py
pip install scikit-image opencv-python matplotlib
pip install scipy numpy tqdm
pip install monai
Additionally, install the dependencies required by SAM.
- Install optimization solvers:
For open-source solvers:
pip install mip # Python-MIP, includes CBC
pip install ortools # OR-Tools
For Gurobi (requires license):
pip install gurobipy
# Set up Gurobi license as per their documentation
Data and Pretrained Models
Download the datasets, pretrained models, and predictions (ITS and ITD) from our Google Drive.
Datasets
BoxCell is evaluated on three biomedical datasets:
- MoNuSeg: Multi-organ nuclei segmentation dataset
- CoNSeP: Colorectal nuclear segmentation and phenotype dataset
- TNBC: Triple-negative breast cancer dataset
Usage
1. Data Preprocessing (optional; a preprocessed dataset is available on Google Drive)
Convert your dataset to the required format:
python data-preprocessing.py \
--loading_dir /path/to/original/dataset \
--saving_dir /path/to/processed/dataset \
--dataset consep \
--replicate True
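Since BoxCell is trained from bounding boxes only, one step a preprocessing script of this kind typically performs is deriving box labels from instance masks. The sketch below shows that conversion under stated assumptions: the function name `instance_map_to_yolo_lines`, the single class id `0`, and the YOLO text label format are illustrative choices, not a description of what `data-preprocessing.py` actually does.

```python
import numpy as np


def instance_map_to_yolo_lines(instance_map, class_id=0):
    """Convert an (H, W) instance label map (0 = background) to YOLO label lines.

    Each line is "class x_center y_center width height", with all four
    box values normalized to [0, 1] by the image size.
    """
    h, w = instance_map.shape
    lines = []
    for label in np.unique(instance_map):
        if label == 0:
            continue  # skip background
        ys, xs = np.nonzero(instance_map == label)
        x0, x1 = xs.min(), xs.max() + 1  # +1 makes the far edge exclusive
        y0, y1 = ys.min(), ys.max() + 1
        xc, yc = (x0 + x1) / 2 / w, (y0 + y1) / 2 / h
        bw, bh = (x1 - x0) / w, (y1 - y0) / h
        lines.append(f"{class_id} {xc:.6f} {yc:.6f} {bw:.6f} {bh:.6f}")
    return lines


# One nucleus occupying rows 2..5 and columns 4..7 of a 10x10 image.
toy_map = np.zeros((10, 10), dtype=np.int32)
toy_map[2:6, 4:8] = 1
label_lines = instance_map_to_yolo_lines(toy_map)
# label_lines == ["0 0.600000 0.400000 0.400000 0.400000"]
```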
2. Train YOLO for Box Detection
python training-yolo.py \
--yaml_file /path/to/dataset/train.yaml \
--model_type yolov8x \
--num_epochs 300 \
--batch_size 32 \
--imgsz 500
Alternatively, trained YOLO weights are available on Google Drive.
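The `--yaml_file` argument presumably points to a dataset description in the Ultralytics format (`path`, `train`, `val`, `names`). The sketch below builds such a file as text; the helper name `make_dataset_yaml`, the `images/train` and `images/val` sub-directories, and the class name `nucleus` are assumptions, so check them against the `train.yaml` shipped with the preprocessed data.

```python
def make_dataset_yaml(root, class_names, train="images/train", val="images/val"):
    """Return the text of an Ultralytics-style dataset YAML file.

    root        : dataset root directory
    class_names : list of class names; the list index becomes the class id
    train, val  : image directories relative to root (assumed layout)
    """
    lines = [f"path: {root}", f"train: {train}", f"val: {val}", "names:"]
    lines += [f"  {i}: {name}" for i, name in enumerate(class_names)]
    return "\n".join(lines) + "\n"


yaml_text = make_dataset_yaml("/path/to/processed/dataset", ["nucleus"])
# To save it: open("train.yaml", "w").write(yaml_text)
```

A file like this is what the Ultralytics API consumes in a call such as `YOLO("yolov8x.pt").train(data="train.yaml", epochs=300, batch=32, imgsz=500)`, which mirrors the command-line values above.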
3. Generate Bounding Box Predictions
python infering-yolo.py \
--model_weight_path /path/to/trained/yolo/weights.pt \
--image_dir /path/to/test/images
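SAM expects box prompts as pixel coordinates in `[x0, y0, x1, y1]` order, whereas YOLO label files store normalized `[x_center, y_center, width, height]`. If the predicted boxes are saved in the YOLO text format (an assumption; the output format of `infering-yolo.py` is not documented here), a conversion along these lines is needed before prompting SAM. The function name `yolo_to_xyxy` is hypothetical.

```python
import numpy as np


def yolo_to_xyxy(boxes_norm, img_w, img_h):
    """Convert (N, 4) normalized [xc, yc, w, h] boxes to pixel [x0, y0, x1, y1]."""
    b = np.asarray(boxes_norm, dtype=np.float64).reshape(-1, 4)
    xc, yc = b[:, 0] * img_w, b[:, 1] * img_h
    w, h = b[:, 2] * img_w, b[:, 3] * img_h
    xyxy = np.stack([xc - w / 2, yc - h / 2, xc + w / 2, yc + h / 2], axis=1)
    # Clip so that every box stays inside the image.
    xyxy[:, [0, 2]] = xyxy[:, [0, 2]].clip(0, img_w)
    xyxy[:, [1, 3]] = xyxy[:, [1, 3]].clip(0, img_h)
    return xyxy


# A centered box covering 20% of the width and 40% of the height
# of a 500x500 image maps to x0=200, y0=150, x1=300, y1=350.
pixel_boxes = yolo_to_xyxy([[0.5, 0.5, 0.2, 0.4]], img_w=500, img_h=500)
```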
4. Run BoxCell Segmentation
Download the SAM weights (SAM_Weights) before running this step.
With Gurobi (requires license):
python sam-ilp_original.py \
--img_dir_path /path/to/images \
--box_dir_path /path/to/bounding/boxes \
--model_weights /path/to/sam/weights.pth \
--save_path /path/to/output \
--mode sam-ilp \
--gurobi_license_file /path/to/gurobi.lic
With open-source solvers (OR-Tools recommended):
python sam-ilp_opensource.py \
--img_dir_path /path/to/images \
--box_dir_path /path/to/bounding/boxes \
--sam_s_path /path/to/sam/weights.pth \
--save_path /path/to/output \
--solver ortools \
--mu 2 \
--alpha 6 \
--beta 25
--solver can be one of: ortools, alpha_expansion, cbc, sparse
5. Evaluate Results
python eval-masks.py \
--gt_masks_dir /path/to/ground/truth \
--pred_masks_dir /path/to/predictions \
--type all
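The abstract reports results as Dice gains, so Dice is presumably among the metrics computed here. As a reference for what that number means, the sketch below computes the Dice coefficient between two binary masks with NumPy. The function name `dice_score` and the convention of returning 1.0 when both masks are empty are assumptions, and `eval-masks.py` may aggregate differently (for example per image or per dataset).

```python
import numpy as np


def dice_score(pred, gt):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) between two binary masks."""
    pred = np.asarray(pred).astype(bool)
    gt = np.asarray(gt).astype(bool)
    denom = pred.sum() + gt.sum()
    if denom == 0:
        return 1.0  # both masks empty: treated as perfect agreement
    return 2.0 * np.logical_and(pred, gt).sum() / denom


# Two 8-pixel masks that share 4 pixels: Dice = 2 * 4 / (8 + 8) = 0.5
pred_mask = np.zeros((4, 4), dtype=np.uint8)
gt_mask = np.zeros((4, 4), dtype=np.uint8)
pred_mask[0:2, :] = 1
gt_mask[1:3, :] = 1
example_dice = dice_score(pred_mask, gt_mask)
```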
Key Parameters
- --mu: Weight for mask quality term (default: 2)
- --alpha: Weight for overlap penalty (default: 6)
- --beta: Weight for coverage reward (default: 25)
- --lambda_val: Weight for selection regularization (default: 5)
- --solver: Optimization solver choice (cbc, glpk, scip, gurobi)
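To build intuition for how these four weights trade off, the sketch below scores subsets of candidate masks with a toy objective and picks the best subset by brute force. It is purely illustrative and is not the formulation implemented in `sam-ilp_*.py`: the objective, the function name `select_masks`, and the way overlap and coverage are measured are all assumptions. An ILP solver would replace the exhaustive search in practice.

```python
from itertools import combinations

import numpy as np


def select_masks(masks, quality, target, mu=2.0, alpha=6.0, beta=25.0, lambda_val=5.0):
    """Pick the subset of candidate masks that maximizes a toy score.

    score = mu * (sum of quality) - alpha * (overlap fraction)
            + beta * (coverage fraction) - lambda_val * (number selected)
    Fractions are measured relative to the area of `target`.
    """
    area = max(int(target.sum()), 1)
    best_subset, best_score = (), 0.0  # selecting nothing scores 0
    for r in range(1, len(masks) + 1):
        for subset in combinations(range(len(masks)), r):
            counts = np.stack([masks[i] for i in subset]).astype(int).sum(axis=0)
            overlap = (counts > 1).sum() / area  # pixels claimed more than once
            coverage = ((counts > 0) & target).sum() / area
            score = (
                mu * sum(quality[i] for i in subset)
                - alpha * overlap
                + beta * coverage
                - lambda_val * len(subset)
            )
            if score > best_score:
                best_subset, best_score = subset, score
    return best_subset, best_score


# Toy 1x8 image: masks 0 and 1 tile the target, mask 2 overlaps both.
target = np.ones((1, 8), dtype=bool)
candidates = [np.zeros((1, 8), dtype=bool) for _ in range(3)]
candidates[0][0, 0:4] = True
candidates[1][0, 4:8] = True
candidates[2][0, 2:6] = True
chosen, chosen_score = select_masks(candidates, [0.9, 0.8, 0.5], target)
# chosen == (0, 1): full coverage with no overlap beats adding mask 2
```

In this toy setting, raising `alpha` discourages overlapping selections, raising `beta` favors covering more of the target, and raising `lambda_val` favors selecting fewer masks.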
Citation
If you use BoxCell in your research, please cite our paper:
@article{tyagi2023guided,
title={Guided Prompting in SAM for Weakly Supervised Cell Segmentation in Histopathological Images},
author={Tyagi, Aayush Kumar and Mishra, Vaibhav and others},
journal={arXiv preprint arXiv:2311.17960},
year={2023}
}
Acknowledgments
- Segment Anything Model (SAM) by Meta AI
- Ultralytics YOLO for object detection