MedSAM3: Delving into Segment Anything with Medical Concepts

February 6, 2026

Anglin Liu1,*, Rundong Xue2,*, Xu R. Cao3,†, Yifan Shen3, Yi Lu1, Xiang Li3, Qianqian Chen4, Jintai Chen1,5,†

1 The Hong Kong University of Science and Technology (Guangzhou)
2 Xi’an Jiaotong University
3 University of Illinois Urbana-Champaign
4 Southeast University
5 The Hong Kong University of Science and Technology

* Equal Contribution    † Corresponding Author

arXiv   Hugging Face

**We will continuously update the documentation and examples to optimize this repository.**

📖 Introduction

MedSAM3-v1 is a purely text-guided (concept-guided) medical image segmentation model. Unlike traditional models that rely on bounding-box or point prompts, MedSAM3 takes a medical concept as a text prompt (for example, "skin lesion") and segments the matching targets across a wide range of modalities.

🌟 Key Features & Dataset Statistics

We constructed a large-scale, uniformly sampled dataset to ensure diversity and robustness. The model covers diverse medical modalities:

  • Radiology: CT, MRI, PET, X-ray
  • Optical/Microscopic: Microscopy, Histopathology, Dermoscopy, OCT, Cell
  • Video/Procedure: Ultrasound, Endoscopy, Surgery video

Dataset Scale:

  • 658,094 Images
  • 2,863,974 Instance Annotations
  • 330 Unique Medical Text IDs (Concepts)

📦 Model & Weights

We adopted a parameter-efficient fine-tuning strategy based on SAM3 using LoRA (Low-Rank Adaptation).
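As a rough illustration of why LoRA is parameter-efficient, the sketch below counts trainable parameters for a single weight matrix and applies the low-rank update `W + (alpha / r) * B @ A` with plain Python lists. The function names, matrix sizes, rank, and `alpha` are illustrative assumptions, not MedSAM3's actual configuration or code.

```python
def lora_param_counts(d_out, d_in, rank):
    """Return (full, lora) trainable parameter counts for one d_out x d_in matrix."""
    full = d_out * d_in
    lora = rank * (d_in + d_out)  # B is d_out x rank, A is rank x d_in
    return full, lora


def lora_forward(W, A, B, x, alpha, rank):
    """Compute (W + (alpha / rank) * B @ A) @ x without forming the full update."""
    def matvec(M, v):
        return [sum(m * vi for m, vi in zip(row, v)) for row in M]

    base = matvec(W, x)            # frozen base-model path
    low = matvec(B, matvec(A, x))  # low-rank adapter path
    scale = alpha / rank
    return [b + scale * l for b, l in zip(base, low)]


if __name__ == "__main__":
    # Illustrative sizes only (hypothetical, not taken from SAM3).
    full, lora = lora_param_counts(1024, 1024, 8)
    print(f"full fine-tuning: {full} params, LoRA: {lora} params")
```

Only `A` and `B` are trained while `W` stays frozen, which is why the released artifact is a set of LoRA weights rather than a full model checkpoint.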

We are releasing our first version (v1) of the LoRA weights.

| Model Version | Base Model | Method | Link |
| --- | --- | --- | --- |
| MedSAM3-v1 | SAM3 | LoRA Fine-tuning | Download LoRA Weights |

🔗 References

This project is built upon the following excellent open-source projects. Please refer to them for the base environment setup. If you encounter code-related issues, please also refer to the specific instructions and documentation provided by these works:

🚀 Inference

Follow these steps to run inference on your medical images.

1. Setup

# Clone repository
git clone https://github.com/Joey-S-Liu/MedSAM3.git
cd MedSAM3

# Install dependencies
pip install -e .

# Login to Hugging Face
hf auth login
# Paste your token when prompted

2. Inference Code

python3 infer_sam.py \
  --config configs/full_lora_config.yaml \
  --image path/to/image.jpg \
  --prompt "skin lesion" \
  --threshold 0.5 \
  --nms-iou 0.5 \
  --output skin_lesion.png
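
Because suitable `--threshold` and `--nms-iou` values vary by task, it can help to script a small sweep over the command above. The following is a minimal sketch and not part of the repository: `build_sweep_commands` and the output file naming are hypothetical, while the flags are the ones shown in the command above. It only builds and prints the commands.

```python
import itertools
import shlex


def build_sweep_commands(image, prompt, thresholds, nms_ious,
                         config="configs/full_lora_config.yaml"):
    """Build one infer_sam.py argument list per (threshold, nms-iou) pair."""
    commands = []
    for thr, iou in itertools.product(thresholds, nms_ious):
        # Hypothetical naming scheme so each run writes its own visualization.
        output = f"sweep_thr{thr}_iou{iou}.png"
        commands.append([
            "python3", "infer_sam.py",
            "--config", config,
            "--image", image,
            "--prompt", prompt,
            "--threshold", str(thr),
            "--nms-iou", str(iou),
            "--output", output,
        ])
    return commands


if __name__ == "__main__":
    for cmd in build_sweep_commands("path/to/image.jpg", "skin lesion",
                                    thresholds=[0.5, 0.8], nms_ious=[0.5]):
        print(shlex.join(cmd))  # shell-quoted, ready to copy and run
```

To execute the sweep instead of printing it, each argument list can be passed to `subprocess.run(cmd, check=True)` from the repository root.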

3. Training Code

python3 train_sam3_lora_native.py --config configs/full_lora_config.yaml
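
The training data is expected in COCO format (see Data Format under Notes & Precautions). As a rough illustration, the sketch below builds a minimal COCO-style dictionary and runs a basic consistency check on it. The file name, image size, and the helpers `make_coco_example` and `check_coco` are hypothetical, and treating the category name as the text concept is an assumption; the exact fields the training script requires are not specified here.

```python
def make_coco_example():
    """Return a minimal COCO-style annotation dictionary with one instance."""
    return {
        "images": [
            {"id": 1, "file_name": "lesion_0001.jpg", "width": 512, "height": 512},
        ],
        # Assumption: the category name doubles as the text concept.
        "categories": [{"id": 1, "name": "skin lesion"}],
        "annotations": [
            {
                "id": 1,
                "image_id": 1,
                "category_id": 1,
                "bbox": [100.0, 120.0, 80.0, 60.0],  # [x, y, width, height]
                # One polygon as a flat list of x, y vertex coordinates.
                "segmentation": [[100.0, 120.0, 180.0, 120.0,
                                  180.0, 180.0, 100.0, 180.0]],
                "area": 4800.0,
                "iscrowd": 0,
            },
        ],
    }


def check_coco(coco):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    image_ids = {img["id"] for img in coco.get("images", [])}
    category_ids = {cat["id"] for cat in coco.get("categories", [])}
    for ann in coco.get("annotations", []):
        if ann["image_id"] not in image_ids:
            problems.append(f"annotation {ann['id']}: unknown image_id")
        if ann["category_id"] not in category_ids:
            problems.append(f"annotation {ann['id']}: unknown category_id")
        if len(ann.get("bbox", [])) != 4:
            problems.append(f"annotation {ann['id']}: bbox must have 4 values")
    return problems
```

A dictionary like this can be written to disk with `json.dump` before pointing the training configuration at it.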

⚠️ Notes & Precautions

  1. Hyperparameter Tuning: Adjust the threshold and nms-iou parameters to suit the task. Different modalities or segmentation targets may require different sensitivity settings (e.g., some tasks achieve the best results with threshold=0.8, while others work best with threshold=0.5). We recommend using the visualization outputs from infer_sam.py to choose the settings for your task.
  2. Configuration: Set the output_dir field in configs/full_lora_config.yaml to the path of your LoRA weights.
  3. Data Format: The training data follows the COCO format, which is consistent with the standard SAM3 implementation.
  4. Supported Tasks (v1): The specific list of task categories supported by the current v1 version will be released within a few days. We encourage users to experiment with specific tasks and provide feedback.
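
To make the effect of the two parameters in note 1 more concrete, here is a minimal sketch of score thresholding followed by greedy mask NMS. It is a generic illustration, assuming predictions arrive as (score, mask) pairs with masks stored as sets of pixel coordinates; it is not the implementation used in infer_sam.py, and all function names are hypothetical.

```python
def rect(r0, r1, c0, c1):
    """Toy mask: the set of (row, col) pixels in rows r0..r1-1 and cols c0..c1-1."""
    return {(r, c) for r in range(r0, r1) for c in range(c0, c1)}


def mask_iou(a, b):
    """Intersection over union of two masks given as sets of pixels."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def filter_and_nms(detections, threshold, nms_iou):
    """Drop detections scoring below threshold, then suppress overlapping masks.

    A candidate is kept only if its IoU with every already-kept,
    higher-scoring mask is at most nms_iou.
    """
    candidates = sorted(
        (d for d in detections if d[0] >= threshold),
        key=lambda d: d[0],
        reverse=True,
    )
    kept = []
    for score, mask in candidates:
        if all(mask_iou(mask, kept_mask) <= nms_iou for _, kept_mask in kept):
            kept.append((score, mask))
    return kept


if __name__ == "__main__":
    detections = [
        (0.90, rect(0, 4, 0, 4)),      # main detection
        (0.85, rect(0, 4, 1, 5)),      # near-duplicate, IoU 0.6 with the first
        (0.60, rect(10, 12, 10, 12)),  # separate, lower-confidence detection
    ]
    for thr, iou in [(0.5, 0.5), (0.8, 0.7)]:
        kept = filter_and_nms(detections, thr, iou)
        print(f"threshold={thr}, nms_iou={iou}: kept {len(kept)} masks")
```

In this toy setup, a higher threshold removes the low-confidence mask, while a higher nms-iou lets the near-duplicate survive, which is the kind of trade-off the visualization outputs help to judge.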

📧 Contact

If you have any questions regarding this project, please feel free to contact the corresponding authors:

🖊️ Citation

If you find this project useful for your research, please consider citing:

@misc{liu2025medsam3delvingsegmentmedical,
      title={MedSAM3: Delving into Segment Anything with Medical Concepts}, 
      author={Anglin Liu and Rundong Xue and Xu R. Cao and Yifan Shen and Yi Lu and Xiang Li and Qianqian Chen and Jintai Chen},
      year={2025},
      eprint={2511.19046},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2511.19046}, 
}