Segment Anything, right on your iPhone.
Install · Quick Start · Demo App · Download Models
SAMKit brings Meta's Segment Anything Model to iOS as a native Swift package. Tap, draw, or describe any object to instantly segment it — all inference runs on-device with Core ML, no server required.
Features
- Point & Box — Tap a point or drag a bounding box to segment any object
- Text Prompt — Type "dog" or "red cup" to find and segment objects, powered by YOLO-World + CLIP
- Subject Lift — Long-press to lift the segmented object from the scene, then copy, save, or share as a transparent PNG
- Two Backbones — MobileSAM (fast, 23 MB) and SAM2 Tiny (accurate, 76 MB)
- Drop-in UI — Pre-built SwiftUI views for shipping a segmentation feature in minutes
- Fully On-Device — Neural Engine / GPU acceleration, FP16, zero network calls
Requirements
- iOS 15.0+
- Xcode 14.0+
- Swift 5.7+
Installation
1. Add the Swift Package
```swift
dependencies: [
    .package(url: "https://github.com/john-rocky/SamKit.git", from: "1.0.0")
]
```
| Product | What it does |
|---|---|
| SAMKit | Core segmentation engine (point / box) |
| SAMKitGrounding | Open-vocabulary text detection (YOLO-World + CLIP) |
| SAMKitUI | Ready-made SwiftUI views |
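If your app target is itself declared in a Package.swift, the products above map onto target dependencies roughly like this. A minimal sketch, assuming the package identity resolves to "SamKit" (the repository name); pull in only the products you actually use:

```swift
.target(
    name: "MyApp", // your app or feature target
    dependencies: [
        .product(name: "SAMKit", package: "SamKit"),
        .product(name: "SAMKitGrounding", package: "SamKit"), // only for text prompts
        .product(name: "SAMKitUI", package: "SamKit")         // only for the prebuilt views
    ]
)
```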
2. Download Models
Grab the .mlpackage files from Releases and drag them into your Xcode project.
MobileSAM — 23 MB (required)
| File | Size |
|---|---|
| mobile_sam_encoder.mlpackage | 13 MB |
| mobile_sam_decoder.mlpackage | 9.8 MB |
| mobile_sam_prompt_encoder_weights.json | 40 KB |
SAM2 Tiny — 76 MB (optional)
| File | Size |
|---|---|
| SAM2TinyImageEncoderFLOAT16.mlpackage | 64 MB |
| SAM2TinyPromptEncoderFLOAT16.mlpackage | 2.0 MB |
| SAM2TinyMaskDecoderFLOAT16.mlpackage | 9.8 MB |
Grounding (YOLO-World + CLIP) — 148 MB (optional)
| File | Size |
|---|---|
| clip_text_encoder.mlpackage | 121 MB |
| yoloworld_detector.mlpackage | 25 MB |
| clip_vocab.json | 1.6 MB |
| cv4_params.json | 4 KB |
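Xcode compiles each .mlpackage into an .mlmodelc resource at build time. If the session fails to load, a quick bundle check like the sketch below (hypothetical, not part of SAMKit's API) can confirm the models really made it into your app target:

```swift
import Foundation

// Hypothetical sanity check: confirm the compiled Core ML models are in the bundle.
// Resource names assume Xcode keeps the original base file names.
let requiredModels = ["mobile_sam_encoder", "mobile_sam_decoder"]
for name in requiredModels where Bundle.main.url(forResource: name, withExtension: "mlmodelc") == nil {
    print("Missing \(name).mlmodelc; check the file's target membership in Xcode")
}
```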
Quick Start
Point & Box Segmentation
```swift
import SAMKit

let session = try SamSession(
    model: .bundled(.mobileSam),
    config: .bestAvailable
)

try session.setImage(cgImage)

let result = try session.predict(
    points: [SamPoint(x: 100, y: 200, label: .positive)]
)

let mask = result.masks.first! // .cgImage, .alpha, .score
```
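A prediction can return several candidate masks. A common follow-up is to keep only the highest-scoring one for display, sketched here assuming each mask exposes the score and cgImage properties noted in the comment above:

```swift
import UIKit

// Keep the highest-scoring candidate mask and wrap it for display.
// Assumes each mask exposes `score` and a non-optional `cgImage`, as noted above.
if let best = result.masks.max(by: { $0.score < $1.score }) {
    let overlay = UIImage(cgImage: best.cgImage)
    // e.g. layer `overlay` over the source photo in a UIImageView
}
```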
SAM2 Tiny
```swift
import SAMKit

let session = try Sam2Session(
    modelName: "SAM2Tiny",
    config: .bestAvailable
)

try session.setImage(cgImage)

let result = try session.predict(
    points: [SamPoint(x: 100, y: 200, label: .positive)]
)
```
Text-Prompted Segmentation
```swift
import SAMKit
import SAMKitGrounding

let session = try TextSegmentationSession(
    groundingModel: .bundled(),
    samModel: .bundled(.mobileSam)
)

try session.setImage(cgImage)

let result = try session.segment(query: "dog, cat")
// result.masks      — segmentation masks
// result.detections — bounding boxes + labels
```
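If you need to relate each detection to its mask, for example to label an overlay, a loop like the sketch below works under two assumptions the README does not confirm: that detections and masks are index-aligned, and that a detection exposes a label property:

```swift
// Sketch only: assumes detections and masks are index-aligned and that a
// detection exposes a `label` property (neither is confirmed above).
for (detection, mask) in zip(result.detections, result.masks) {
    print("\(detection.label): mask score \(mask.score)")
}
```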
Subject Lifting
```swift
import SAMKit

// After segmentation, extract the object with transparency
let extracted = SamMask.extractObject(from: cgImage, masks: result.masks)
// Returns a CGImage with transparent background — ready for copy/save/share
```
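To keep the transparency when copying, saving, or sharing, encode the lifted subject as PNG with plain UIKit. The sketch assumes extractObject returns a non-optional CGImage, as the comment above suggests:

```swift
import UIKit

// Encode the lifted subject as PNG so the alpha channel survives copy/save/share.
let lifted = UIImage(cgImage: extracted)
if let pngData = lifted.pngData() {
    let url = FileManager.default.temporaryDirectory.appendingPathComponent("subject.png")
    try? pngData.write(to: url)
    // hand `url` (or `pngData`) to UIActivityViewController, UIPasteboard, etc.
}
```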
Architecture
```
SAMKit/
├── runtime/apple/
│   ├── SAMKit/            # Core inference engine
│   ├── SAMKitGrounding/   # YOLO-World + CLIP text detection
│   └── SAMKitUI/          # SwiftUI components
├── models/converters/     # PyTorch -> Core ML conversion scripts
├── samples/ios-sample/    # Full demo app
└── CLAUDE.md
```
Sample App
```bash
git clone https://github.com/john-rocky/SamKit.git
open samples/ios-sample/SAMKitDemo.xcodeproj
```
Download the models from Releases, add them to the project, and run on a physical device.
Model Conversion
Convert from PyTorch checkpoints yourself:
```bash
cd models/converters
pip install -r requirements.txt

# MobileSAM
python convert_to_coreml.py --model mobile_sam

# SAM2 Tiny
python convert_sam2_to_coreml.py

# YOLO-World (S/M/L/X)
python convert_yoloworld_to_coreml.py --size s
```
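Before dragging a converted package into the iOS project, you can optionally inspect it from a small Swift command-line tool on your Mac. This check is not part of the repo's scripts; the path below is illustrative:

```swift
import CoreML
import Foundation

// Optional check (not part of the repo's scripts): compile a converted .mlpackage
// and print its input names to confirm the conversion produced what you expect.
do {
    let packageURL = URL(fileURLWithPath: "mobile_sam_encoder.mlpackage")
    let compiledURL = try MLModel.compileModel(at: packageURL)
    let model = try MLModel(contentsOf: compiledURL)
    print(model.modelDescription.inputDescriptionsByName.keys.sorted())
} catch {
    print("Model check failed: \(error)")
}
```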
License
Apache 2.0 — see LICENSE for details.
Acknowledgments
- Segment Anything & SAM 2 — Meta AI
- MobileSAM — Chaoning Zhang et al.
- YOLO-World — Tencent AI Lab
- OpenAI CLIP