
SAMKit

Segment Anything, right on your iPhone.

Install · Quick Start · Demo App · Download Models

SAMKit Demo


SAMKit brings Meta's Segment Anything Model to iOS as a native Swift package. Tap, draw, or describe any object to instantly segment it — all inference runs on-device with Core ML, no server required.

Features

  • Point & Box — Tap a point or drag a bounding box to segment any object
  • Text Prompt — Type "dog" or "red cup" to find and segment objects, powered by YOLO-World + CLIP
  • Subject Lift — Long-press to lift the segmented object from the scene, then copy, save, or share as a transparent PNG
  • Two Backbones — MobileSAM (fast, 23 MB) and SAM2 Tiny (accurate, 76 MB)
  • Drop-in UI — Pre-built SwiftUI views for shipping a segmentation feature in minutes
  • Fully On-Device — Neural Engine / GPU acceleration, FP16, zero network calls

Requirements

  • iOS 15.0+
  • Xcode 14.0+
  • Swift 5.7+

Installation

1. Add the Swift Package

dependencies: [
    .package(url: "https://github.com/john-rocky/SamKit.git", from: "1.0.0")
]

| Product | What it does |
| --- | --- |
| SAMKit | Core segmentation engine (point / box) |
| SAMKitGrounding | Open-vocabulary text detection (YOLO-World + CLIP) |
| SAMKitUI | Ready-made SwiftUI views |
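
Each product maps to a library you can link from your own target. A minimal sketch of a consuming Package.swift target (in an Xcode app you pick the same products in the package dependency sheet); the package identity "SamKit" is an assumption based on the repository URL above:

.target(
    name: "MyApp",
    dependencies: [
        // Core point / box segmentation
        .product(name: "SAMKit", package: "SamKit"),
        // Optional: text-prompted detection and ready-made SwiftUI views
        .product(name: "SAMKitGrounding", package: "SamKit"),
        .product(name: "SAMKitUI", package: "SamKit"),
    ]
)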

2. Download Models

Grab the .mlpackage files from Releases and drag them into your Xcode project.

MobileSAM — 23 MB (required)

| File | Size |
| --- | --- |
| mobile_sam_encoder.mlpackage | 13 MB |
| mobile_sam_decoder.mlpackage | 9.8 MB |
| mobile_sam_prompt_encoder_weights.json | 40 KB |

SAM2 Tiny — 76 MB (optional)

| File | Size |
| --- | --- |
| SAM2TinyImageEncoderFLOAT16.mlpackage | 64 MB |
| SAM2TinyPromptEncoderFLOAT16.mlpackage | 2.0 MB |
| SAM2TinyMaskDecoderFLOAT16.mlpackage | 9.8 MB |

Grounding (YOLO-World + CLIP) — 148 MB (optional)

| File | Size |
| --- | --- |
| clip_text_encoder.mlpackage | 121 MB |
| yoloworld_detector.mlpackage | 25 MB |
| clip_vocab.json | 1.6 MB |
| cv4_params.json | 4 KB |
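
Once the models are in the app target, you can sanity-check at launch that they actually shipped in the bundle. A minimal sketch, assuming Xcode compiles each .mlpackage into a .mlmodelc resource with the same base name (file names taken from the MobileSAM table above):

import Foundation

// Sketch: confirm the MobileSAM resources are present in the main bundle.
// Assumes each .mlpackage is compiled to a .mlmodelc with the same base name.
func mobileSamResourcesArePresent() -> Bool {
    let required: [(name: String, ext: String)] = [
        ("mobile_sam_encoder", "mlmodelc"),
        ("mobile_sam_decoder", "mlmodelc"),
        ("mobile_sam_prompt_encoder_weights", "json"),
    ]
    return required.allSatisfy { resource in
        Bundle.main.url(forResource: resource.name, withExtension: resource.ext) != nil
    }
}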

Quick Start

Point & Box Segmentation

import SAMKit

let session = try SamSession(
    model: .bundled(.mobileSam),
    config: .bestAvailable
)

try session.setImage(cgImage)

let result = try session.predict(
    points: [SamPoint(x: 100, y: 200, label: .positive)]
)

let mask = result.masks.first!   // .cgImage, .alpha, .score
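
When the decoder returns more than one candidate, you can keep the highest-scoring mask rather than the first. A small sketch that uses only the mask properties noted in the comment above (score and cgImage); the UIImage wrapping is plain UIKit:

import UIKit

// Sketch: keep the most confident mask and wrap it for display.
if let bestMask = result.masks.max(by: { $0.score < $1.score }) {
    print("Best mask score: \(bestMask.score)")
    let maskImage = UIImage(cgImage: bestMask.cgImage)
    _ = maskImage // e.g. show it in an Image / UIImageView overlay
}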

SAM2 Tiny

import SAMKit

let session = try Sam2Session(
    modelName: "SAM2Tiny",
    config: .bestAvailable
)

try session.setImage(cgImage)
let result = try session.predict(
    points: [SamPoint(x: 100, y: 200, label: .positive)]
)

Text-Prompted Segmentation

import SAMKit
import SAMKitGrounding

let session = try TextSegmentationSession(
    groundingModel: .bundled(),
    samModel: .bundled(.mobileSam)
)

try session.setImage(cgImage)
let result = try session.segment(query: "dog, cat")
// result.masks      — segmentation masks
// result.detections — bounding boxes + labels
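
The detections and masks are returned in parallel, so each mask can be paired with the box and label that produced it. A minimal sketch, assuming the two arrays are index-aligned and that each detection exposes a label and a bounding box (those property names are illustrative):

// Sketch: pair each mask with its detection.
// `label` and `boundingBox` are illustrative property names.
for (detection, mask) in zip(result.detections, result.masks) {
    print("\(detection.label) at \(detection.boundingBox), mask score \(mask.score)")
}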

Subject Lifting

import SAMKit

// After segmentation, extract the object with transparency
let extracted = SamMask.extractObject(from: cgImage, masks: result.masks)
// Returns a CGImage with transparent background — ready for copy/save/share
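
Since the lifted subject keeps its alpha channel, exporting it as a transparent PNG only takes standard UIKit and Foundation calls. A minimal sketch:

import UIKit

// Sketch: write the lifted subject to a transparent PNG in the temporary directory.
func savePNG(_ extracted: CGImage) throws -> URL {
    guard let data = UIImage(cgImage: extracted).pngData() else {
        throw CocoaError(.fileWriteUnknown) // PNG encoding failed
    }
    let url = FileManager.default.temporaryDirectory.appendingPathComponent("subject.png")
    try data.write(to: url)
    return url
}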

Architecture

SAMKit/
├── runtime/apple/
│   ├── SAMKit/            # Core inference engine
│   ├── SAMKitGrounding/   # YOLO-World + CLIP text detection
│   └── SAMKitUI/          # SwiftUI components
├── models/converters/     # PyTorch -> Core ML conversion scripts
├── samples/ios-sample/    # Full demo app
└── CLAUDE.md

Sample App

git clone https://github.com/john-rocky/SamKit.git
open samples/ios-sample/SAMKitDemo.xcodeproj

Download the models from Releases, add them to the project, and run on a physical device.

Model Conversion

Convert from PyTorch checkpoints yourself:

cd models/converters
pip install -r requirements.txt

# MobileSAM
python convert_to_coreml.py --model mobile_sam

# SAM2 Tiny
python convert_sam2_to_coreml.py

# YOLO-World (S/M/L/X)
python convert_yoloworld_to_coreml.py --size s
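
If you side-load a freshly converted .mlpackage instead of bundling it in Xcode, Core ML can compile it on the device at runtime. A minimal Swift sketch using standard Core ML APIs (the package URL is whatever location you downloaded or copied the file to):

import CoreML

// Sketch: compile a converted .mlpackage on device and load it.
func loadConvertedModel(at packageURL: URL) throws -> MLModel {
    let compiledURL = try MLModel.compileModel(at: packageURL) // produces a .mlmodelc
    let config = MLModelConfiguration()
    config.computeUnits = .all // Neural Engine / GPU / CPU as available
    return try MLModel(contentsOf: compiledURL, configuration: config)
}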

License

Apache 2.0 — see LICENSE for details.

Acknowledgments