MCPSafari: Native Safari MCP Server for AI Agents

June 6, 2026

Give Claude, Cursor, or any MCP-compatible AI full native control of Safari on macOS. Navigate tabs, click/type/fill forms (even React), read HTML/accessibility trees, execute JS, capture screenshots, and inspect console and network activity, all through 23 tools. Zero Chrome overhead, Apple Silicon optimized, token-authenticated, and built with the official MCP swift-sdk and a Manifest V3 Safari Web Extension.

Why MCPSafari?

  • Smarter element targeting (UID + CSS + text + coords + interactive ranking)
  • Works with complex sites, including React-controlled forms
  • Local & private (runs on your Mac)
  • Perfect drop-in for Mac-first agent workflows

macOS 14+ · Safari 17+ · Xcode 26+

Built with the official swift-sdk and a Manifest V3 Safari Web Extension.

Why Safari over Chrome?

  • 40–60% less CPU/heat on Apple Silicon
  • Keeps your existing Safari logins/cookies
  • Native accessibility tree (better than Playwright for complex UIs)

How It Works

MCP Client (Claude, etc.)
        │ stdio
┌───────▼──────────────┐
│  Swift MCP Server    │
│  (MCPSafari binary)  │
└───────┬──────────────┘
        │ WebSocket (localhost:8089)
┌───────▼──────────────┐
│  Safari Extension    │
│  (background.js)     │
└───────┬──────────────┘
        │ content scripts
┌───────▼──────────────┐
│  Safari Browser      │
│  (macOS 14.0+)       │
└──────────────────────┘

The MCP server communicates with clients over stdio and bridges tool calls to the Safari extension over a local WebSocket. The extension executes actions via browser APIs and content scripts injected into pages.

Requirements

  • macOS 14.0 (Sonoma) or later
  • Safari 17+
  • Swift 6.3+ (for building from source)
  • Xcode 26+ (for building the Safari extension)

Installation

The Homebrew cask installs the MCP server binary and the Safari extension app to /Applications in one step, and automatically cleans up any previous installation.

brew install --cask epistates/tap/mcp-safari

Upgrading:

brew upgrade --cask epistates/tap/mcp-safari

After install, enable the extension in Safari > Settings > Extensions > MCPSafari Extension.

From Release

If you don't use Homebrew, download both the CLI binary and the extension app from GitHub Releases:

| Asset | Description |
| --- | --- |
| MCPSafari-Server-arm64-apple-darwin | MCP server binary for Apple Silicon (M1, M2, M3, M4) |
| MCPSafari-Server-x86_64-apple-darwin | MCP server binary for Intel Macs |
| MCPSafari-Server-universal-apple-darwin | MCP server binary (universal, runs on any Mac) |
| MCPSafari-Extension-arm64.tar.gz | Safari extension app for Apple Silicon (M1, M2, M3, M4) |
| MCPSafari-Extension-x86_64.tar.gz | Safari extension app for Intel Macs |

# Apple Silicon (M1/M2/M3/M4) — use x86_64 for Intel Macs
curl -L -o /usr/local/bin/mcp-safari https://github.com/Epistates/MCPSafari/releases/latest/download/MCPSafari-Server-arm64-apple-darwin
chmod +x /usr/local/bin/mcp-safari

# Safari extension (must be in /Applications for macOS 26+)
curl -L https://github.com/Epistates/MCPSafari/releases/latest/download/MCPSafari-Extension-arm64.tar.gz | tar xzf -
mv MCPSafari.app /Applications/
open /Applications/MCPSafari.app

Then enable the extension in Safari > Settings > Extensions > MCPSafari Extension.

From Source

git clone https://github.com/Epistates/MCPSafari.git
cd MCPSafari

# Build the MCP server
cd MCPServer
swift build -c release
# Binary is at .build/release/MCPSafari

# Build and open the Safari extension
cd ../MCPSafari
xcodebuild -project MCPSafari.xcodeproj -scheme MCPSafari build
open ~/Library/Developer/Xcode/DerivedData/MCPSafari-*/Build/Products/Debug/MCPSafari.app

Then enable the extension in Safari > Settings > Extensions > MCPSafari Extension.

Configuration

Codex CLI

Register the server with Codex CLI:

codex mcp add mcp-safari -- mcp-safari

If mcp-safari is not in your $PATH, use the full path to the server binary:

codex mcp add mcp-safari -- /usr/local/bin/mcp-safari

Verify the registration:

codex mcp list
codex mcp get mcp-safari

Start a new Codex CLI session after registering the server. MCPSafari uses the MCP stdio transport, so use -- before the command; --url is only for streamable HTTP MCP servers.

Claude Code

Register the server with the Claude Code CLI (user scope, available in every project):

claude mcp add --scope user mcp-safari mcp-safari

Or, to scope it to a single repo, create .mcp.json at the project root:

{
  "mcpServers": {
    "mcp-safari": {
      "command": "mcp-safari"
    }
  }
}

Verify with claude mcp list — you should see mcp-safari — ✓ Connected.

Note: Claude Code's CLI does not read mcpServers from ~/.claude/settings.json (that is the Claude Desktop format). If you paste the JSON snippet above into settings.json, it is silently ignored (no error, no registration), and the Safari extension will appear stuck at "disconnected" because the server is never spawned. Use claude mcp add or .mcp.json as shown above.

Claude Desktop

Add to ~/Library/Application Support/Claude/claude_desktop_config.json:

{
  "mcpServers": {
    "mcp-safari": {
      "command": "mcp-safari"
    }
  }
}

Cursor / Windsurf / Other MCP Clients

Any client that supports the MCP stdio transport can connect. Point it at mcp-safari (or the full path if not in $PATH).

Multiple Claude Instances

Multiple MCP clients work automatically. The server auto-finds a free port if the default (8089) is in use, and the extension auto-discovers all servers in the 8089-8098 range. No configuration needed — just start multiple clients and they each get their own connection.

For ports outside the default range, add them manually in the extension popup, or set the server port explicitly with --port:

{
  "mcpServers": {
    "mcp-safari": {
      "command": "mcp-safari",
      "args": ["--port", "9090"]
    }
  }
}
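
The discovery behaviour described in this section can be pictured as a scan over the default port range plus any manually added ports. The following is a minimal sketch, not the extension's actual code: candidatePorts, DEFAULT_PORT, and RANGE_SIZE are hypothetical names, and only the 8089 default and the 8089-8098 range come from this README.

```javascript
// Hypothetical sketch: which ports would the extension try when looking for servers?
// Only the default port (8089) and the 8089-8098 range come from the README;
// all names here are illustrative assumptions.
const DEFAULT_PORT = 8089;
const RANGE_SIZE = 10; // 8089..8098 inclusive

function candidatePorts(manualPorts = []) {
  const ports = [];
  for (let i = 0; i < RANGE_SIZE; i++) {
    ports.push(DEFAULT_PORT + i);
  }
  // Manually added ports (for example from the popup) are appended, skipping
  // duplicates and anything that is not a valid TCP port number.
  for (const p of manualPorts) {
    const valid = Number.isInteger(p) && p >= 1 && p <= 65535;
    if (valid && !ports.includes(p)) ports.push(p);
  }
  return ports;
}

console.log(candidatePorts([9090]).join(","));
```

A server started with --port 9090 would fall outside the scanned range in this sketch, which is why it has to be added by hand.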

CLI Options

| Flag | Description |
| --- | --- |
| --port <n> / -p <n> | WebSocket port (default: 8089) |
| --verbose | Debug-level logging to stderr |

Tools (23)

Tab Management

| Tool | Description |
| --- | --- |
| tabs_context | List all open tabs with IDs, URLs, and titles |
| tabs_create | Open a new tab, optionally with a URL |
| close_tab | Close a tab by ID |
| select_tab | Pin a tab as the default context for future calls |

Navigation

| Tool | Description |
| --- | --- |
| navigate | Go to a URL, or use back / forward / reload actions |

Page Reading

| Tool | Description |
| --- | --- |
| read_page | Get page content as text, html, or snapshot |
| snapshot | Accessibility tree with element UIDs for interaction |
| find | Find elements by CSS selector, text, or ARIA role |

Interaction

| Tool | Description |
| --- | --- |
| click | Click by UID, CSS selector, text, or coordinates |
| type_text | Type into an element with optional clearFirst and submitKey |
| form_input | Batch fill form fields (CSS selector → value map) |
| select_option | Select a dropdown option by value or label |
| scroll | Scroll page or element in any direction |
| press_key | Press key combinations (e.g., Enter, Meta+a, Control+c) |
| hover | Hover to trigger tooltips, menus, or hover states |
| drag | Drag and drop between elements |

Note: Interaction tools drive elements via synthetic DOM events (and React-compatible value setting for text), which works across the vast majority of sites. Because the events are synthetic, press_key modifier combos (e.g. Meta+a, Control+c) and drag reach page-level JS handlers but do not trigger native browser actions — clipboard copy/paste, select-all, or HTML5 native drag-and-drop. Use type_text/form_input for text entry and javascript_tool when a true native action is required.

Dialogs

| Tool | Description |
| --- | --- |
| handle_dialog | Accept or dismiss alerts, confirms, and prompts |

Screenshots

| Tool | Description |
| --- | --- |
| screenshot | Capture the visible tab area as a PNG image |

JavaScript

| Tool | Description |
| --- | --- |
| javascript_tool | Execute arbitrary JS in the page context and return expression results |

Debugging

| Tool | Description |
| --- | --- |
| read_console | Read console messages with level and regex filtering |
| read_network | Read captured XHR/fetch requests with type filtering |

Note: Console and network capture run in the page's main world (required to patch console, fetch, and history). A hostile page can therefore observe or forge this telemetry, so treat read_console / read_network output from untrusted pages as page-controlled data rather than ground truth.

Window

| Tool | Description |
| --- | --- |
| resize_window | Resize the browser window to specific dimensions |

Utility

| Tool | Description |
| --- | --- |
| wait | Wait for a duration, CSS selector, or text to appear |

Usage

Basic Workflow

  1. Start with context — call tabs_context to see what's open, or navigate to a URL.
  2. Take a snapshot — call snapshot to get the accessibility tree with element UIDs.
  3. Interact — use UIDs from the snapshot with click, type_text, hover, etc.
  4. Verify — pass includeSnapshot: true on interaction tools to see the updated state, or take a screenshot.
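
The four steps above can be sketched as a sequence of MCP tools/call requests (JSON-RPC 2.0), which is what an MCP client sends on the user's behalf. Tool names, uid, and includeSnapshot come from this README; the url argument name for navigate, the example URL, and the toolCall helper are assumptions made for illustration.

```javascript
// Sketch of the basic workflow as MCP `tools/call` requests (JSON-RPC 2.0).
// Assumption: `navigate` takes a `url` argument; the README does not name it.
let nextId = 1;

// Hypothetical helper that wraps a tool name and arguments in a request envelope.
function toolCall(name, args = {}) {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

const workflow = [
  toolCall("tabs_context"),                                    // 1. see what is open
  toolCall("navigate", { url: "https://example.com/login" }),  //    or go to a URL
  toolCall("snapshot"),                                        // 2. get element UIDs
  toolCall("click", { uid: "e42", includeSnapshot: true }),    // 3 and 4. act, then verify
];

for (const request of workflow) {
  console.log(JSON.stringify(request));
}
```

In practice the client builds these requests itself; the sketch only shows how the steps map onto individual tool calls.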

Element Targeting

Tools that interact with elements accept multiple targeting strategies:

| Strategy | Example | When to Use |
| --- | --- | --- |
| UID | uid: "e42" | Most precise (from a snapshot) |
| CSS selector | selector: "#login-btn" | When you know the DOM structure |
| Text | text: "Sign In" | Interactive elements are ranked higher |
| Coordinates | x: 100, y: 200 | Last resort (click at exact position) |

Form Filling

Use form_input to fill multiple fields at once:

{
  "fields": {
    "#name": "Jane Doe",
    "#email": "jane@example.com",
    "textarea[name=message]": "Hello!"
  }
}

This uses React-compatible value setting (nativeInputValueSetter) so it works with controlled inputs in React, Next.js, and similar frameworks.
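
For readers unfamiliar with the technique, here is a minimal sketch of the widely used "native value setter" approach. It is an illustration of the general idea, not the code in content.js: setNativeValue and FakeInput are hypothetical names, and FakeInput stands in for an <input> so the sketch also runs outside a browser.

```javascript
// Sketch of the "native value setter" technique (assumed to resemble, not
// reproduce, what the extension does).
function setNativeValue(element, value) {
  // React installs its own `value` property on the element instance to track
  // changes. Calling the prototype's setter bypasses that instance override.
  const proto = Object.getPrototypeOf(element);
  const descriptor = Object.getOwnPropertyDescriptor(proto, "value");
  if (descriptor && descriptor.set) {
    descriptor.set.call(element, value);
  } else {
    element.value = value; // fallback when the prototype has no setter
  }
  // A bubbling `input` event lets the framework's handler observe the new value.
  element.dispatchEvent(new Event("input", { bubbles: true }));
}

// Stand-in for an <input> element (hypothetical, for running under Node).
class FakeInput extends EventTarget {
  get value() { return this._value ?? ""; }
  set value(v) { this._value = String(v); }
}

const input = new FakeInput();
let events = 0;
input.addEventListener("input", () => { events += 1; });
setNativeValue(input, "Jane Doe");
console.log(input.value, events);
```

In a browser, the prototype of a text field is HTMLInputElement.prototype (or HTMLTextAreaElement.prototype for a textarea), so the same lookup finds the built-in setter.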

Smart Text Matching

When targeting by text, interactive elements (buttons, links, inputs) are ranked higher than generic containers. Clicking text: "Submit" will prefer a <button>Submit</button> over a <div>Submit</div>.
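
One way to picture this ranking is a simple scoring pass over candidate elements. The sketch below is hypothetical: the tag list, the score values, and the rankTextMatches name are assumptions, and the real extension may weigh candidates differently.

```javascript
// Hypothetical ranking sketch: interactive elements outrank generic containers.
// The tag set and the scores are illustrative assumptions.
const INTERACTIVE_TAGS = new Set(["button", "a", "input", "select", "textarea"]);

function rankTextMatches(candidates, text) {
  const needle = text.trim().toLowerCase();
  return candidates
    .filter((c) => c.text.trim().toLowerCase().includes(needle))
    .map((c) => {
      let score = 0;
      if (INTERACTIVE_TAGS.has(c.tag)) score += 10;            // interactive beats generic
      if (c.text.trim().toLowerCase() === needle) score += 5;  // exact beats partial
      return { ...c, score };
    })
    .sort((a, b) => b.score - a.score);
}

const matches = rankTextMatches(
  [
    { tag: "div", text: "Submit" },
    { tag: "button", text: "Submit" },
    { tag: "p", text: "Submit your form below" },
  ],
  "Submit"
);
console.log(matches.map((m) => `${m.tag}:${m.score}`).join(" "));
```

With these weights the button is listed first, followed by the div with the same text and then the paragraph that only contains it.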

Post-Action Snapshots

Most interaction tools support includeSnapshot: true, which returns the updated accessibility tree after the action — useful for verifying the result without a separate snapshot call.

Post-Action Waits

navigate and interaction tools support waitForSelector, waitForText, and waitTimeout to wait after a successful action before returning. When combined with includeSnapshot: true, the snapshot is captured after the wait.

Page Traces

Interaction tools support trace: true and traceDuration to return a short page trace after the action. Traces include URL/history changes, console messages, fetch/XHR requests, and DOM mutations captured during the action window.

Architecture

MCP Server (MCPServer/)

A Swift executable using the official modelcontextprotocol/swift-sdk. Communicates with MCP clients via stdio and with the Safari extension via a WebSocket bridge using Network.framework.

  • main.swift — Entry point, parses CLI flags, starts the server
  • SafariMCPServer.swift — Tool definitions and handlers (actor)
  • WebSocketBridge.swift — WebSocket server with request/response correlation (actor)
  • BridgeMessage.swift — Wire protocol types and AnyCodable serialization
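
Request/response correlation, as mentioned for WebSocketBridge.swift, generally means tagging each outgoing request with an ID and routing each reply back to the matching caller. The real bridge is a Swift actor; the sketch below shows the same idea in JavaScript, and the id/action/params field names and the Correlator class are assumptions rather than the actual wire protocol.

```javascript
// Hypothetical sketch of request/response correlation over a message channel.
// Field names (`id`, `action`, `params`) are assumptions for illustration.
class Correlator {
  constructor(send) {
    this.send = send;          // function that transmits a serialized message
    this.pending = new Map();  // id -> callback awaiting the response
    this.nextId = 1;
  }

  request(action, params, callback) {
    const id = String(this.nextId++);
    this.pending.set(id, callback);
    this.send(JSON.stringify({ id, action, params }));
    return id;
  }

  // Called for every incoming message; returns false for unknown ids.
  handleMessage(raw) {
    const message = JSON.parse(raw);
    const callback = this.pending.get(message.id);
    if (!callback) return false;
    this.pending.delete(message.id);
    callback(message.error ?? null, message.result);
    return true;
  }
}

const outbox = [];
const bridge = new Correlator((raw) => outbox.push(raw));
const results = [];
const first = bridge.request("snapshot", {}, (err, res) => results.push(["first", res]));
const second = bridge.request("click", { uid: "e42" }, (err, res) => results.push(["second", res]));

// Responses may arrive out of order; the id routes each one to its caller.
bridge.handleMessage(JSON.stringify({ id: second, result: "clicked" }));
bridge.handleMessage(JSON.stringify({ id: first, result: "tree" }));
console.log(JSON.stringify(results));
```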

Safari Extension (MCPSafari/)

A Manifest V3 Safari Web Extension with:

  • background.js — WebSocket client, request router, tab/navigation/screenshot handlers
  • content.js — DOM interaction, accessibility snapshots, element finding, click/type/scroll simulation
  • trace-interceptor.js — Captures action-window URL, history, console, network, and DOM mutation events
  • dialog-interceptor.js — Patches window.alert/confirm/prompt before page scripts run
  • console-interceptor.js — Captures console messages for read_console
  • network-interceptor.js — Captures XHR/fetch requests for read_network
  • popup.html/js/css — Extension popup showing connection status
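
To illustrate what patching window.alert/confirm/prompt involves, here is a small sketch that replaces the three functions on a window-like object and records each call. It is not the code in dialog-interceptor.js: installDialogInterceptor, the policy shape, and the log format are all assumptions.

```javascript
// Hypothetical sketch of patching dialog functions on a window-like object.
// The policy object and log entries are illustrative assumptions.
function installDialogInterceptor(win, policy = { accept: true, promptText: null }) {
  const log = [];
  win.alert = (message) => {
    log.push({ type: "alert", message: String(message) });
  };
  win.confirm = (message) => {
    log.push({ type: "confirm", message: String(message) });
    return policy.accept; // true = OK, false = Cancel
  };
  win.prompt = (message, defaultValue = "") => {
    log.push({ type: "prompt", message: String(message) });
    if (!policy.accept) return null; // a dismissed prompt returns null
    return policy.promptText ?? defaultValue;
  };
  return log;
}

const fakeWindow = {};
const dialogLog = installDialogInterceptor(fakeWindow, { accept: true, promptText: "Jane" });
const confirmResult = fakeWindow.confirm("Delete item?");
const promptResult = fakeWindow.prompt("Name?", "anon");
console.log(confirmResult, promptResult, JSON.stringify(dialogLog));
```

Running the patch before page scripts matters because a page that captures a reference to the original functions early would otherwise bypass the replacement.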

macOS Host App

A minimal macOS app (AppDelegate.swift, ViewController.swift) that registers the Safari extension and provides native messaging for auth token exchange.

Security

WebSocket Authentication

The server generates a random UUID token at startup, writes it to ~/.config/mcp-safari/tokens/<port> (mode 0600), and requires it as the first WebSocket message before any MCP tool traffic is sent. The extension reads the per-port token map via native messaging from the host app, so multiple server instances can authenticate independently. Connections without a valid token are closed.
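
The "token as first message" rule can be sketched as a small gate in front of the message handler. This is an illustration only: AuthGate, its return values, and the placeholder UUID are assumptions, and the actual server is written in Swift.

```javascript
// Hypothetical sketch of gating a connection on a first-message token.
// Only the rule itself (token first, otherwise close) comes from the README.
class AuthGate {
  constructor(expectedToken) {
    this.expectedToken = expectedToken;
    this.authenticated = false;
    this.closed = false;
  }

  // Returns "authenticated", "forward", or "closed" for each incoming message.
  receive(message) {
    if (this.closed) return "closed";
    if (!this.authenticated) {
      // The very first message must be the token; anything else closes the connection.
      if (message === this.expectedToken) {
        this.authenticated = true;
        return "authenticated";
      }
      this.closed = true;
      return "closed";
    }
    return "forward"; // tool traffic is only forwarded after authentication
  }
}

const token = "123e4567-e89b-12d3-a456-426614174000"; // placeholder UUID
const good = new AuthGate(token);
const bad = new AuthGate(token);
const goodTrace = [good.receive(token), good.receive('{"action":"snapshot"}')];
const badTrace = [bad.receive('{"action":"snapshot"}'), bad.receive(token)];
console.log(goodTrace.join(","), badTrace.join(","));
```

A production implementation would likely compare tokens in constant time; plain equality is used here to keep the sketch short.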

Input Validation

  • URL schemes restricted to http, https, about, and file
  • Navigation actions validated against an allowlist
  • Regex patterns capped at 200 characters and validated before forwarding
  • Wait durations capped at 300 seconds
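
The rules above could be expressed as a handful of small checks. The sketch below is hypothetical: the function names are invented, the navigation allowlist is taken from the navigate tool description (back, forward, reload), and clamping long waits instead of rejecting them is an assumption.

```javascript
// Hypothetical validation helpers mirroring the rules listed above.
// Names, and the choice to clamp rather than reject long waits, are assumptions.
const ALLOWED_SCHEMES = new Set(["http:", "https:", "about:", "file:"]);
const ALLOWED_NAV_ACTIONS = new Set(["back", "forward", "reload"]);
const MAX_REGEX_LENGTH = 200;
const MAX_WAIT_SECONDS = 300;

function isAllowedUrl(raw) {
  try {
    return ALLOWED_SCHEMES.has(new URL(raw).protocol);
  } catch {
    return false; // not parseable as an absolute URL
  }
}

function isAllowedNavAction(action) {
  return ALLOWED_NAV_ACTIONS.has(action);
}

function isValidPattern(pattern) {
  if (typeof pattern !== "string" || pattern.length > MAX_REGEX_LENGTH) return false;
  try {
    new RegExp(pattern); // throws SyntaxError on malformed patterns
    return true;
  } catch {
    return false;
  }
}

function clampWaitSeconds(seconds) {
  return Math.min(Math.max(seconds, 0), MAX_WAIT_SECONDS);
}

console.log(isAllowedUrl("https://example.com"), isAllowedUrl("javascript:alert(1)"));
console.log(isValidPattern("error|warn"), clampWaitSeconds(900));
```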

Permissions

The extension requests these permissions in manifest.json:

| Permission | Purpose |
| --- | --- |
| tabs | List and manage tabs |
| activeTab | Access the active tab |
| scripting | Inject content scripts and execute JS |
| webNavigation | Navigate tabs (back/forward/reload) |
| nativeMessaging | Auth token exchange with host app |
| alarms | Service worker keepalive |
| storage | Persist selected tab across suspensions |

Troubleshooting

Extension shows "Disconnected"

  1. Make sure the MCP server is running (check your MCP client logs)
  2. Verify port 8089 is not in use: lsof -i :8089
  3. Click "Reconnect" in the extension popup
  4. Use --verbose flag on the server for debug logs

"Could not establish connection" errors

The content scripts may not be injected yet. The extension auto-injects on first interaction, but you can also reload the page.

Safari permission prompts

Safari prompts for per-site permissions the first time the extension interacts with a domain. Click "Always Allow on Every Website" in Safari > Settings > Extensions > MCPSafari Extension to avoid repeated prompts.

Port already in use

Use --port to pick a different port:

{
  "mcpServers": {
    "mcp-safari": {
      "command": "mcp-safari",
      "args": ["--port", "9090"]
    }
  }
}

Development

Build & Test

# Build the MCP server
cd MCPServer
swift build

# Build the Safari extension (the Xcode project lives next to MCPServer)
cd ../MCPSafari
xcodebuild -project MCPSafari.xcodeproj -scheme MCPSafari build

# Run the server with verbose logging
../MCPServer/.build/debug/MCPSafari --verbose

CI

The CI workflow runs on every push and PR to main:

  1. Builds the MCP server (swift build)
  2. Tests the MCP handshake (verifies the binary responds to initialize)
  3. Builds the Safari extension (xcodebuild)

Project Structure

MCPSafari/
├── MCPServer/                      # Swift MCP server
│   ├── Package.swift
│   └── Sources/mcp-safari/
│       ├── main.swift
│       ├── SafariMCPServer.swift
│       ├── WebSocketBridge.swift
│       └── BridgeMessage.swift
├── MCPSafari/                      # Xcode project
│   ├── MCPSafari/                  # macOS host app
│   ├── MCPSafari Extension/        # Safari web extension
│   │   ├── Resources/
│   │   │   ├── background.js
│   │   │   ├── content.js
│   │   │   ├── dialog-interceptor.js
│   │   │   ├── console-interceptor.js
│   │   │   ├── network-interceptor.js
│   │   │   ├── manifest.json
│   │   │   └── popup.html/js/css
│   │   └── SafariWebExtensionHandler.swift
│   └── MCPSafari.xcodeproj
├── .github/workflows/
│   ├── ci.yml
│   └── release.yml
└── CHANGELOG.md

License

MIT