MCP Tool Support

April 22, 2026 ยท View on GitHub

apfel natively speaks the https://modelcontextprotocol.io/. Attach tool servers with --mcp and apfel discovers tools, executes them, and returns the final answer.

All inference runs on-device with no network calls for the LLM itself. Optional remote MCP tool servers (--mcp https://...) do make network calls for tool arguments.

Ready-made MCPs for apfel: apfel-mcp.franzai.com ships three token-budget-optimized MCP servers designed specifically for apfel's 4096-token context window: url-fetch, ddg-search, and the flagship compound search-and-fetch tool. Install with brew install Arthur-Ficial/tap/apfel-mcp. Repo is open for contributions of new apfel-optimized MCPs.

Quick start

apfel --mcp ./mcp/calculator/server.py "What is 15 times 27?"
# mcp: ./mcp/calculator/server.py - add, subtract, multiply, divide, sqrt, power, round_number
# tool: multiply({"a": 15, "b": 27}) = 405
# 15 times 27 is 405.

No glue code. No manual round-trip. One command.

All modes

# CLI - one command, answer out
apfel --mcp ./mcp/calculator/server.py "What is 2 to the power of 10?"

# Server - tools auto-available to all clients
apfel --serve --mcp ./mcp/calculator/server.py

# Chat - tools persist across the conversation
apfel --chat --mcp ./mcp/calculator/server.py

# Multiple MCP servers
apfel --mcp ./calc.py --mcp ./weather.py "What is sqrt(2025)?"

# Slow or remote MCP server - increase timeout (default: 5s, max: 300s)
apfel --mcp-timeout 30 --mcp ./remote-server.py "hello"

# No --mcp = exactly as before. Zero overhead.
apfel "Hello"

Persistent MCP registry (apfel-run)

If you find yourself typing the same --mcp list every day, Arthur-Ficial/apfel-run (MIT, ~200 LOC) reads a plain text config at ~/.config/apfel/mcps.conf, builds APFEL_MCP from the enabled lines, and execves apfel. Comment out a line with - to disable, uncomment to re-enable.

# ~/.config/apfel/mcps.conf
/Users/me/mcp/calc.py
/Users/me/mcp/web.py
-/Users/me/mcp/filesystem.py   # disabled

apfel-run "What is 15 times 27?"    # same as apfel + all enabled MCPs
apfel-run --list                    # see what's on / off

This keeps apfel itself flag-only - the registry layer lives in its own 200-LOC wrapper.

Remote MCP servers

Remote MCP uses Streamable HTTP transport (MCP spec 2025-03-26):

# Remote MCP server over HTTPS
apfel --mcp https://mcp.example.com/v1 "what tools do you have?"

# With bearer token auth - prefer the env var (flag is visible in ps aux)
APFEL_MCP_TOKEN=mytoken apfel --mcp https://mcp.example.com/v1 "..."
apfel --mcp https://mcp.example.com/v1 --mcp-token mytoken "..."

# Mixed local + remote
apfel --mcp /path/to/local.py --mcp https://remote.example.com/v1 "..."

Security: Prefer APFEL_MCP_TOKEN over --mcp-token because CLI flags are visible in ps aux. apfel refuses to send a bearer token over plaintext http://; use https://.

Calculator tools

Ships at mcp/calculator/server.py. Zero dependencies (Python stdlib only).

ToolExampleResult
addadd(a=10, b=3)13
subtractsubtract(a=10, b=3)7
multiplymultiply(a=247, b=83)20501
dividedivide(a=1000, b=7)142.857...
sqrtsqrt(a=2025)45
powerpower(a=2, b=10)1024
round_numberround_number(a=3.14159, decimals=2)3.14

Real examples

Five real round trips, unedited.

$ apfel --mcp ./mcp/calculator/server.py "What is 15 times 27?"
tool: multiply({"a": 15, "b": 27}) = 405
15 times 27 is 405.

$ apfel --mcp ./mcp/calculator/server.py "What is the square root of 2025?"
tool: sqrt({"number": 2025}) = 45
The square root of 2025 is 45.

$ apfel --mcp ./mcp/calculator/server.py "Divide 1000 by 7"
tool: divide({"numerator": 1000, "denominator": 7}) = 142.857...
When you divide 1000 by 7, the result is approximately 142.857.

$ apfel --mcp ./mcp/calculator/server.py "What is 2 to the power of 10?"
tool: power({"base": 2, "exponent": 10}) = 1024
2 to the power of 10 is 1024.

$ apfel --mcp ./mcp/calculator/server.py "Add 999 and 1"
tool: add({"a": 999, "b": 1}) = 1000
The result of adding 999 and 1 is 1000.

Note: the model sends different argument key names each time (a/b, number, base/exponent, numerator/denominator). The calculator handles all of these by extracting numbers from any key.

How it works

apfel --mcp ./calc.py "What is 15 times 27?"
  |
  v
1. Spawn MCP server (stdio subprocess)
2. Initialize (JSON-RPC handshake)
3. tools/list -> discover: add, subtract, multiply, divide, sqrt, power, round_number
  |
  v
4. Ask Apple's LLM with tools defined
5. Model returns: multiply({"a": 15, "b": 27})
  |
  v
6. tools/call via MCP -> result: 20501
  |
  v
7. Re-prompt model with full conversation context + tool result
8. Model answers: "15 times 27 is 405."

Server mode

When running apfel --serve --mcp ./calc.py, the server auto-injects MCP tools for clients that don't send their own:

  • Client sends tools -> client's tools used, returned as finish_reason: "tool_calls" (standard OpenAI behavior, client handles execution)
  • Client sends NO tools -> MCP tools injected, server auto-executes tool calls and returns the final text answer with finish_reason: "stop"

MCP auto-execution preserves full conversation context: the server appends the tool call and result as proper assistant/tool messages before re-prompting, so multi-turn conversations work correctly.

Build your own MCP server

A minimal MCP server is a Python script that reads JSON-RPC from stdin and writes to stdout:

#!/usr/bin/env python3
import json, sys

def read():
    line = sys.stdin.readline()
    return json.loads(line.strip()) if line else None

def send(msg):
    sys.stdout.write(json.dumps(msg) + "\n")
    sys.stdout.flush()

def respond(id, result):
    send({"jsonrpc": "2.0", "id": id, "result": result})

while True:
    msg = read()
    if not msg:
        break
    method = msg.get("method", "")
    id = msg.get("id")

    if method == "initialize":
        respond(id, {
            "protocolVersion": "2025-06-18",
            "capabilities": {"tools": {}},
            "serverInfo": {"name": "my-tool", "version": "1.0.0"}
        })
    elif method == "notifications/initialized":
        pass
    elif method == "tools/list":
        respond(id, {"tools": [{
            "name": "my_tool",
            "description": "What it does",
            "inputSchema": {
                "type": "object",
                "properties": {"input": {"type": "string"}},
                "required": ["input"]
            }
        }]})
    elif method == "tools/call":
        args = msg["params"]["arguments"]
        result = "your result here"
        respond(id, {
            "content": [{"type": "text", "text": result}],
            "isError": False
        })
    elif method == "ping":
        respond(id, {})

Then use it:

apfel --mcp ./my-tool.py "question that needs the tool"

Tips for Apple's ~3B model

  • Use multiple simple tools instead of one complex tool. The model picks function names well but improvises argument structures.
  • Keep descriptions short with an example: "Add two numbers. Example: add(a=10, b=3) returns 13".
  • Use simple types. number and string work best. Nested objects and enums are unreliable.
  • Tolerate improvised keys. The model might send {"number1": 5} instead of {"a": 5}.
  • Name tools as verbs. multiply, search, translate - not math_operation.

Limitations

  • 4096 token context window. Tool definitions, question, tool result, and final answer must all fit.
  • One tool call per turn. Multi-tool chains require multiple round trips.
  • No guaranteed schema compliance. The model follows schemas loosely. Your server must handle unexpected argument formats.
  • No streaming for tool calls. Tool call responses are always non-streaming.
  • Default 5s timeout. MCP startup and tool calls timeout after 5 seconds. Use --mcp-timeout <seconds> or APFEL_MCP_TIMEOUT for slow/remote servers.
  • Safety guardrails apply. Apple's content filters can block tool calls containing flagged words.

MCP protocol reference

Transport: stdio (JSON-RPC 2.0, one message per line).

MethodDirectionResponse
initializeclient -> serverRequired
notifications/initializedclient -> serverNone (notification)
tools/listclient -> serverRequired
tools/callclient -> serverRequired
pingclient -> serverEmpty result

See mcp/calculator/server.py for a complete working example.

Ready-made MCPs

  • apfel-mcp.franzai.com - three token-budget-optimized MCP servers for apfel's 4096-token context window:
    • apfel-mcp-url-fetch - fetch a URL, extract the main article with Readability, return clean Markdown. SSRF blocklist, 6000-char hard cap.
    • apfel-mcp-ddg-search - DuckDuckGo web search via direct HTML scrape. No API key. 2000-char hard cap.
    • apfel-mcp-search-and-fetch - the flagship compound tool. Searches AND fetches the top N result pages in ONE tool call. Saves ~500 tokens of schema/state overhead vs chaining separate tools. Declared as both search and web_search so the 3B model's hallucinated tool names still route correctly.
    • Install with brew install Arthur-Ficial/tap/apfel-mcp
    • Repo: github.com/Arthur-Ficial/apfel-mcp - open for contributions of new apfel-optimized MCPs. See apfel-mcp.franzai.com/#contribute for the rules and idea list.