Serve Module

June 11, 2026

Import: from selectools.serve.app import create_app
Stability: beta

from selectools import Agent, tool
from selectools.providers.stubs import LocalProvider
from selectools.serve.app import create_app

@tool(description="Greet a user by name")
def greet(name: str) -> str:
    return f"Hello, {name}!"

agent = Agent(tools=[greet], provider=LocalProvider())
app = create_app(agent, playground=True)
app.serve(port=8000)

See Also:

  • Visual Agent Builder -- drag-drop graph editor served at /builder
  • Templates Module -- YAML config and pre-built agent templates
  • Agent Module -- the Agent class that powers the server

Added in: v0.19.0
Package: src/selectools/serve/
Classes: AgentRouter, AgentServer
Functions: create_app()

Table of Contents

  1. Overview
  2. Quick Start
  3. Agent-as-API (Production REST)
  4. CLI Commands
  5. Endpoints
  6. Streaming (SSE)
  7. Playground UI
  8. Python API
  9. FastAPI Integration
  10. Flask Integration
  11. Configuration Options
  12. Request / Response Models
  13. API Reference
  14. Examples

Overview

The serve module turns any selectools Agent into an HTTP API with one command. No framework boilerplate, no server configuration, no Docker -- just selectools serve agent.yaml and you have a live endpoint with streaming, a health check, tool schema introspection, and an interactive playground UI.

Why Serve?

| | selectools serve | Manual FastAPI setup |
|---|---|---|
| Lines of code | 1 CLI command or 3 lines of Python | 40+ lines minimum |
| Dependencies | Zero (stdlib http.server) | fastapi, uvicorn, pydantic |
| Streaming | SSE built-in | Manual SSE wiring |
| Playground | Built-in chat UI at /playground | Build your own |
| Schema | Auto-generated from tools | Manual OpenAPI spec |

Design Philosophy

  • Zero dependencies. The built-in server uses Python's stdlib http.server. No FastAPI, no Flask, no uvicorn required.
  • Production-ready integrations. When you outgrow the built-in server, AgentRouter drops into FastAPI or Flask with 3 lines of code.
  • Config-driven. Load agents from YAML files or built-in templates. No Python code required for common configurations.

Quick Start

One Command

# Serve from a YAML config
selectools serve agent.yaml

# Serve a built-in template
selectools serve customer_support

# Customize host and port
selectools serve agent.yaml --port 3000 --host 127.0.0.1

# Disable the playground UI
selectools serve agent.yaml --no-playground

Three Lines of Python

from selectools.serve import create_app

# `agent` is any selectools Agent, such as the one built in the example at the top of this page
app = create_app(agent, playground=True)
app.serve(port=8000)

The server prints its endpoints on startup:

Selectools agent serving at http://0.0.0.0:8000
  POST /invoke   -- single prompt
  POST /stream   -- SSE streaming
  GET  /health   -- health check
  GET  /schema   -- tool schemas
  GET  /playground -- chat UI

Press Ctrl+C to stop.

Agent-as-API (Production REST)

Import: from selectools.serve import AgentAPI
Stability: beta
Added in: v0.24.0
Requires: pip install selectools[serve] (Starlette + uvicorn)

AgentAPI auto-generates a production REST API from any Agent (or several). The instance is a Starlette ASGI app -- run it with uvicorn, hypercorn, or any ASGI server.

from selectools.serve import AgentAPI

app = AgentAPI(agents=[my_agent, my_other_agent], auth_key="sk-...")
# Run with: uvicorn api:app --port 8000

Or straight from a YAML config:

selectools serve agent.yaml --api --port 8000

Generated Endpoints

| Method | Path | Description |
|---|---|---|
| POST | /v1/chat | Single-turn completion (JSON) |
| POST | /v1/chat/stream | Streaming completion (SSE) |
| POST | /v1/sessions | Create a session |
| GET | /v1/sessions/{id} | Get session history |
| DELETE | /v1/sessions/{id} | Delete a session |
| GET | /v1/health | Health check (never requires auth) |

Request / Response Schema

curl -X POST http://localhost:8000/v1/chat \
  -H "Authorization: Bearer sk-..." \
  -H "user_id: alice" \
  -H "Content-Type: application/json" \
  -d '{"input": "Hello!", "session_id": "optional", "agent": "optional-name"}'

Response:

{
  "output": "Hi there!",
  "session_id": "01bed06e7cf747d9...",
  "agent": "support",
  "usage": {"prompt_tokens": 12, "completion_tokens": 8, "total_tokens": 20, "cost_usd": 0.0001}
}

Errors use a standardized envelope with proper status codes (401 unauthorized, 404 unknown session/agent, 422 validation):

{"error": {"message": "Session 'abc' not found", "type": "not_found"}}
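On the client side, this envelope can be unwrapped into an exception. The following is a minimal sketch that assumes only the envelope shape shown above; `AgentAPIError` and `raise_for_error` are hypothetical names used for illustration and are not part of selectools.

```python
import json


class AgentAPIError(Exception):
    """Hypothetical client-side exception carrying the envelope fields."""

    def __init__(self, status: int, error_type: str, message: str) -> None:
        super().__init__(f"{status} {error_type}: {message}")
        self.status = status
        self.error_type = error_type
        self.message = message


def raise_for_error(status: int, body: str) -> dict:
    """Return the parsed body on success; raise AgentAPIError on an error envelope."""
    payload = json.loads(body)
    if status >= 400 and isinstance(payload, dict) and "error" in payload:
        err = payload["error"]
        raise AgentAPIError(status, err.get("type", "unknown"), err.get("message", ""))
    return payload
```

Callers can then branch on `exc.error_type` or `exc.status` instead of inspecting raw JSON.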

Auth, Users, and Sessions

  • Auth: pass auth_key="sk-..." and every route except /v1/health requires Authorization: Bearer <key>.
  • Per-user isolation: clients send a user_id header (or x-user-id). Sessions are namespaced per user -- one user can never read, write, or delete another user's sessions. Requests without the header share the "default" namespace.
  • Trust model: user_id is a self-asserted header, not an authenticated identity. All callers share the single auth_key, so isolation protects against accidental cross-tenant reads, not malicious clients. Deploy AgentAPI behind your own backend (which authenticates end users and sets user_id server-side); do not expose it directly to untrusted clients that can choose their own headers.
  • Session persistence: pass any SessionStore backend (JsonFileSessionStore, SQLiteSessionStore, RedisSessionStore, SupabaseSessionStore) via session_store=. Defaults to an in-memory store.
  • Multi-agent: AgentAPI(agents=[a, b]) routes by config.name via the optional "agent" request field; the first agent is the default.
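The per-user isolation rule can be pictured as a store keyed by (user, session) pairs. The sketch below is illustrative only: `NamespacedSessions` is a hypothetical class, not the selectools SessionStore API, and it assumes nothing beyond the behavior described above (a user_id or x-user-id header, with "default" as the fallback namespace).

```python
from typing import Dict, List, Optional, Tuple


class NamespacedSessions:
    """Illustrative in-memory store; NOT the selectools SessionStore API."""

    def __init__(self) -> None:
        self._data: Dict[Tuple[str, str], List[str]] = {}

    @staticmethod
    def _user(headers: Dict[str, str]) -> str:
        # Header names are case-insensitive; fall back to the shared namespace.
        lowered = {k.lower(): v for k, v in headers.items()}
        return lowered.get("user_id") or lowered.get("x-user-id") or "default"

    def append(self, headers: Dict[str, str], session_id: str, message: str) -> None:
        self._data.setdefault((self._user(headers), session_id), []).append(message)

    def history(self, headers: Dict[str, str], session_id: str) -> Optional[List[str]]:
        # A session id only resolves inside the caller's own namespace.
        return self._data.get((self._user(headers), session_id))
```

Because the namespace comes from a caller-supplied header, any client holding the shared key could send another user's id. This is why the trust model above recommends setting user_id server-side.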

Streaming Events

POST /v1/chat/stream emits SSE events:

data: {"type": "chunk", "content": "Hi"}
data: {"type": "chunk", "content": " there!"}
data: {"type": "result", "output": "Hi there!", "session_id": "...", "agent": "support", "usage": {...}}
data: [DONE]

See examples/97_agent_as_api.py for a complete, offline-runnable demo.


CLI Commands

selectools serve

Start an agent HTTP server from a YAML config file or template name.

selectools serve <config> [--port PORT] [--host HOST] [--no-playground]

| Argument | Default | Description |
|---|---|---|
| config | (required) | Path to YAML config file, or a template name (customer_support, data_analyst, etc.). |
| --port | 8000 | Port number. |
| --host | 0.0.0.0 | Bind address. Use 127.0.0.1 for local-only. |
| --no-playground | False | Disable the playground chat UI. |

When config is a template name (e.g. customer_support), the CLI auto-detects an API key from environment variables (OPENAI_API_KEY, ANTHROPIC_API_KEY, or GOOGLE_API_KEY) and creates the provider automatically.

selectools doctor

Diagnose API keys, optional dependencies, and provider connectivity.

selectools doctor

Output:

Selectools Doctor
========================================
Version: 0.19.0
Python: 3.12.0

API Keys:
  OPENAI_API_KEY: OK
  ANTHROPIC_API_KEY: MISSING
  GOOGLE_API_KEY: MISSING
  GEMINI_API_KEY: MISSING

Optional Dependencies:
  fastapi: OK (FastAPI serving)
  flask: not installed (Flask serving)
  redis: OK (Redis cache/sessions)
  chromadb: not installed (Chroma vector store)
  ...

Provider Connectivity:
  OpenAI: OK (connected)
  Anthropic: skipped (no key)
  Gemini: skipped (no key)

Diagnosis complete.

Endpoints

POST /invoke

Send a single prompt and receive a JSON response.

Request:

{
  "prompt": "What is the capital of France?"
}

Response:

{
  "content": "The capital of France is Paris.",
  "tool_calls": [],
  "reasoning": null,
  "iterations": 1,
  "tokens": 42,
  "cost_usd": 0.00012,
  "run_id": "run-abc123"
}

cURL example:

curl -X POST http://localhost:8000/invoke \
  -H "Content-Type: application/json" \
  -d '{"prompt": "What is the capital of France?"}'
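The same call can be made from Python with only the standard library. This is a sketch that assumes a server running at the default address from the cURL example; `build_invoke_request` and `invoke` are hypothetical helper names, not selectools functions.

```python
import json
import urllib.request


def build_invoke_request(prompt: str, base_url: str = "http://localhost:8000") -> urllib.request.Request:
    """Build (but do not send) a POST /invoke request."""
    data = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/invoke",
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def invoke(prompt: str, base_url: str = "http://localhost:8000") -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(build_invoke_request(prompt, base_url)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With a server running, `invoke("What is the capital of France?")["content"]` would hold the agent's reply text.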

POST /stream

Send a prompt and receive an SSE (Server-Sent Events) stream. Each event is a JSON object with a type field.

Request: Same as /invoke.

Response stream:

data: {"type": "chunk", "content": "The capital"}
data: {"type": "chunk", "content": " of France"}
data: {"type": "chunk", "content": " is Paris."}
data: {"type": "result", "content": "The capital of France is Paris.", "iterations": 1}
data: [DONE]
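To make the wire format concrete, here is a sketch of how such a stream could be produced. It assumes standard SSE framing, where each event is a data line terminated by a blank line. `format_sse` and `sse_frames` are hypothetical helpers rather than the library's internals, and the iteration count is a placeholder.

```python
import json
from typing import Iterable, Iterator


def format_sse(event: dict) -> str:
    """Serialize one event as an SSE frame: a data line plus a blank line."""
    return f"data: {json.dumps(event)}\n\n"


def sse_frames(chunks: Iterable[str]) -> Iterator[str]:
    """Yield chunk frames, then a result frame, then the [DONE] sentinel."""
    parts = []
    for text in chunks:
        parts.append(text)
        yield format_sse({"type": "chunk", "content": text})
    # iterations=1 is a placeholder; a real agent would report its own count.
    yield format_sse({"type": "result", "content": "".join(parts), "iterations": 1})
    yield "data: [DONE]\n\n"
```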

GET /health

Health check endpoint. Returns agent status, version, model, provider, and available tools.

Response:

{
  "status": "ok",
  "version": "0.19.0",
  "model": "gpt-4o",
  "provider": "openai",
  "tools": ["read_file", "write_file", "web_search"]
}

GET /schema

Returns JSON schemas for all tools registered with the agent.

Response:

{
  "model": "gpt-4o",
  "tools": [
    {
      "name": "read_file",
      "description": "Read a file from disk",
      "parameters": {
        "type": "object",
        "properties": {
          "path": {"type": "string", "description": "File path to read"}
        },
        "required": ["path"]
      }
    }
  ]
}
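A client can use this payload to check arguments before asking the agent to call a tool. The sketch below assumes only the response shape shown above; `missing_required` is a hypothetical helper, not part of selectools.

```python
from typing import Any, Dict, List


def missing_required(schema: Dict[str, Any], tool_name: str, args: Dict[str, Any]) -> List[str]:
    """Return the required parameter names that are absent from args."""
    for tool in schema.get("tools", []):
        if tool["name"] == tool_name:
            required = tool.get("parameters", {}).get("required", [])
            return [name for name in required if name not in args]
    raise KeyError(f"unknown tool: {tool_name}")
```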

GET /playground

Interactive chat UI served as a single HTML page. See Playground UI below.

GET /

Redirects to /playground when the playground is enabled.


Streaming (SSE)

The /stream endpoint uses Server-Sent Events for real-time token streaming. The agent's astream() method powers this -- each token chunk is forwarded as an SSE event.

Event Types

| Type | Description |
|---|---|
| chunk | A text fragment from the LLM. Concatenate all chunks for the full response. |
| result | Final result with the full content and the iteration count. Sent once at the end. |
| [DONE] | Stream termination signal. |
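These three event types are enough to write a small Python consumer. The following sketch parses already-decoded text lines and assumes only the event shapes in the table; `iter_sse_events` and `collect_text` are hypothetical names.

```python
import json
from typing import Iterable, Iterator


def iter_sse_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield decoded JSON events from SSE lines, stopping at [DONE]."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data: "):
            continue  # skip blank separators and non-data fields
        payload = line[len("data: "):]
        if payload == "[DONE]":
            return
        yield json.loads(payload)


def collect_text(lines: Iterable[str]) -> str:
    """Concatenate the content of all chunk events."""
    return "".join(
        event["content"]
        for event in iter_sse_events(lines)
        if event.get("type") == "chunk"
    )
```

When reading from an HTTP response, decode each byte line to text before passing it in.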

JavaScript Client

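// Runs on Node 18+ as an ES module; in a browser, replace process.stdout.write with a DOM update.
const response = await fetch("http://localhost:8000/stream", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ prompt: "Explain quantum computing" }),
});

const reader = response.body.getReader();
const decoder = new TextDecoder();
let buffer = ""; // holds a partial line until the rest of it arrives

while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  // stream: true keeps multi-byte characters intact across chunk boundaries
  buffer += decoder.decode(value, { stream: true });
  const lines = buffer.split("\n");
  buffer = lines.pop(); // the last element may be an incomplete line
  for (const line of lines) {
    if (line.startsWith("data: ") && line !== "data: [DONE]") {
      const event = JSON.parse(line.slice(6));
      if (event.type === "chunk") {
        process.stdout.write(event.content);
      }
    }
  }
}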

Playground UI

When enabled (default), the server serves an interactive chat interface at /playground. The playground is a single self-contained HTML page with no external dependencies.

Features

  • Real-time streaming responses via SSE
  • Conversation history within the session
  • Tool call visibility (shows which tools the agent invoked)
  • Model and provider info displayed in the header
  • Works in any modern browser

The playground is intended for development and testing. For production UIs, build a custom frontend against the /invoke and /stream endpoints.

Disabling

# CLI
selectools serve agent.yaml --no-playground

# Python
app = create_app(agent, playground=False)

Python API

AgentRouter

The AgentRouter class handles request routing and is the core building block for all integrations. It works standalone or embedded in any WSGI/ASGI framework.

from selectools.serve import AgentRouter

router = AgentRouter(agent, prefix="/api/v1", enable_playground=True)

# Use handler methods directly
result = router.handle_invoke({"prompt": "Hello"})
health = router.handle_health()
schema = router.handle_schema()

create_app()

Create a standalone HTTP server with zero dependencies:

from selectools.serve import create_app

app = create_app(
    agent,
    prefix="",           # URL prefix for all endpoints
    playground=True,      # Enable /playground UI
    host="0.0.0.0",      # Bind address
    port=8000,            # Port number
)

app.serve()  # Blocking -- starts the server

FastAPI Integration

Drop AgentRouter into a FastAPI application for production deployments:

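from fastapi import FastAPI, Request
from fastapi.concurrency import run_in_threadpool
from fastapi.responses import JSONResponse, StreamingResponse
from selectools.serve import AgentRouter

app = FastAPI()
router = AgentRouter(agent)  # `agent` is an existing selectools Agent

@app.post("/invoke")
async def invoke(request: Request):
    body = await request.json()
    # handle_invoke is a blocking call, so run it off the event loop
    result = await run_in_threadpool(router.handle_invoke, body)
    return JSONResponse(result)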

@app.post("/stream")
async def stream(request: Request):
    body = await request.json()
    return StreamingResponse(
        router.handle_stream(body),
        media_type="text/event-stream",
    )

@app.get("/health")
async def health():
    return JSONResponse(router.handle_health())

Run with uvicorn for production-grade performance:

uvicorn app:app --host 0.0.0.0 --port 8000 --workers 4

Flask Integration

from flask import Flask, request, jsonify, Response
from selectools.serve import AgentRouter

app = Flask(__name__)
router = AgentRouter(agent)

@app.route("/invoke", methods=["POST"])
def invoke():
    return jsonify(router.handle_invoke(request.json))

@app.route("/stream", methods=["POST"])
def stream():
    return Response(
        router.handle_stream(request.json),
        content_type="text/event-stream",
    )

@app.route("/health")
def health():
    return jsonify(router.handle_health())

Configuration Options

YAML Config File

A YAML config file is the recommended way to configure a served agent. See the Templates Module for the full YAML reference.

provider: openai
model: gpt-4o
system_prompt: "You are a helpful coding assistant."
tools:
  - selectools.toolbox.file_tools.read_file
  - selectools.toolbox.file_tools.write_file
  - ./my_custom_tool.py
budget:
  max_cost_usd: 1.00
retry:
  max_retries: 3

Environment Variables

The CLI auto-detects providers from environment variables:

| Variable | Provider |
|---|---|
| OPENAI_API_KEY | OpenAI (checked first) |
| ANTHROPIC_API_KEY | Anthropic |
| GOOGLE_API_KEY / GEMINI_API_KEY | Gemini |
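The detection rule can be sketched as a first-match scan over the environment. This is an illustration, not the CLI's actual code: `detect_provider` is a hypothetical function, the lowercase provider names other than "openai" are assumptions, and only "OpenAI is checked first" comes from the table. The order of the remaining entries is assumed to follow the table's row order.

```python
import os
from typing import Mapping, Optional

# OpenAI first, per the table above; the order of the rest is an assumption.
_PROVIDER_ENV = (
    ("openai", ("OPENAI_API_KEY",)),
    ("anthropic", ("ANTHROPIC_API_KEY",)),
    ("gemini", ("GOOGLE_API_KEY", "GEMINI_API_KEY")),
)


def detect_provider(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the first provider whose API key variable is set and non-empty."""
    env = os.environ if env is None else env
    for provider, variables in _PROVIDER_ENV:
        if any(env.get(var) for var in variables):
            return provider
    return None
```

Passing an explicit mapping instead of relying on os.environ keeps the function easy to test.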

Request / Response Models

File: src/selectools/serve/models.py

InvokeRequest

| Field | Type | Description |
|---|---|---|
| prompt | str | The user prompt. |
| config_overrides | Optional[Dict[str, Any]] | Override agent config for this request. |

InvokeResponse

| Field | Type | Description |
|---|---|---|
| content | str | Agent response text. |
| tool_calls | List[Dict] | Tools invoked during execution. |
| reasoning | Optional[str] | Reasoning trace (when using CoT/ReAct strategies). |
| iterations | int | Number of agent loop iterations. |
| tokens | int | Total tokens consumed. |
| cost_usd | float | Estimated cost in USD. |
| run_id | str | Unique run identifier for trace lookup. |
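A client can mirror these fields in a small typed container for parsing /invoke responses. The sketch below is a client-side convenience, not the definition in src/selectools/serve/models.py; `InvokeResult` and `from_json` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class InvokeResult:
    """Client-side mirror of the InvokeResponse fields listed above."""

    content: str
    iterations: int
    tokens: int
    cost_usd: float
    run_id: str
    tool_calls: List[Dict[str, Any]] = field(default_factory=list)
    reasoning: Optional[str] = None

    @classmethod
    def from_json(cls, payload: Dict[str, Any]) -> "InvokeResult":
        # Ignore unknown keys so extra server fields do not break the client.
        known = {k: payload[k] for k in cls.__dataclass_fields__ if k in payload}
        return cls(**known)
```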

HealthResponse

| Field | Type | Description |
|---|---|---|
| status | str | Always "ok" when healthy. |
| version | str | Selectools version. |
| model | str | Active model name. |
| provider | str | Active provider name. |
| tools | List[str] | Names of registered tools. |

API Reference

AgentRouter.__init__()

| Parameter | Type | Default | Description |
|---|---|---|---|
| agent | Agent | (required) | The agent to serve. |
| prefix | str | "" | URL prefix for all endpoints (e.g. "/api/v1"). |
| enable_playground | bool | True | Enable the /playground chat UI. |

AgentRouter Methods

| Method | Description |
|---|---|
| handle_invoke(body) | Process a POST /invoke request. Returns response dict. |
| handle_stream(body) | Process a POST /stream request. Yields SSE-formatted strings. |
| handle_health() | Process a GET /health request. Returns health dict. |
| handle_schema() | Process a GET /schema request. Returns tool schemas dict. |

create_app()

| Parameter | Type | Default | Description |
|---|---|---|---|
| agent | Agent | (required) | The agent to serve. |
| prefix | str | "" | URL prefix for all endpoints. |
| playground | bool | True | Enable the /playground chat UI. |
| host | str | "0.0.0.0" | Bind address. |
| port | int | 8000 | Port number. |

Returns an AgentServer instance. Call .serve() to start (blocking).

AgentServer Methods

| Method | Description |
|---|---|
| serve(port=None) | Start the HTTP server. Blocking. Uses stdlib http.server. |

Examples

| # | File | Description |
|---|---|---|
| 64 | 64_selectools_serve.py | Serve an agent with the built-in HTTP server |
| 62 | 62_yaml_config.py | Load an agent from YAML and serve it |
| 63 | 63_agent_templates.py | Use built-in templates with the serve module |

Further Reading


Next Steps: Learn about YAML configuration and pre-built templates in the Templates Module.