The AI Backend

May 11, 2026

AgentField - The AI Backend

Build and scale AI agents like APIs. Deploy, observe, and prove.

AI has outgrown chatbots and prompt orchestrators. Backend agents need backend infrastructure.

Docs · Quick Start · Python SDK · Go SDK · TypeScript SDK · REST API · Examples · Discord

Now includes Harness Orchestration — multi-turn coding agents with Claude Code, Codex, Gemini CLI, and OpenCode

AgentField is an open-source control plane that lets you build AI agents callable by any service in your stack - frontends, backends, other agents, cron jobs - just like any other API. You write agent logic in Python, Go, or TypeScript. AgentField turns it into production infrastructure: routing, coordination, memory, async execution, and cryptographic audit trails. Every function becomes a REST endpoint. Every agent gets a cryptographic identity. Every decision is traceable.

from agentfield import Agent, AIConfig
from pydantic import BaseModel

app = Agent(
    node_id="claims-processor",
    version="2.1.0",  # Canary deploys, A/B testing, blue-green rollouts
    ai_config=AIConfig(model="anthropic/claude-sonnet-4-20250514"),
)

class Decision(BaseModel):
    action: str  # "approve", "deny", "escalate"
    confidence: float
    reasoning: str

@app.reasoner(tags=["insurance", "critical"])
async def evaluate_claim(claim: dict) -> dict:

    # Structured AI judgment - returns typed Pydantic output
    decision = await app.ai(
        system="Insurance claims adjuster. Evaluate and decide.",
        user=f"Claim #{claim['id']}: {claim['description']}",
        schema=Decision,
    )

    if decision.confidence < 0.85:
        # Human approval - suspends execution, notifies via webhook, resumes when approved
        await app.pause(
            approval_request_id=f"claim-{claim['id']}",
            approval_request_url=f"https://internal.acme.com/approvals/claim-{claim['id']}",
            expires_in_hours=48,
        )

    # Route to the next agent - traced through the control plane
    await app.call("notifier.send_decision", input={
        "claim_id": claim["id"],
        "decision": decision.model_dump(),
    })

    return decision.model_dump()

app.run()
# This single line exposes: POST /api/v1/execute/claims-processor.evaluate_claim
# The agent auto-registers with the control plane, gets a cryptographic identity, and every
# execution produces a verifiable, tamper-proof audit trail.

What you just saw: app.ai() calls an LLM and returns structured output. app.pause() suspends for human approval. app.call() routes to other agents through the control plane. app.run() auto-exposes everything as REST. Read the full docs →
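
To call that endpoint from another Python service instead of curl, a minimal sketch using only the standard library might look like this. The base URL and the {"input": ...} body wrapper follow the quick-start curl call in this README; the helper name build_execute_request, the sample claim, and the idea that the keys under "input" map to the reasoner's parameter names are assumptions for illustration, not confirmed SDK behavior.

```python
import json
import urllib.request


def build_execute_request(base_url: str, target: str, payload: dict) -> urllib.request.Request:
    """Build a POST request for /api/v1/execute/<agent>.<function> (hypothetical helper)."""
    body = json.dumps({"input": payload}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/v1/execute/{target}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_execute_request(
    "http://localhost:8080",
    "claims-processor.evaluate_claim",
    # Assumption: the reasoner's `claim` parameter is passed under the "claim" key.
    {"claim": {"id": "1042", "description": "Water damage in kitchen"}},
)

# Sending it requires a running control plane and a registered agent:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Building the request separately from sending it keeps the sketch runnable without a server; the response shape is deliberately not assumed here.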

Describe the system in one line. Get a production-ready multi-agent backend.

Works in Claude Code, Codex, Gemini CLI, OpenCode, Aider, Windsurf, and Cursor.

curl -fsSL https://agentfield.ai/install.sh | bash

In Claude Code, fire it with the shipped slash command:

/agentfield a claims processor with risk scoring and human approval

Or paste any of these directly — the skill auto-matches, no slash command needed:

Build a claims-processor agent with risk scoring, pattern detection,
and human approval for low-confidence decisions.

Build a research agent that spawns parallel investigators and recurses
into deeper sub-questions until the answer has citation-grade provenance.

Build a compliance reviewer for support transcripts — extract claims,
check each against policy, flag violations, emit a signed audit trail.

You get a Docker Compose stack already wired up: the agent, the control plane, and a curl command you can paste into a terminal to try it.

See it in action →

Prefer to write it yourself?

af init my-agent --defaults                            # Scaffold agent
cd my-agent && pip install -r requirements.txt
af server          # Terminal 1 → Dashboard at http://localhost:8080
python main.py     # Terminal 2 → Agent auto-registers
# Call your agent
curl -X POST http://localhost:8080/api/v1/execute/my-agent.demo_echo \
  -H "Content-Type: application/json" \
  -d '{"input": {"message": "Hello!"}}'

Go / TypeScript / Docker

# Go
af init my-agent --defaults --language go && cd my-agent && go run .

# TypeScript
af init my-agent --defaults --language typescript && cd my-agent && npm install && npm run dev

# Docker (control plane only)
docker run -p 8080:8080 agentfield/control-plane:latest

Deployment guide → for Docker Compose, Kubernetes, and production setups.

What You Get

Build - Python, Go, or TypeScript. Every function becomes a REST endpoint.

  • Reasoners & Skills - @app.reasoner() for AI judgment, @app.skill() for deterministic code
  • Structured AI - app.ai(schema=MyModel) → typed Pydantic/Zod output from any LLM
  • Harness - app.harness("Fix the bug") dispatches multi-turn tasks to Claude Code, Codex, Gemini CLI, or OpenCode
  • Cross-Agent Calls - app.call("other-agent.func") routes through the control plane with full tracing
  • Discovery - app.discover(tags=["ml*"]) finds agents and capabilities across the mesh. tools="discover" lets LLMs auto-invoke them.
  • Memory - app.memory.set() / .get() / .search() - KV + vector search, four scopes, no Redis needed
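
To make the wildcard in app.discover(tags=["ml*"]) concrete, here is a small standalone sketch of tag matching. It assumes shell-style glob semantics (via Python's fnmatch) and uses a hypothetical in-memory registry; it is not the SDK's implementation, and the real matching rules may differ.

```python
from fnmatch import fnmatchcase

# Hypothetical registry: agent name -> tags it registered with.
REGISTRY = {
    "fraud-scorer": ["ml-scoring", "finance"],
    "doc-classifier": ["ml-vision"],
    "notifier": ["messaging"],
}


def discover(tag_patterns, registry=REGISTRY):
    """Return sorted agent names that have at least one tag matching any pattern."""
    return sorted(
        name
        for name, tags in registry.items()
        if any(fnmatchcase(tag, pattern) for tag in tags for pattern in tag_patterns)
    )


matches = discover(["ml*"])
```

Here "ml*" selects both agents whose tags start with "ml", while an exact pattern such as "finance" selects only the agent carrying that tag.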

Run - Production infrastructure for non-deterministic AI.

  • Async Execution - Fire-and-forget with webhooks, SSE streaming, retries. No timeout limits - agents run for hours or days.
  • Human-in-the-Loop - app.pause() suspends execution for human approval. Crash-safe, durable, audited.
  • Canary Deployments - Traffic weight routing, A/B testing, blue-green deploys. Roll out agent versions at 5% → 50% → 100%.
  • Observability - Automatic workflow DAGs, Prometheus /metrics, structured logs, execution timeline.
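
The control plane handles retries for you, so agents do not need code like the following. As a rough illustration of what retrying with exponential backoff means, here is a standalone sketch; the base delay, growth factor, cap, and attempt count are made-up values, not AgentField defaults.

```python
import time


def backoff_delays(retries: int, base: float = 0.5, factor: float = 2.0, cap: float = 30.0) -> list:
    """Delay in seconds before each retry: base * factor**n, capped at `cap`."""
    return [min(cap, base * factor ** n) for n in range(retries)]


def run_with_retries(task, attempts: int = 4, sleep=time.sleep):
    """Call task() until it succeeds or `attempts` tries have been used."""
    delays = backoff_delays(attempts - 1)
    for attempt in range(attempts):
        try:
            return task()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries: surface the last error
            sleep(delays[attempt])


# Demo: a task that fails twice, then succeeds. Sleeps are recorded, not performed.
calls = {"count": 0}
recorded_sleeps = []


def flaky_task() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("transient failure")
    return "done"


result = run_with_retries(flaky_task, sleep=recorded_sleeps.append)
```

Passing the sleep function in as a parameter keeps the demo instant and makes the delay schedule easy to inspect.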

Govern - IAM for AI agents. Identity, access control, and audit trails - built in.

  • Cryptographic Identity - Every agent gets a W3C DID (decentralized identifier) - not a shared API key. Agents authenticate to each other the way services authenticate with mTLS, but with cryptographic signatures that travel with the agent.
  • Verifiable Credentials - Tamper-proof receipt for every execution. Offline-verifiable: af vc verify audit.json.
  • Policy Enforcement - Tag-based policy gates with cryptographic verification. "Only agents tagged 'finance' can call this" - enforced by infrastructure, not prompts.

See the full production-ready feature set →

90+ Production Features

AI & LLM

| Feature | How |
| --- | --- |
| Structured output (Pydantic/Zod) | app.ai(schema=MyModel) |
| Multi-turn coding agents | app.harness("task", provider="claude-code") |
| LLM auto-discovers agents and tools | app.ai(tools="discover") |
| Multimodal (text, image, audio) | app.ai("Describe", image_url="...") |
| Streaming responses | app.ai("...", stream=True) |
| 100+ LLMs via LiteLLM | AIConfig(model="anthropic/claude-sonnet-4-20250514") |
| Temperature, max tokens, format | app.ai(..., temperature=0.2) |

Agent Mesh & Discovery

| Feature | How |
| --- | --- |
| Cross-agent calls with tracing | app.call("agent.func", input={...}) |
| Discover agents by tag (wildcards) | app.discover(tags=["ml*"]) |
| Discover by health status | app.discover(health_status="active") |
| Agent routers (namespacing) | AgentRouter(prefix="billing") |
| Auto context propagation | Workflow, session, actor IDs forwarded |
| Parallel agent execution | asyncio.gather(app.call(...), ...) |
| Auto-registration on startup | Service mesh with zero config |
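
The parallel execution row relies on plain asyncio.gather. The sketch below shows the fan-out pattern with a stand-in coroutine in place of app.call(), so it runs without a control plane; the target names and the stub's return shape are hypothetical.

```python
import asyncio


async def call(target: str, payload: dict) -> dict:
    """Stand-in for app.call(): echoes its arguments after a short simulated delay."""
    await asyncio.sleep(0.01)
    return {"target": target, "payload": payload}


async def fan_out(claim_id: str) -> list:
    # gather() runs the three calls concurrently and returns results in argument order.
    return await asyncio.gather(
        call("risk-scorer.score", {"claim_id": claim_id}),
        call("fraud-detector.check", {"claim_id": claim_id}),
        call("policy-lookup.fetch", {"claim_id": claim_id}),
    )


results = asyncio.run(fan_out("1042"))
```

Because results come back in the order the calls were passed, the caller can unpack them positionally even though the calls finish in any order.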

Execution Engine

| Feature | How |
| --- | --- |
| Sync execution (REST) | POST /api/v1/execute/{agent}.{func} |
| Async (fire-and-forget) | POST /api/v1/execute/async/{agent}.{func} |
| Webhooks + HMAC-SHA256 signing | AsyncConfig(webhook_url="...", secret="...") |
| SSE streaming (real-time) | /api/v1/execute/stream/{id} |
| No timeout limits (hours/days) | Control plane allows unlimited duration |
| Execution polling | GET /api/v1/executions/{id} |
| Batch status checks | POST /api/v1/executions/batch-status |
| Progress updates mid-execution | Intermediate payloads during long tasks |
| Auto retries + exponential backoff | Transparent - control plane handles |
| Backpressure + queue depth limits | Fair scheduling, circuit breakers |
| Durable queue (PostgreSQL) | Atomic lease-based processing |
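
On the receiving side, a webhook signed with HMAC-SHA256 is typically checked by recomputing the digest over the raw request body and comparing it in constant time. The sketch below shows that general technique with the standard library. How AgentField transmits the signature (header name, hex versus base64 encoding) and what the payload contains are not stated in this README, so the hex encoding and the sample body here are assumptions.

```python
import hashlib
import hmac


def sign_payload(secret: str, body: bytes) -> str:
    """Hex-encoded HMAC-SHA256 of the raw body (encoding is an assumption)."""
    return hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()


def verify_signature(secret: str, body: bytes, received_signature: str) -> bool:
    """Recompute the signature and compare in constant time."""
    expected = sign_payload(secret, body)
    return hmac.compare_digest(expected, received_signature)


# Hypothetical webhook body; real field names may differ.
body = b'{"execution_id": "exec-123", "status": "completed"}'
signature = sign_payload("shared-secret", body)

is_valid = verify_signature("shared-secret", body, signature)
is_tampered_valid = verify_signature("shared-secret", body + b" ", signature)
```

Verify against the exact bytes received, before any JSON parsing, since re-serializing the payload can change whitespace and invalidate the signature.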

Memory (Distributed State)

| Feature | How |
| --- | --- |
| Key-value storage | app.memory.set(key, value) / .get(key) |
| Vector search (semantic) | app.memory.search(embedding, top_k=5) |
| Four scopes | Global, agent, session, run |
| Reactive memory events | @app.memory.on_change("order_*") |
| Metadata filtering | Filter stored values by metadata |
| Zero dependencies | Built into control plane - no Redis |

Human-in-the-Loop

| Feature | How |
| --- | --- |
| Durable pause/resume | await app.pause(reason="...") |
| Approval workflows with UI | approval_request_url for reviewers |
| Configurable timeouts | expires_in_hours=24 + auto-escalation |
| Crash-safe state | Survives agent restarts |

Canary Deployments & Versioning

| Feature | How |
| --- | --- |
| Traffic weight routing | 5% → 50% → 100% rollouts |
| A/B testing | 50/50 splits with X-Routed-Version |
| Blue-green deployments | Instant weight switch, zero downtime |
| Per-version health tracking | Unhealthy versions auto-removed |
| Agent lifecycle states | pending → starting → ready → degraded → offline |
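
Traffic weight routing can be pictured as a weighted random choice between versions on each request. The sketch below illustrates the idea for a 5% canary; the function name pick_version and the weight map are hypothetical, and the control plane's actual selection algorithm is not described in this README.

```python
import random


def pick_version(weights: dict, rng: random.Random) -> str:
    """Choose one version with probability proportional to its weight."""
    versions = list(weights)
    return rng.choices(versions, weights=[weights[v] for v in versions], k=1)[0]


rng = random.Random(7)  # seeded so the demo is repeatable
weights = {"2.0.0": 95, "2.1.0": 5}  # stable version vs. canary

sample = [pick_version(weights, rng) for _ in range(10_000)]
canary_share = sample.count("2.1.0") / len(sample)
```

Over many requests the canary's share of traffic settles near its weight, and moving to the next rollout stage only requires changing the numbers in the weight map.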

Identity & Governance

| Feature | How |
| --- | --- |
| Cryptographic identity per agent | Auto-generated W3C DID + Ed25519 keys |
| Verifiable Credentials | Tamper-proof receipt per execution |
| Offline VC verification | af vc verify audit.json |
| Tag-based access policies | ALLOW/DENY rules on caller → target tags |
| Cryptographically signed requests | Ed25519 signatures on cross-agent calls |
| VC hierarchy (3 tiers) | Platform → Node → Function control |
| Agent notes (audit log) | app.note("Decision", tags=["critical"]) |
| Non-repudiation | Cryptographic proof of actions |
| Permission request workflows | Auto-created when access denied |
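
As a way to reason about ALLOW/DENY rules on caller and target tags, here is a toy evaluator. It assumes two semantics that this README does not confirm: DENY overrides ALLOW, and a call with no matching rule is denied. The Rule type and is_allowed function are hypothetical, and the sketch leaves out the cryptographic verification that the real policy gates perform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    effect: str      # "ALLOW" or "DENY"
    caller_tag: str  # tag the calling agent must carry
    target_tag: str  # tag the called agent must carry


def is_allowed(rules, caller_tags, target_tags) -> bool:
    """Assumed semantics: DENY beats ALLOW, and no matching rule means deny."""
    matched = [
        rule.effect
        for rule in rules
        if rule.caller_tag in caller_tags and rule.target_tag in target_tags
    ]
    return "ALLOW" in matched and "DENY" not in matched


rules = [
    Rule("ALLOW", "finance", "payments"),
    Rule("DENY", "sandbox", "payments"),
]

finance_ok = is_allowed(rules, {"finance"}, {"payments"})
sandboxed_finance_ok = is_allowed(rules, {"finance", "sandbox"}, {"payments"})
marketing_ok = is_allowed(rules, {"marketing"}, {"payments"})
```

With these rules, an agent tagged "finance" can call a "payments" target, but the same agent also tagged "sandbox" cannot, and an agent with no matching rule is refused.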

Observability & Fleet Management

| Feature | How |
| --- | --- |
| Automatic DAG visualization | Workflow graphs in dashboard |
| Prometheus metrics | /metrics out of the box |
| Structured JSON logging | Automatic from SDK |
| Execution timeline | Chronological decision trace |
| Health checks (K8s-ready) | /health, /ready endpoints |
| Correlation IDs | X-Workflow-ID, X-Execution-ID |
| Workflow DAG API | GET /api/v1/workflows/{id}/dag |
| Agent heartbeat monitoring | Auto health status transitions |

Harness (Multi-turn Coding Agents)

| Feature | How |
| --- | --- |
| 4 providers | Claude Code, Codex, Gemini CLI, OpenCode |
| Schema-constrained output | schema=ResultModel (Pydantic/Zod) |
| Cost capping | max_budget_usd=3.0 |
| Turn limiting | max_turns=100 |
| Tool access control | tools=["Read", "Write", "Bash"] |
| Environment injection | env={"KEY": "value"} |
| System prompt override | system_prompt="..." |
| Multi-layer output recovery | Cosmetic repair → retry → full retry |

Connector API (Fleet Management)

| Feature | How |
| --- | --- |
| Remote agent management | /connector/reasoners |
| Version traffic control | /connector/.../weight |
| Bearer token auth | AGENTFIELD_CONNECTOR_TOKEN |
| Air-gapped deployment | Outbound WebSocket only |

Developer Experience

| Feature | How |
| --- | --- |
| CLI scaffolding | af init my-agent --defaults --language python\|go\|typescript |
| Local dev with dashboard | af server → http://localhost:8080 |
| Hot reload | af dev auto-detects changes |
| Auto-REST from decorators | Every @app.reasoner() → POST /api/v1/execute/... |
| Python, Go, TypeScript SDKs | Native patterns per language |
| MCP server integration | af add --mcp --url <server> |
| Config storage API | POST /api/v1/configs/:key - database-backed |
| Docker + Kubernetes ready | Stateless control plane, horizontal scaling |

Explore all features in detail →

Built With AgentField

Autonomous Engineering Team
One API call spins up PM, architect, coders, QA, reviewers - hundreds of coordinated agents that plan, build, test, and ship.

View project →

Deep Research Engine
Recursive research backend. Spawns parallel agents, evaluates quality, generates deeper agents, and recurses - 10,000+ agents per query.

View project →

Reactive MongoDB Intelligence
Atlas Triggers + agent reasoning. Documents arrive raw and leave enriched - risk scores, pattern detection, evidence chains.

View project →

Autonomous Security Audit
250 coordinated agents trace every vulnerability source-to-sink and adversarially verify each finding. Confirmed exploits, not pattern flags.

View project →

CloudSecurity AF
AI-native cloud infrastructure security scanner that performs shift-left attack path analysis directly from IaC, prioritizing the most dangerous risk chains before deployment.

View project →

See all examples →

Built something with AgentField? Submit your project to be featured on the examples page.

See It In Action

AgentField Dashboard
Real-time workflow DAGs · Execution traces · Agent fleet management · Audit trails

Architecture

AgentField Architecture

The control plane is a stateless Go service. Agents connect from anywhere - your laptop, Docker, Kubernetes. They register capabilities, the control plane routes calls between them, tracks execution as DAGs, and enforces policies. Full architecture docs →

Is AgentField for you?

Yes if you’re building beyond chatbots or small multi-agent workflows: your agents make decisions inside backend systems (approving refunds, processing claims, coordinating research, or running code), and you need routing, async execution, tracing, and audit trails.

Not yet if you’re still in the chatbot or early workflow stage. Tools like LangChain or CrewAI are a great fit for exploring and iterating. When you start pushing toward larger, production-grade agent systems, that’s where we come in.

Learn More

Community

Discord Twitter

GitHub Issues · Documentation · Examples

License

Apache 2.0