agent-express

May 7, 2026

Minimalist middleware framework for building AI agents in TypeScript.

Documentation · Getting Started · API Reference

Why agent-express

Three concepts: Agent, Session, and Middleware. That's the entire framework.

Every backend developer knows use(). Agent Express applies the Express.js middleware pattern to AI agents. One (ctx, next) interface replaces the 15-20 concepts you'll find in alternatives. If you've built an Express app, you already know the mental model.

Quick Start

npm install agent-express

import { Agent, tools } from "agent-express"
import { z } from "zod"

const agent = new Agent({
  model: "anthropic/claude-sonnet-4-6",
  instructions: "You are a helpful assistant.",
})

agent.use(tools.function({
  name: "greet",
  description: "Greet someone by name",
  schema: z.object({ name: z.string() }),
  execute: async ({ name }) => `Hello, ${name}!`,
}))

const { text } = await agent.run({ input: "Greet Alice" })
console.log(text)

Features

  • Middleware architecture -- 5 onion hooks (agent, session, turn, model, tool), one (ctx, next) pattern
  • Built-in guards -- budget caps, input/output validation, timeouts, iteration limits, HITL approval
  • Observability -- structured logging, OpenTelemetry metrics and traces, token tracking, tool recording
  • 12+ model providers -- any AI SDK provider via "provider/model" string (Anthropic, OpenAI, Google, Mistral, Groq, and more)
  • Model routing -- complexity-based model selection across providers
  • Memory management -- context window compaction with 5 strategies
  • Testing toolkit -- TestModel, FunctionModel, capture, record/replay, snapshots
  • MCP integration -- connect to MCP servers as tool sources
  • HTTP adapter -- SSE streaming out of the box
  • CLI -- agent-express dev with hot reload, agent-express test with CI output
  • Structured output -- Zod schema validation on model responses
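The "provider/model" string in the feature list can be pictured as a two-part identifier. The sketch below is only an illustration of that idea: `parseModelId` and `ModelId` are hypothetical names, not agent-express exports, and splitting on the first slash is an assumption about how such strings could be resolved.

```typescript
// Hypothetical helper: `parseModelId` is not an agent-express export.
interface ModelId {
  provider: string
  model: string
}

function parseModelId(id: string): ModelId {
  const slash = id.indexOf("/")
  // Require a non-empty provider and a non-empty model name.
  if (slash <= 0 || slash === id.length - 1) {
    throw new Error(`Expected "provider/model", got "${id}"`)
  }
  // Split on the first slash only, so model names may themselves contain slashes.
  return { provider: id.slice(0, slash), model: id.slice(slash + 1) }
}

const parsed = parseModelId("anthropic/claude-sonnet-4-6")
// parsed.provider === "anthropic", parsed.model === "claude-sonnet-4-6"
```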

Middleware Namespaces

Compose capabilities by stacking middleware:

import { Agent, guard, observe, model, memory, dev } from "agent-express"

const agent = new Agent({ model: "anthropic/claude-sonnet-4-6" })

agent
  .use(guard.budget({ limit: 1.00 }))
  .use(guard.approve({ approve: myApprovalFn }))
  .use(observe.usage())
  .use(model.retry())
  .use(memory.compaction({ maxTokens: 8192 }))
  .use(dev.console())

| Namespace | Middleware | Description |
| --- | --- | --- |
| guard | budget, input, output, maxIterations, timeout, approve, piiRedact, rateLimit | Safety, cost, and compliance |
| observe | usage, tools, duration, log, metrics, traces | Monitoring, metrics, and tracing |
| search | file, web | Document search (RAG) and web search |
| model | retry, router | LLM call management |
| memory | compaction, store | Context window and session persistence |
| tools | function, mcp | Tool registration |
| dev | console | Development utilities |
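To make the idea behind memory.compaction({ maxTokens }) more concrete, here is a minimal sliding-window sketch. It is not the library's implementation and may not match any of its five strategies; `Message`, `estimateTokens`, and `compactSlidingWindow` are hypothetical names, and the four-characters-per-token estimate is a rough assumption.

```typescript
// Hypothetical types and helpers; the 4-characters-per-token estimate is a rough assumption.
interface Message {
  role: "system" | "user" | "assistant"
  content: string
}

const estimateTokens = (m: Message): number => Math.ceil(m.content.length / 4)

// Keep all system messages, then as many of the most recent messages as fit in the budget.
function compactSlidingWindow(messages: Message[], maxTokens: number): Message[] {
  const system = messages.filter((m) => m.role === "system")
  const rest = messages.filter((m) => m.role !== "system")
  let budget = maxTokens - system.reduce((sum, m) => sum + estimateTokens(m), 0)
  const kept: Message[] = []
  for (let i = rest.length - 1; i >= 0; i--) {
    const cost = estimateTokens(rest[i])
    if (cost > budget) break // stop at the first message that no longer fits
    budget -= cost
    kept.unshift(rest[i])
  }
  return [...system, ...kept]
}

const history: Message[] = [
  { role: "system", content: "You are a helpful assistant." }, // 28 chars, about 7 tokens
  { role: "user", content: "a".repeat(40) }, // about 10 tokens
  { role: "assistant", content: "b".repeat(40) }, // about 10 tokens
  { role: "user", content: "c".repeat(40) }, // about 10 tokens
]
const compacted = compactSlidingWindow(history, 30)
// 23 tokens remain after the system message, so only the two newest messages are kept.
```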

Presets (separate packages):

| Package | Preset | Description |
| --- | --- | --- |
| @agent-express/preset-support | supportBot() | Production support bot with RAG, PII, escalation, tone |

Writing Custom Middleware

A plain function passed to .use() becomes a turn hook:

agent.use(async (ctx, next) => {
  console.log(`Turn ${ctx.turnIndex}: ${ctx.input[0]?.content}`)
  await next()
  console.log(`Response: ${ctx.output}`)
})

To register multiple hooks at once, define a Middleware object and pass it to .use():

import type { Middleware } from "agent-express"

const analytics: Middleware = {
  name: "analytics",
  state: {
    "analytics:turns": { default: 0 },
    "analytics:cost": { default: 0, reducer: (prev, delta) => prev + delta },
  },
  turn: async (ctx, next) => {
    ctx.state["analytics:turns"] = ctx.turnIndex + 1
    await next()
  },
  model: async (ctx, next) => {
    const response = await next()
    ctx.state["analytics:cost"] = response.usage.inputTokens * 0.000003
    return response
  },
}

agent.use(analytics)
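The state declarations in the analytics example pair a default with an optional reducer. One plausible reading, sketched below, is that a write to a key with a reducer is folded into the previous value, while a key without one is simply overwritten. This is an assumption about the semantics, not a description of the library: `StateField` and `createState` are hypothetical names.

```typescript
// Hypothetical sketch: `StateField` and `createState` are illustrative names, and
// "writes are folded through the reducer" is an assumed semantics.
interface StateField {
  default: number
  reducer?: (prev: number, delta: number) => number
}

function createState(fields: Record<string, StateField>): Record<string, number> {
  const values: Record<string, number> = {}
  for (const [key, field] of Object.entries(fields)) values[key] = field.default
  return new Proxy(values, {
    set(target, key, written: number) {
      const name = String(key)
      const field = fields[name]
      if (!field) throw new Error(`Undeclared state key: ${name}`)
      // With a reducer, a write is combined with the previous value; otherwise it overwrites.
      target[name] = field.reducer ? field.reducer(target[name], written) : written
      return true
    },
  })
}

const state = createState({
  "analytics:turns": { default: 0 },
  "analytics:cost": { default: 0, reducer: (prev, delta) => prev + delta },
})

state["analytics:turns"] = 1
state["analytics:turns"] = 2 // no reducer: last write wins, value is 2
state["analytics:cost"] = 0.25
state["analytics:cost"] = 0.5 // reducer: writes accumulate, value is 0.75
```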

The 5 hooks form an onion — code before next() runs on the way in, code after runs on the way out:

agent → session → turn → model → [LLM call]
                       → tool  → [tool execution]
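The onion ordering can be reproduced with a small, generic composition function in the style popularized by Koa. This is a sketch of the general technique, not agent-express internals: `Ctx`, `Hook`, `compose`, and `trace` are hypothetical names, and next() is simplified to return nothing, whereas the model hook above returns a response.

```typescript
// Hypothetical names: Ctx, Hook, compose, and trace are illustrative, not agent-express exports.
type Ctx = { log: string[] }
type Next = () => Promise<void>
type Hook = (ctx: Ctx, next: Next) => Promise<void>

// Compose hooks so each one wraps the rest; `core` runs at the center of the onion.
function compose(hooks: Hook[], core: (ctx: Ctx) => Promise<void>) {
  return (ctx: Ctx): Promise<void> => {
    const dispatch = (i: number): Promise<void> => {
      const hook = hooks[i]
      if (!hook) return core(ctx) // past the innermost hook
      let called = false
      return hook(ctx, () => {
        if (called) return Promise.reject(new Error("next() called more than once"))
        called = true
        return dispatch(i + 1)
      })
    }
    return dispatch(0)
  }
}

// A hook that records when it is entered and when it is left.
const trace = (name: string): Hook => async (ctx, next) => {
  ctx.log.push(`${name}:in`)
  await next()
  ctx.log.push(`${name}:out`)
}

const run = compose(
  [trace("agent"), trace("session"), trace("turn"), trace("model")],
  async (ctx) => {
    ctx.log.push("LLM call")
  },
)

const demoCtx: Ctx = { log: [] }
const order = run(demoCtx).then(() => demoCtx.log.join(" > "))
// order resolves to:
// "agent:in > session:in > turn:in > model:in > LLM call > model:out > turn:out > session:out > agent:out"
```

Each hook sees the request on the way in and the result on the way out, which is why a guard placed early in the stack can both block a call and inspect its outcome.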

See built-in middleware for real-world examples: guard.budget, observe.usage, model.retry, memory.compaction

Sessions and Streaming

Multi-turn conversations with session state:

await agent.init()
const session = agent.session()

const r1 = await session.run({ input: "My name is Alice" })
const r2 = await session.run({ input: "What's my name?" })
// r2.text → "Your name is Alice"

await agent.dispose()

Streaming typed events as they happen:

for await (const event of agent.run({ input: "Hello" })) {
  if (event.type === "model:chunk") process.stdout.write(event.payload.text)
  if (event.type === "tool:call") console.log(`Calling ${event.payload.name}...`)
}

The same Event objects flow through the iterator and session.events, so the streaming view and the persistent log share a single source of truth.
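Typed events like these map naturally onto Server-Sent Events, the transport the HTTP adapter is described as using. The sketch below shows the standard SSE framing only. `AgentEvent` and `toSseFrame` are hypothetical names, and the wire format agent-express actually emits is not documented here.

```typescript
// Hypothetical event shape and helper; the adapter's real wire format is an unknown here.
interface AgentEvent {
  type: string
  payload: unknown
}

// Serialize one event as an SSE frame: an `event:` line, a `data:` line, then a blank line.
function toSseFrame(event: AgentEvent): string {
  // JSON.stringify escapes newlines, so a JSON payload always fits on a single data line.
  return `event: ${event.type}\ndata: ${JSON.stringify(event.payload)}\n\n`
}

const frame = toSseFrame({ type: "model:chunk", payload: { text: "Hello" } })
// frame === 'event: model:chunk\ndata: {"text":"Hello"}\n\n'
```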

Persistence

Sessions persist to SQLite, Redis, or Postgres via adapter packages. If the process crashes mid-turn, the next session run after a restart resumes from the event log:

import { memory } from "agent-express"
import { sqliteStore } from "@agent-express/session-sqlite"

agent.use(memory.store(sqliteStore({ path: ".agent-express/sessions.db" })))

See docs/design/event-log.md for the full substrate design.
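Resuming from an event log amounts to folding the persisted events back into session state. The sketch below illustrates that idea under stated assumptions: the "turn:start" and "turn:end" event types, `LogEvent`, and `recover` are hypothetical, and the real event schema is defined in the design doc rather than here.

```typescript
// Hypothetical event shapes; only the idea of a typed, append-only log comes from the text.
type LogEvent =
  | { type: "turn:start"; turn: number; input: string }
  | { type: "turn:end"; turn: number; output: string }

interface Recovered {
  completed: { input: string; output: string }[]
  pendingInput: string | null // input of a turn that started but never finished
}

// Fold the log from the beginning to rebuild session state after a restart.
function recover(log: LogEvent[]): Recovered {
  const completed: Recovered["completed"] = []
  let pendingInput: string | null = null
  for (const event of log) {
    if (event.type === "turn:start") {
      pendingInput = event.input
    } else if (pendingInput !== null) {
      completed.push({ input: pendingInput, output: event.output })
      pendingInput = null
    }
  }
  return { completed, pendingInput }
}

const persisted: LogEvent[] = [
  { type: "turn:start", turn: 0, input: "My name is Alice" },
  { type: "turn:end", turn: 0, output: "Nice to meet you, Alice." },
  { type: "turn:start", turn: 1, input: "What's my name?" }, // the crash happened after this event
]
const recovered = recover(persisted)
// recovered.completed holds one finished turn; recovered.pendingInput is "What's my name?"
```

A store that appends events before acting on them can always tell, after a crash, which turn was in flight and what context preceded it.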

Testing

Mock LLM calls in tests with agent-express/test:

import { TestModel, testAgent } from "agent-express/test"

const model = new TestModel([
  { text: "Hello! How can I help?" },
])

const result = await testAgent(agent, {
  model,
  input: "Hi",
})
expect(result.text).toContain("Hello")

Record and replay real API calls:

import { RecordModel, ReplayModel } from "agent-express/test"

// Record once (hits real API)
const record = new RecordModel("anthropic/claude-sonnet-4-6")
await agent.run({ input: "test", model: record })
record.save("tests/cassettes/greeting.json")

// Replay forever (no API calls)
const replay = ReplayModel.load("tests/cassettes/greeting.json")
await agent.run({ input: "test", model: replay })
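The record/replay pattern itself is small enough to sketch without the library. The code below illustrates the general cassette technique and is not the RecordModel/ReplayModel implementation: `ModelFn`, `Cassette`, `record`, `replay`, and `fakeApi` are hypothetical, and keying recordings by the raw prompt is a simplifying assumption.

```typescript
// Hypothetical sketch: these names are illustrative, not the RecordModel/ReplayModel internals.
type ModelFn = (prompt: string) => Promise<string>
type Cassette = Record<string, string> // prompt mapped to its recorded response

// Wrap a real model function and store each response in the cassette.
function record(real: ModelFn, cassette: Cassette): ModelFn {
  return async (prompt) => {
    const response = await real(prompt)
    cassette[prompt] = response
    return response
  }
}

// Serve responses from the cassette only, and fail loudly on unrecorded prompts.
function replay(cassette: Cassette): ModelFn {
  return async (prompt) => {
    if (!(prompt in cassette)) throw new Error(`No recording for prompt: ${prompt}`)
    return cassette[prompt]
  }
}

// Stand-in for a real API call, counting how often it is hit.
let apiCalls = 0
const fakeApi: ModelFn = async (prompt) => {
  apiCalls++
  return `echo: ${prompt}`
}

const cassette: Cassette = {}
const replayed = record(fakeApi, cassette)("test")
  .then(() => JSON.parse(JSON.stringify(cassette)) as Cassette) // round-trip as if saved to disk
  .then((loaded) => replay(loaded)("test"))
// replayed resolves to "echo: test", and apiCalls stays at 1
```

Failing on an unrecorded prompt, rather than falling back to the network, keeps replayed tests deterministic and free of accidental API spend.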

Comparison

| Feature | agent-express | Mastra | Vercel AI SDK | LangChain.js |
| --- | --- | --- | --- | --- |
| Core concepts | 3 | 15-20 | 5-8 | 30+ |
| Extension model | Middleware (ctx, next) | Processors, Tools, Workflows | Hooks, Providers | Chains, Agents, Tools, Memory |
| Built-in testing | Yes | No | No | No |
| Cost control | guard.budget() | Manual | Manual | Manual |
| TypeScript | Strict, ESM only | TypeScript | TypeScript | TypeScript |

Packages

Core:

agent-express       -- Agent, Session, middleware namespaces, errors
agent-express/http  -- createHandler() SSE adapter
agent-express/test  -- TestModel, FunctionModel, testAgent()

Presets:

| Package | Description |
| --- | --- |
| @agent-express/preset-support | Production support bot (RAG, PII, tone, escalation, rate limiting) |

Adapter packages:

| Package | Description |
| --- | --- |
| @agent-express/embed-openai | OpenAI text-embedding-3-small |
| @agent-express/embed-cohere | Cohere embed-v3 |
| @agent-express/search-brave | Brave Search API |
| @agent-express/search-tavily | Tavily Search API |
| @agent-express/search-exa | Exa semantic search |
| @agent-express/search-llamaindex | LlamaIndex.TS file ingestion + cosine similarity |
| @agent-express/search-qdrant | Qdrant vector DB retriever |
| @agent-express/search-pinecone | Pinecone vector DB retriever |
| @agent-express/search-pgvector | PostgreSQL pgvector retriever |
| @agent-express/session-sqlite | SQLite session store (better-sqlite3) |
| @agent-express/session-redis | Redis session store (ioredis) |
| @agent-express/session-postgres | PostgreSQL session store (pg) |

CLI

npx create-agent-express                             # interactive wizard
npx create-agent-express --template support-bot      # template scaffold
npx agent-express dev [entry]                       # terminal chat + hot reload
npx agent-express test                              # agent test runner
npx agent-express test --ci                         # JUnit XML output for CI

Architecture & Design Docs

For new contributors, the recommended reading order:

  1. Concept — what we're building, the agent session primitive, why middleware beats graphs, 7-framework comparison
  2. Middleware Interface — the single Middleware interface and (ctx, next) onion pattern
  3. Agent Loop — 5-level lifecycle (agent / session / turn / model / tool) and the model→tool→model loop
  4. Event Log — v0.4 substrate: typed events, durability, SessionStore contract

Reference docs for individual subsystems:

  • Providers -- "provider/model" resolution and security guards
  • Adapters -- three adapter families (session / embed / search) and how to write a custom adapter
  • Observability -- six observability middlewares and OpenTelemetry integration
  • Testing -- testAgent, FunctionModel, recorder cassettes, real-request guard

See docs/design/ for the full index and docs/research/ for reverse-engineering notes on adjacent agent frameworks.