agent-express

May 7, 2026

Minimalist middleware framework for building AI agents in TypeScript.


Why agent-express

Three concepts: Agent, Session, and Middleware. That's the entire framework.

Every backend developer knows use(). Agent Express applies the Express.js middleware pattern to AI agents. One (ctx, next) interface replaces the 15-20 concepts you'll find in alternatives. If you've built an Express app, you already know the mental model.

Quick Start

npm install agent-express

import { Agent, tools } from "agent-express"
import { z } from "zod"

const agent = new Agent({
  model: "anthropic/claude-sonnet-4-6",
  instructions: "You are a helpful assistant.",
})

agent.use(tools.function({
  name: "greet",
  description: "Greet someone by name",
  schema: z.object({ name: z.string() }),
  execute: async ({ name }) => `Hello, ${name}!`,
}))

const { text } = await agent.run({ input: "Greet Alice" })
console.log(text)

Features

Middleware architecture -- 5 onion hooks (agent, session, turn, model, tool), one (ctx, next) pattern
Built-in guards -- budget caps, input/output validation, timeouts, iteration limits, human-in-the-loop (HITL) approval
Observability -- structured logging, OpenTelemetry metrics and traces, token tracking, tool recording
12+ model providers -- any AI SDK provider via "provider/model" string (Anthropic, OpenAI, Google, Mistral, Groq, and more)
Model routing -- complexity-based model selection across providers
Memory management -- context window compaction with 5 strategies
Testing toolkit -- TestModel, FunctionModel, capture, record/replay, snapshots
MCP integration -- connect to MCP servers as tool sources
HTTP adapter -- SSE streaming out of the box
CLI -- agent-express dev with hot reload, agent-express test with CI output
Structured output -- Zod schema validation on model responses
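
The "provider/model" string mentioned in the list above can be pictured as a split on the first slash. The sketch below is an illustration only: `parseModelId` and its error message are hypothetical names, not part of agent-express, and the library's real resolution logic may differ.

```typescript
// Hypothetical helper (not an agent-express API): splits "provider/model" on the
// FIRST slash only, so model names that themselves contain slashes stay intact.
interface ModelId {
  provider: string
  model: string
}

function parseModelId(id: string): ModelId {
  const slash = id.indexOf("/")
  // Reject a missing slash, an empty provider, or an empty model name.
  if (slash <= 0 || slash === id.length - 1) {
    throw new Error(`Expected "provider/model", got "${id}"`)
  }
  return { provider: id.slice(0, slash), model: id.slice(slash + 1) }
}

const parsed = parseModelId("anthropic/claude-sonnet-4-6")
console.log(parsed.provider, parsed.model)
```

Splitting on the first slash rather than the last is a design assumption that keeps nested model paths in one piece.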

Middleware Namespaces

Compose capabilities by stacking middleware:

import { Agent, guard, observe, model, memory, dev } from "agent-express"

const agent = new Agent({ model: "anthropic/claude-sonnet-4-6" })

agent
  .use(guard.budget({ limit: 1.00 }))
  .use(guard.approve({ approve: myApprovalFn }))
  .use(observe.usage())
  .use(model.retry())
  .use(memory.compaction({ maxTokens: 8192 }))
  .use(dev.console())

Namespace	Middleware	Description
`guard`	`budget`, `input`, `output`, `maxIterations`, `timeout`, `approve`, `piiRedact`, `rateLimit`	Safety, cost, and compliance
`observe`	`usage`, `tools`, `duration`, `log`, `metrics`, `traces`	Monitoring, metrics, and tracing
`search`	`file`, `web`	Document search (RAG) and web search
`model`	`retry`, `router`	LLM call management
`memory`	`compaction`, `store`	Context window and session persistence
`tools`	`function`, `mcp`	Tool registration
`dev`	`console`	Development utilities
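
To make the guard idea concrete, here is a minimal sketch of how a budget cap could be written in the (ctx, next) style. It is not the implementation of `guard.budget`: the context shape, the `costUsd` field, and the `budgetGuard` name are all assumptions made for this example.

```typescript
// Illustrative types only; agent-express's real context is richer and shaped differently.
interface ModelResponse { costUsd: number }
type Next = () => Promise<ModelResponse>
type ModelHook = (ctx: { spentUsd: number }, next: Next) => Promise<ModelResponse>

// Hypothetical budget guard: refuse to call the model once the cap is reached,
// and record what each completed call cost.
function budgetGuard(limitUsd: number): ModelHook {
  return async (ctx, next) => {
    if (ctx.spentUsd >= limitUsd) {
      throw new Error(`Budget of ${limitUsd.toFixed(2)} USD exhausted`)
    }
    const response = await next()
    ctx.spentUsd += response.costUsd
    return response
  }
}

async function budgetDemo(): Promise<string[]> {
  const guard = budgetGuard(1.0)
  const ctx = { spentUsd: 0 }
  const fakeModel: Next = async () => ({ costUsd: 0.6 }) // stand-in for an LLM call
  const log: string[] = []
  for (let i = 0; i < 3; i++) {
    try {
      await guard(ctx, fakeModel)
      log.push("ok")
    } catch {
      log.push("blocked")
    }
  }
  return log
}
```

In this sketch the check happens before the call, so the call that crosses the limit still completes and only the following one is blocked. A stricter guard would need a cost estimate before calling `next()`.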

Presets (separate packages):

Package	Preset	Description
`@agent-express/preset-support`	`supportBot()`	Production support bot with RAG, PII, escalation, tone

Writing Custom Middleware

A plain function passed to .use() becomes a turn hook:

agent.use(async (ctx, next) => {
  console.log(`Turn ${ctx.turnIndex}: ${ctx.input[0]?.content}`)
  await next()
  console.log(`Response: ${ctx.output}`)
})

To register multiple hooks at once, pass a Middleware object:

import type { Middleware } from "agent-express"

const analytics: Middleware = {
  name: "analytics",
  state: {
    "analytics:turns": { default: 0 },
    "analytics:cost": { default: 0, reducer: (prev, delta) => prev + delta },
  },
  turn: async (ctx, next) => {
    ctx.state["analytics:turns"] = ctx.turnIndex + 1
    await next()
  },
  model: async (ctx, next) => {
    const response = await next()
    // 0.000003 per input token (3 per million) is an illustrative rate; use your model's pricing
    ctx.state["analytics:cost"] = response.usage.inputTokens * 0.000003
    return response
  },
}

agent.use(analytics)
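
The `state` declaration above pairs each key with a default and an optional reducer. One plausible reading, and it is only an assumption here, is that a write to a key with a reducer is folded into the previous value, while a key without one is overwritten. The sketch models that reading with a hypothetical `createState` helper; it is not how agent-express is known to implement state.

```typescript
// Assumed semantics, not confirmed by the library docs:
// keys with a reducer accumulate writes, keys without one are overwritten.
interface StateField<T> {
  default: T
  reducer?: (prev: T, next: T) => T
}

function createState(fields: Record<string, StateField<number>>) {
  const values = new Map<string, number>()
  for (const [key, field] of Object.entries(fields)) values.set(key, field.default)

  return {
    get: (key: string): number | undefined => values.get(key),
    set(key: string, value: number): void {
      const field = fields[key]
      if (!field) throw new Error(`Undeclared state key: ${key}`)
      const prev = values.get(key) ?? field.default
      values.set(key, field.reducer ? field.reducer(prev, value) : value)
    },
  }
}

const state = createState({
  "analytics:turns": { default: 0 },
  "analytics:cost": { default: 0, reducer: (prev, delta) => prev + delta },
})

state.set("analytics:turns", 1)
state.set("analytics:turns", 2) // overwrites: no reducer declared
state.set("analytics:cost", 3)
state.set("analytics:cost", 4) // accumulates through the reducer
console.log(state.get("analytics:turns"), state.get("analytics:cost")) // 2 7
```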

The 5 hooks nest like the layers of an onion. In each hook, code before next() runs on the way in, and code after it runs on the way out:

agent → session → turn → model → [LLM call]
                       → tool  → [tool execution]
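
The in-and-out ordering can be demonstrated with a small, generic onion composer. This is written from scratch for illustration and is not agent-express's internals; `compose`, `layer`, and `TraceCtx` are hypothetical names, and only the model branch of the diagram is modeled.

```typescript
// A generic onion composer, written for illustration only.
type Next = () => Promise<void>
type Hook<C> = (ctx: C, next: Next) => Promise<void>

function compose<C>(hooks: Hook<C>[], core: (ctx: C) => Promise<void>): (ctx: C) => Promise<void> {
  return (ctx) => {
    // dispatch(i) runs hook i and hands it a next() that runs hook i + 1;
    // past the last hook, next() runs the innermost core function.
    const dispatch = (i: number): Promise<void> =>
      i === hooks.length ? core(ctx) : hooks[i](ctx, () => dispatch(i + 1))
    return dispatch(0)
  }
}

interface TraceCtx { trace: string[] }

// Each layer records itself on the way in (before next) and on the way out (after next).
const layer = (name: string): Hook<TraceCtx> => async (ctx, next) => {
  ctx.trace.push(`${name}:in`)
  await next()
  ctx.trace.push(`${name}:out`)
}

async function runOnion(): Promise<string[]> {
  const ctx: TraceCtx = { trace: [] }
  const run = compose(
    [layer("agent"), layer("session"), layer("turn"), layer("model")],
    async (c) => { c.trace.push("LLM call") },
  )
  await run(ctx)
  return ctx.trace
}

runOnion().then((trace) => console.log(trace.join(" > ")))
// agent:in > session:in > turn:in > model:in > LLM call > model:out > turn:out > session:out > agent:out
```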

See built-in middleware for real-world examples: guard.budget, observe.usage, model.retry, memory.compaction

Sessions and Streaming

Multi-turn conversations with session state:

await agent.init()
const session = agent.session()

const r1 = await session.run({ input: "My name is Alice" })
const r2 = await session.run({ input: "What's my name?" })
// r2.text → "Your name is Alice"

await agent.dispose()

Streaming typed events as they happen:

for await (const event of agent.run({ input: "Hello" })) {
  if (event.type === "model:chunk") process.stdout.write(event.payload.text)
  if (event.type === "tool:call") console.log(`Calling ${event.payload.name}...`)
}

The same Event objects flow through both the iterator and session.events, so the streaming view and the persistent log share a single source of truth.
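
One way to picture the shared source of truth is an async generator that appends each event to a log and then yields that same object. The sketch assumes a simplified event shape and a hypothetical `SketchSession` class; the real Session API is not reproduced here.

```typescript
// Hypothetical event shape and session, loosely modeled on the "model:chunk" events above.
interface ChunkEvent {
  type: "model:chunk"
  payload: { text: string }
}

class SketchSession {
  readonly events: ChunkEvent[] = []

  // Append each event to the log, then yield the very same object to the caller.
  async *run(chunks: string[]): AsyncGenerator<ChunkEvent> {
    for (const text of chunks) {
      const chunk: ChunkEvent = { type: "model:chunk", payload: { text } }
      this.events.push(chunk)
      yield chunk
    }
  }
}

async function streamDemo(): Promise<boolean> {
  const session = new SketchSession()
  const seen: ChunkEvent[] = []
  for await (const chunk of session.run(["Hel", "lo"])) seen.push(chunk)
  // Identity check: streamed objects and logged objects are the same references.
  return seen.length === session.events.length && seen.every((e, i) => e === session.events[i])
}
```

Because the log is written before the yield, a consumer that stops iterating early still leaves every emitted event in `events`.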

Persistence

Sessions persist to SQLite, Redis, or Postgres via adapter packages. If the process crashes mid-turn, the next session run after a restart resumes from the event log:

import { memory } from "agent-express"
import { sqliteStore } from "@agent-express/session-sqlite"

agent.use(memory.store(sqliteStore({ path: ".agent-express/sessions.db" })))

See docs/design/event-log.md for the full substrate design.
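
Resuming from an event log generally means folding the stored events back into in-memory state. The sketch below shows that idea with invented event types (`turn:start`, `message`, `turn:end`) and a hypothetical `recover` function; the actual log format and SessionStore contract are defined in the design doc, not here.

```typescript
// Hypothetical event types; the real format is described in docs/design/event-log.md.
type LogEvent =
  | { type: "turn:start"; turn: number }
  | { type: "message"; role: "user" | "assistant"; content: string }
  | { type: "turn:end"; turn: number }

interface Recovered {
  messages: { role: string; content: string }[]
  unfinishedTurn: number | null
}

// Fold the log from the beginning to rebuild state after a restart.
function recover(log: LogEvent[]): Recovered {
  const result: Recovered = { messages: [], unfinishedTurn: null }
  for (const entry of log) {
    switch (entry.type) {
      case "turn:start":
        result.unfinishedTurn = entry.turn
        break
      case "message":
        result.messages.push({ role: entry.role, content: entry.content })
        break
      case "turn:end":
        result.unfinishedTurn = null
        break
    }
  }
  return result
}

// A log cut off by a crash: turn 1 started but never ended.
const recovered = recover([
  { type: "turn:start", turn: 0 },
  { type: "message", role: "user", content: "My name is Alice" },
  { type: "message", role: "assistant", content: "Nice to meet you, Alice" },
  { type: "turn:end", turn: 0 },
  { type: "turn:start", turn: 1 },
  { type: "message", role: "user", content: "What's my name?" },
])
console.log(recovered.messages.length, recovered.unfinishedTurn) // 3 1
```

An unfinished turn is detected as a start event with no matching end event, which tells the caller which turn to re-run.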

Testing

Mock LLM calls in tests with agent-express/test:

import { TestModel, testAgent } from "agent-express/test"

const model = new TestModel([
  { text: "Hello! How can I help?" },
])

const result = await testAgent(agent, {
  model,
  input: "Hi",
})
expect(result.text).toContain("Hello")
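
The idea behind a scripted model is simple enough to sketch: return canned responses in order and remember the prompts for later assertions. `ScriptedModel` below is a hypothetical stand-in written for this example, not the library's TestModel, whose interface may differ.

```typescript
// Hypothetical stand-in for a scripted test model; not the library's TestModel.
interface ScriptedResponse { text: string }

class ScriptedModel {
  private index = 0
  private readonly responses: ScriptedResponse[]
  readonly prompts: string[] = []

  constructor(responses: ScriptedResponse[]) {
    this.responses = responses
  }

  // Return canned responses in order and record each prompt for later assertions.
  async generate(prompt: string): Promise<ScriptedResponse> {
    this.prompts.push(prompt)
    if (this.index >= this.responses.length) {
      throw new Error(`ScriptedModel exhausted after ${this.responses.length} responses`)
    }
    return this.responses[this.index++]
  }
}

async function scriptedDemo(): Promise<string> {
  const model = new ScriptedModel([{ text: "Hello! How can I help?" }])
  const reply = await model.generate("Hi")
  return `${model.prompts[0]} -> ${reply.text}`
}
```

Throwing when the script runs out makes an unexpected extra model call fail the test instead of passing silently.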

Record and replay real API calls:

import { RecordModel, ReplayModel } from "agent-express/test"

// Record once (hits real API)
const record = new RecordModel("anthropic/claude-sonnet-4-6")
await agent.run({ input: "test", model: record })
record.save("tests/cassettes/greeting.json")

// Replay forever (no API calls)
const replay = ReplayModel.load("tests/cassettes/greeting.json")
await agent.run({ input: "test", model: replay })
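
Record and replay can be understood as two wrappers around a model function: one captures each exchange, the other serves the captured responses back. The sketch assumes a cassette that is just a JSON array of prompt/response pairs; `record`, `replay`, and that format are hypothetical and not the RecordModel or ReplayModel implementation.

```typescript
// Hypothetical cassette format: an ordered list of prompt/response pairs serialized as JSON.
type Generate = (prompt: string) => Promise<string>
interface CassetteEntry { prompt: string; response: string }

// Wrap a real model function and capture every exchange.
function record(real: Generate): { generate: Generate; serialize: () => string } {
  const entries: CassetteEntry[] = []
  return {
    generate: async (prompt) => {
      const response = await real(prompt)
      entries.push({ prompt, response })
      return response
    },
    serialize: () => JSON.stringify(entries),
  }
}

// Serve recorded responses in order; fail loudly if the prompt drifts from the recording.
function replay(json: string): Generate {
  const entries: CassetteEntry[] = JSON.parse(json)
  let i = 0
  return async (prompt) => {
    const entry = entries[i++]
    if (!entry || entry.prompt !== prompt) {
      throw new Error(`No recorded response for prompt: ${prompt}`)
    }
    return entry.response
  }
}

async function cassetteDemo(): Promise<string> {
  let realCalls = 0
  const recorder = record(async (p) => { realCalls++; return `echo: ${p}` })
  await recorder.generate("test")
  // Replaying does not touch the real function, so realCalls stays at 1.
  const replayed = await replay(recorder.serialize())("test")
  return `${replayed} (real calls: ${realCalls})`
}
```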

Comparison

Feature	agent-express	Mastra	Vercel AI SDK	LangChain.js
Core concepts	3	15-20	5-8	30+
Extension model	Middleware `(ctx, next)`	Processors, Tools, Workflows	Hooks, Providers	Chains, Agents, Tools, Memory
Built-in testing	Yes	No	No	No
Cost control	`guard.budget()`	Manual	Manual	Manual
TypeScript	Strict, ESM only	TypeScript	TypeScript	TypeScript

Packages

Core:

agent-express       -- Agent, Session, middleware namespaces, errors
agent-express/http  -- createHandler() SSE adapter
agent-express/test  -- TestModel, FunctionModel, RecordModel, ReplayModel, testAgent()
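
For readers unfamiliar with SSE, each message on the wire is a small text frame. The sketch below formats one frame according to the Server-Sent Events format; `sseFrame` is a hypothetical helper, and how createHandler() actually names and serializes its events is an assumption here.

```typescript
// Format one Server-Sent Events frame: an "event:" line, a "data:" line,
// then a blank line that terminates the frame.
// JSON.stringify output contains no raw newlines, so a single data line suffices.
function sseFrame(eventName: string, data: unknown): string {
  const body = JSON.stringify(data)
  return `event: ${eventName}\ndata: ${body}\n\n`
}

// Hypothetical usage with an event name borrowed from the streaming example.
const frame = sseFrame("model:chunk", { text: "Hello" })
console.log(JSON.stringify(frame))
```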

Presets:

Package	Description
`@agent-express/preset-support`	Production support bot (RAG, PII, tone, escalation, rate limiting)

Adapter packages:

Package	Description
`@agent-express/embed-openai`	OpenAI text-embedding-3-small
`@agent-express/embed-cohere`	Cohere embed-v3
`@agent-express/search-brave`	Brave Search API
`@agent-express/search-tavily`	Tavily Search API
`@agent-express/search-exa`	Exa semantic search
`@agent-express/search-llamaindex`	LlamaIndex.TS file ingestion + cosine similarity
`@agent-express/search-qdrant`	Qdrant vector DB retriever
`@agent-express/search-pinecone`	Pinecone vector DB retriever
`@agent-express/search-pgvector`	PostgreSQL pgvector retriever
`@agent-express/session-sqlite`	SQLite session store (better-sqlite3)
`@agent-express/session-redis`	Redis session store (ioredis)
`@agent-express/session-postgres`	PostgreSQL session store (pg)

CLI

npx create-agent-express                             # interactive wizard
npx create-agent-express --template support-bot      # template scaffold
npx agent-express dev [entry]                       # terminal chat + hot reload
npx agent-express test                              # agent test runner
npx agent-express test --ci                         # JUnit XML output for CI

Architecture & Design Docs

For new contributors, the recommended reading order:

Concept — what we're building, the agent session primitive, why middleware beats graphs, 7-framework comparison
Middleware Interface — the single Middleware interface and (ctx, next) onion pattern
Agent Loop — 5-level lifecycle (agent / session / turn / model / tool) and the model→tool→model loop
Event Log — v0.4 substrate: typed events, durability, SessionStore contract

Reference docs for individual subsystems:

Providers — "provider/model" resolution and security guards
Adapters — three adapter families (session / embed / search) and how to write a custom adapter
Observability — six observability middlewares and OpenTelemetry integration
Testing — testAgent, FunctionModel, recorder cassettes, real-request guard

See docs/design/ for the full index and docs/research/ for reverse-engineering notes on adjacent agent frameworks.