README.md
June 29, 2026
A recursive, self-improving harness designed to help your agents (and future iterations of those agents) succeed on any task.
autocontext is a harness for agent improvement. Give it a goal and it runs the task against an evaluation, keeps the useful lessons, discards dead ends, and leaves traces, reports, playbooks, datasets, and optional local-model training artifacts for the next run.
Docs: autocontext.ai/docs · quickstart · CLI reference · changelog
Install
| Surface | Command |
|---|---|
| Python CLI | uv tool install autocontext==0.10.0 |
| Python library/dev | uv pip install autocontext==0.10.0 |
| TypeScript/Node CLI | bun add -g autoctx@0.10.0 |
| Pi extension | pi install npm:pi-autocontext@0.8.0 |
The PyPI package is autocontext; the CLI is autoctx. The npm package is autoctx (not the unrelated autocontext npm package). Provider variables live in .env.example.
30-Second Run
Pi is the lowest-friction provider because it uses your local agent auth:
AUTOCONTEXT_AGENT_PROVIDER=pi \
AUTOCONTEXT_PI_COMMAND=pi \
autoctx solve "improve customer-support replies for billing disputes" --iterations 3
Set AUTOCONTEXT_AGENT_PROVIDER to anthropic, openai-compatible, claude-cli, codex, pi-rpc, or another provider when you need that runtime. See agent integration for the full matrix.
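If you want to launch the same run from a script instead of a shell, a minimal Python sketch could look like the following. It only reuses the environment variables and the solve command shown above. The helper names build_solve_invocation and run_solve are hypothetical and not part of autocontext, and the sketch assumes that providers other than pi get their remaining variables from your own environment (see .env.example).

```python
import os
import shutil
import subprocess


def build_solve_invocation(goal, iterations=3, provider="pi", base_env=None):
    """Return (argv, env) for an `autoctx solve` call; nothing is executed here.

    Hypothetical helper: it only mirrors the shell example in this README.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["AUTOCONTEXT_AGENT_PROVIDER"] = provider
    if provider == "pi":
        # Same variable as the 30-second run; other providers need their own
        # settings, which this sketch does not try to guess.
        env["AUTOCONTEXT_PI_COMMAND"] = "pi"
    argv = ["autoctx", "solve", goal, "--iterations", str(iterations)]
    return argv, env


def run_solve(goal, iterations=3, provider="pi"):
    """Run the command if the autoctx CLI is on PATH and return its exit code."""
    if shutil.which("autoctx") is None:
        raise FileNotFoundError("autoctx CLI not found on PATH")
    argv, env = build_solve_invocation(goal, iterations, provider)
    return subprocess.run(argv, env=env, check=False).returncode
```

Keeping the argument and environment construction separate from the subprocess call makes it possible to log or review the exact command before anything runs.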
Agent Entry Points
- Pi: install pi-autocontext, then ask Pi to solve, judge, improve, list, or inspect runs through the packaged skill.
- MCP clients: run autoctx mcp-serve or bunx autoctx mcp-serve and expose the tools to Claude Code, Cursor, or another MCP client.
- Hermes: export the CLI-first skill with uv run autoctx hermes export-skill --with-references --json.
Full setup: autocontext/docs/agent-integration.md.
What A Run Leaves Behind
runs/<run_id>/
├── trace.jsonl
├── generations/<n>/{strategy.json,analysis.md,score.json}
├── report.md
└── artifacts/
knowledge/<scenario>/
├── playbook.md
├── hints.md
└── tools/
Everything is filesystem-first: inspect it, diff it, replay it, export it, or feed it into training.
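Because a run is just files, it can be inspected with a few lines of standard-library Python. The sketch below assumes only the directory layout shown above. It makes no assumption about the fields inside score.json or trace.jsonl and returns the parsed JSON as-is. The function names are hypothetical, not part of the autocontext library.

```python
import json
from pathlib import Path


def _generation_key(path):
    """Sort numeric generation directories numerically, anything else after them."""
    name = path.name
    return (0, int(name), name) if name.isdigit() else (1, 0, name)


def load_generation_scores(run_dir):
    """Return {generation_name: parsed score.json} for one runs/<run_id>/ directory."""
    scores = {}
    gen_root = Path(run_dir) / "generations"
    if not gen_root.is_dir():
        return scores
    for gen_dir in sorted(gen_root.iterdir(), key=_generation_key):
        score_file = gen_dir / "score.json"
        if score_file.is_file():
            scores[gen_dir.name] = json.loads(score_file.read_text(encoding="utf-8"))
    return scores


def read_trace(run_dir):
    """Parse trace.jsonl into a list of JSON values, skipping blank lines."""
    trace_file = Path(run_dir) / "trace.jsonl"
    if not trace_file.is_file():
        return []
    with trace_file.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]
```

For example, load_generation_scores("runs/<run_id>") with a real run ID substituted gives one entry per generation that has a score.json, in generation order, which is a convenient starting point for diffing two runs.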
Core Surfaces
| Surface | Command | Use it for |
|---|---|---|
| solve | autoctx solve "..." --iterations 3 | Start from a plain-language goal |
| run | autoctx run <scenario> --iterations 3 | Improve a saved scenario |
| simulate | autoctx simulate -d "..." | Model/replay/compare system behavior |
| investigate | autoctx investigate -d "..." | Evidence-driven diagnosis |
| mission | autoctx mission create --name "..." --goal "..." | Verifier-driven multi-step goals |
| train | uv run autoctx train --scenario <name> --data <jsonl> | Distill stable behavior into a cheaper runtime (Python) |
| mcp-serve | autoctx mcp-serve | Give an agent the autocontext tool surface |
Python owns the full control-plane package; TypeScript owns several operator-facing surfaces, the TUI, and Node runtime adapters. Start with autocontext/README.md or ts/README.md.
What's New in 0.10.0
- Scaled training plans add default-off CUDA/TRL profiles for 7B QLoRA RLVR and sharded 32B/72B distillation across Python and TypeScript.
- Training scale metadata records device count, sharding, memory budgets, quantization, parameter count, and deployment VRAM for registry gating.
- TypeScript CLI parity makes --scale-profile preserve profile backend/base/mode and documents every scale-related train flag.
Scenario Families
The shipped families cover games, agent tasks, simulations, artifact editing, investigations, workflows, negotiation, schema evolution, tool fragility, operator loops, and coordination. Python and TypeScript share the family vocabulary; see docs/scenario-parity-matrix.md for parity details.
Package Guides
| Need | Go here |
|---|---|
| Python CLI/library, MCP, HTTP, training | autocontext/README.md |
| Node CLI, TUI, missions, Fetch/agent adapters | ts/README.md |
| Pi package | pi/README.md |
| Copy-paste examples | examples/README.md |
| Concepts and docs index | docs/README.md |
| Contributor setup | CONTRIBUTING.md |
| Repo guide for agents | AGENTS.md |
Acknowledgments
Thanks to George for generously donating the autocontext name on PyPI.