README.md

June 29, 2026

a recursive self-improving harness designed to help your agents (and future iterations of those agents) succeed on any task

autocontext is a harness for agent improvement. Give it a goal and it runs the task against an evaluation, keeps the useful lessons, and discards dead ends. Each run leaves traces, reports, playbooks, datasets, and optional local-model training artifacts for the next run.

Docs: autocontext.ai/docs · quickstart · CLI reference · changelog

Install

| Surface | Command |
| --- | --- |
| Python CLI | `uv tool install autocontext==0.10.0` |
| Python library/dev | `uv pip install autocontext==0.10.0` |
| TypeScript/Node CLI | `bun add -g autoctx@0.10.0` |
| Pi extension | `pi install npm:pi-autocontext@0.8.0` |

The PyPI package is autocontext; the CLI is autoctx. The npm package is autoctx (not the unrelated autocontext npm package). Provider variables live in .env.example.
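Because the package name and the CLI name differ, it can help to check both after installing. The following is a minimal sketch using only the Python standard library; `check_install` is a hypothetical helper, not part of autocontext. Note that a CLI installed with `uv tool install` lives in its own isolated environment, so the package version may come back as `None` even when `autoctx` is on your PATH.

```python
import shutil
from importlib import metadata


def check_install(package="autocontext", cli="autoctx"):
    """Report the installed package version and the CLI location, if any.

    Hypothetical helper for illustration; the defaults are the PyPI package
    name and CLI name stated in this README.
    """
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        version = None  # package is not installed in this interpreter's environment
    # shutil.which returns None when the executable is not on PATH
    return {"package_version": version, "cli_path": shutil.which(cli)}
```

Calling `check_install()` returns a dict with the keys `package_version` and `cli_path`; a `None` value means that piece was not found from the current interpreter.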

30-Second Run

Pi is the lowest-friction provider because it uses your local agent auth:

AUTOCONTEXT_AGENT_PROVIDER=pi \
AUTOCONTEXT_PI_COMMAND=pi \
autoctx solve "improve customer-support replies for billing disputes" --iterations 3

Set `AUTOCONTEXT_AGENT_PROVIDER` to `anthropic`, `openai-compatible`, `claude-cli`, `codex`, `pi-rpc`, or another provider when you need that runtime. See agent integration for the full matrix.
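If you would rather launch the same run from a Python script, the shell command above can be mirrored with `subprocess`. This is a sketch under assumptions: `build_solve_invocation` and `run_solve` are hypothetical helpers, not an autocontext API, and they use only the environment variables, subcommand, and `--iterations` flag shown above.

```python
import os
import subprocess


def build_solve_invocation(goal, iterations=3, provider="pi", pi_command="pi"):
    """Return (argv, env) mirroring the shell command in this section."""
    env = dict(os.environ)  # copy so the caller's environment is not mutated
    env["AUTOCONTEXT_AGENT_PROVIDER"] = provider
    if provider == "pi":
        # Only the Pi provider needs the Pi command, per the example above
        env["AUTOCONTEXT_PI_COMMAND"] = pi_command
    argv = ["autoctx", "solve", goal, "--iterations", str(iterations)]
    return argv, env


def run_solve(goal, **kwargs):
    """Run the CLI and return its exit code (requires autoctx on PATH)."""
    argv, env = build_solve_invocation(goal, **kwargs)
    return subprocess.run(argv, env=env, check=False).returncode
```

Keeping the command construction separate from the call that executes it makes the invocation easy to log or inspect before anything runs.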

Agent Entry Points

- Pi: install `pi-autocontext`, then ask Pi to solve, judge, improve, list, or inspect runs through the packaged skill.
- MCP clients: run `autoctx mcp-serve` or `bunx autoctx mcp-serve` and expose the tools to Claude Code, Cursor, or another MCP client.
- Hermes: export the CLI-first skill with `uv run autoctx hermes export-skill --with-references --json`.

Full setup: autocontext/docs/agent-integration.md.

What A Run Leaves Behind

runs/<run_id>/
├── trace.jsonl
├── generations/<n>/{strategy.json,analysis.md,score.json}
├── report.md
└── artifacts/

knowledge/<scenario>/
├── playbook.md
├── hints.md
└── tools/

Everything is filesystem-first: inspect it, diff it, replay it, export it, or feed it into training.
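Since the outputs are plain files, a few lines of standard-library Python are enough to inspect a run. The sketch below relies on assumptions: `read_trace` and `read_scores` are hypothetical helpers, generation directories are assumed to be named with integers, and the contents of `score.json` and `trace.jsonl` are returned as parsed JSON without assuming any particular schema.

```python
import json
from pathlib import Path


def read_trace(run_dir):
    """Parse trace.jsonl into a list of JSON values, skipping blank lines."""
    trace = Path(run_dir) / "trace.jsonl"
    if not trace.exists():
        return []
    with trace.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


def read_scores(run_dir):
    """Map generation number -> parsed score.json, in numeric order."""
    scores = {}
    gen_root = Path(run_dir) / "generations"
    if not gen_root.is_dir():
        return scores
    # Assumption: <n> in generations/<n>/ is an integer directory name
    numbered = [p for p in gen_root.iterdir() if p.is_dir() and p.name.isdigit()]
    for gen_dir in sorted(numbered, key=lambda p: int(p.name)):
        score_file = gen_dir / "score.json"
        if score_file.exists():
            scores[int(gen_dir.name)] = json.loads(score_file.read_text(encoding="utf-8"))
    return scores
```

For example, `read_scores("runs/<run_id>")` gives one entry per generation that has a `score.json`, which is a convenient starting point for diffing two runs.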

Core Surfaces

| Surface | Command | Use it for |
| --- | --- | --- |
| `solve` | `autoctx solve "..." --iterations 3` | Start from a plain-language goal |
| `run` | `autoctx run <scenario> --iterations 3` | Improve a saved scenario |
| `simulate` | `autoctx simulate -d "..."` | Model/replay/compare system behavior |
| `investigate` | `autoctx investigate -d "..."` | Evidence-driven diagnosis |
| `mission` | `autoctx mission create --name "..." --goal "..."` | Verifier-driven multi-step goals |
| `train` | `uv run autoctx train --scenario <name> --data <jsonl>` | Distill stable behavior into a cheaper runtime (Python) |
| `mcp-serve` | `autoctx mcp-serve` | Give an agent the autocontext tool surface |

Python owns the full control-plane package; TypeScript owns several operator-facing surfaces, the TUI, and Node runtime adapters. Start with autocontext/README.md or ts/README.md.

What's New in 0.10.0

- Scaled training plans add default-off CUDA/TRL profiles for 7B QLoRA RLVR and sharded 32B/72B distillation across Python and TypeScript.
- Training scale metadata records device count, sharding, memory budgets, quantization, parameter count, and deployment VRAM for registry gating.
- TypeScript CLI parity makes `--scale-profile` preserve profile backend/base/mode and documents every scale-related train flag.

Scenario Families

The shipped families cover games, agent tasks, simulations, artifact editing, investigations, workflows, negotiation, schema evolution, tool fragility, operator loops, and coordination. Python and TypeScript share the family vocabulary; see docs/scenario-parity-matrix.md for parity details.

Package Guides

| Need | Go here |
| --- | --- |
| Python CLI/library, MCP, HTTP, training | autocontext/README.md |
| Node CLI, TUI, missions, Fetch/agent adapters | ts/README.md |
| Pi package | pi/README.md |
| Copy-paste examples | examples/README.md |
| Concepts and docs index | docs/README.md |
| Contributor setup | CONTRIBUTING.md |
| Repo guide for agents | AGENTS.md |

Acknowledgments

Thanks to George for generously donating the autocontext name on PyPI.