AI Memory Protocol
February 23, 2026
Versioned, graph-based persistent memory for AI coding agents — powered by Sphinx-Needs.
AI agents lose context between sessions. This protocol gives them a structured way to remember, recall, and evolve knowledge — with full Git history, typed entries, graph links, and machine-readable output.
Features
- Typed memories — observations, decisions, facts, preferences, risks, goals, open questions
- Graph links — relates, supports, depends, supersedes, contradicts, example_of
- Tag-based discovery — topic:api, repo:backend, tier:core
- Context-optimized output — brief / compact / context / JSON formats with body toggling
- Stale detection — auto-expire, review reminders, staleness checks
- Auto-scaling — RST files split at 50 entries, transparent to queries
- Git-native — every memory is an RST directive, fully diffable and versioned
- MCP server — expose memory as tools for Claude Desktop, VS Code Copilot, and other MCP clients
- Build-as-guardian — needs_warnings quality gates enforce tagging, linking, and body quality at build time
- CLI-first — 12 subcommands for full lifecycle management
Installation
git clone https://github.com/bburda/ai_memory_protocol.git
pipx install -e ai_memory_protocol/
# With MCP server support
pipx install -e 'ai_memory_protocol/[mcp]'
This installs the memory CLI command (and optionally memory-mcp-stdio) globally on PATH.
Quick Start
# 1. Create a memory workspace
memory init .memories --name "My Project" --install
# 2. Add your first memory
memory add fact "API runs on port 8080" \
--tags "topic:api,repo:backend" \
--confidence high \
--body "Gateway listens on 0.0.0.0:8080 by default" \
--rebuild
# 3. Search
memory recall api port
memory recall --tag topic:api --format brief
# 4. Get full details
memory get FACT_api_runs_on_port_8080
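The ID in step 4 looks like it is derived from the memory type and title. The sketch below shows one way such an ID could be produced. The slug rule is an assumption inferred from this single example, not the tool's documented behavior, and `guess_memory_id` is a hypothetical helper name.

```python
import re

# Prefixes copied from the Memory Types table in this README.
TYPE_PREFIXES = {
    "mem": "MEM", "dec": "DEC", "fact": "FACT", "pref": "PREF",
    "risk": "RISK", "goal": "GOAL", "q": "Q",
}


def guess_memory_id(mem_type: str, title: str) -> str:
    """Guess an ID as PREFIX_ plus the lowercased title with runs of
    non-alphanumeric characters collapsed to a single underscore.

    Assumption: inferred from the Quick Start example only.
    """
    slug = re.sub(r"[^a-z0-9]+", "_", title.lower()).strip("_")
    return f"{TYPE_PREFIXES[mem_type]}_{slug}"
```

If the guessed ID does not resolve, `memory recall` with a few title keywords is the reliable way to find the real one.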
How It Works
RST files (memory/*.rst) ← Human + AI editable, Git-tracked
│
▼ memory rebuild (sphinx-build)
needs.json (_build/html/needs.json) ← Machine-readable index
│
▼ memory recall / get / list
Formatted output ← Optimized for LLM context windows
Memories are stored as Sphinx-Needs directives in RST files. A memory rebuild command runs Sphinx to produce needs.json — the single query layer for all search operations. This means memories are simultaneously human-readable documentation and machine-queryable data.
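Because needs.json is plain JSON, it can also be read without the CLI. The following is a minimal sketch, assuming the usual Sphinx-Needs export layout (`current_version` plus a `versions` mapping that holds `needs`). The helper names are hypothetical and are not part of this package.

```python
import json
import re
from pathlib import Path


def read_needs_file(path) -> dict:
    """Load a needs.json file from disk (e.g. _build/html/needs.json)."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def load_needs(needs_json: dict) -> dict:
    """Return the {id: need} mapping for the current version."""
    version = needs_json.get("current_version", "")
    return needs_json["versions"][version]["needs"]


def filter_by_tag(needs: dict, tag: str) -> list:
    """Return the sorted IDs of all needs carrying the given tag.

    Assumption: 'tags' is normally a list of strings; a comma- or
    semicolon-separated string is normalised as a fallback.
    """
    hits = []
    for need_id, need in needs.items():
        tags = need.get("tags", [])
        if isinstance(tags, str):
            tags = [t.strip() for t in re.split(r"[;,]", tags) if t.strip()]
        if tag in tags:
            hits.append(need_id)
    return sorted(hits)
```

For everyday use, `memory recall --tag topic:api` remains the intended interface. Direct access like this is mainly useful for custom tooling or debugging the index.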
CLI Reference
memory init <dir> # Create a new workspace
memory add <type> "<title>" [options] # Record a memory
memory recall [query] [--tag ...] [--format brief|compact|context|json]
memory get <ID> # Full details of one memory
memory related <ID> [--hops N] # Graph walk from a memory
memory list [--type TYPE] [--status S] # Browse all memories
memory update <ID> [--confidence ...] [--add-tags ...] [--body ...] [--title ...]
memory deprecate <ID> [--by NEW_ID] # Mark as deprecated
memory tags [--prefix PREFIX] # Discover tags in use
memory stale # Find expired/overdue memories
memory review # Show memories needing review
memory rebuild # Rebuild needs.json
Key flags for recall:
- --format brief — ultra-compact, minimal tokens
- --body — include body text (off by default)
- --sort newest|oldest|confidence|updated
- --limit N — cap results
- --expand 0 — disable graph expansion
- --stale — only expired/review-overdue
MCP Server
Expose memory tools to LLM clients via the Model Context Protocol.
Setup
Install with MCP extras:
pipx install -e 'ai_memory_protocol/[mcp]'
Claude Code
claude mcp add --transport stdio --env MEMORY_DIR=/path/to/.memories memory -- memory-mcp-stdio
Or add to .mcp.json in your project root (project scope):
{
"mcpServers": {
"memory": {
"type": "stdio",
"command": "memory-mcp-stdio",
"env": {
"MEMORY_DIR": "/path/to/.memories"
}
}
}
}
VS Code (GitHub Copilot)
Add to .vscode/mcp.json:
{
"servers": {
"memory": {
"command": "memory-mcp-stdio",
"env": {
"MEMORY_DIR": "${workspaceFolder}/.memories"
}
}
}
}
Available MCP Tools
| Tool | Description |
|---|---|
| memory_recall | Search memories by text/tags with formatting options |
| memory_get | Get full details of a specific memory |
| memory_add | Record a new memory with tags and metadata |
| memory_update | Update content or metadata (title, body, status, confidence, tags, etc.) |
| memory_deprecate | Mark a memory as deprecated |
| memory_tags | List all tags with counts |
| memory_stale | Find expired/overdue memories |
| memory_rebuild | Rebuild needs.json index |
Memory Types
| Type | Prefix | Use Case |
|---|---|---|
| mem | MEM_ | Observation, note, or finding |
| dec | DEC_ | Design or architectural decision |
| fact | FACT_ | Verified, stable knowledge |
| pref | PREF_ | Coding style or convention |
| risk | RISK_ | Uncertainty or assumption |
| goal | GOAL_ | Objective or target |
| q | Q_ | Open question needing resolution |
Graph Links
| Link | Meaning |
|---|---|
| relates | General association |
| supports | Evidence or justification |
| depends | Hard dependency |
| supersedes | Replaces older memory |
| contradicts | Conflict or tension |
| example_of | Concrete instance of concept |
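To illustrate what a hop-limited graph walk such as `memory related <ID> --hops N` might do, here is a minimal breadth-first sketch. It assumes each memory stores its outgoing links as a list of IDs under the link name and that links are followed in both directions. Both points are assumptions, and `related` is a hypothetical function, not the package's API.

```python
from collections import deque

# Link names from the Graph Links table.
LINK_TYPES = ("relates", "supports", "depends",
              "supersedes", "contradicts", "example_of")


def related(needs: dict, start_id: str, hops: int = 1) -> dict:
    """Breadth-first walk returning {id: hop distance} within `hops` links."""
    # Build an undirected adjacency map from the outgoing link lists.
    neighbours = {need_id: set() for need_id in needs}
    for need_id, need in needs.items():
        for link in LINK_TYPES:
            for target in need.get(link, []):
                if target in needs:
                    neighbours[need_id].add(target)
                    neighbours[target].add(need_id)

    distance = {start_id: 0}
    queue = deque([start_id])
    while queue:
        current = queue.popleft()
        if distance[current] == hops:
            continue  # do not expand beyond the hop limit
        for nxt in sorted(neighbours.get(current, ())):
            if nxt not in distance:
                distance[nxt] = distance[current] + 1
                queue.append(nxt)
    del distance[start_id]  # the start node is not its own neighbour
    return distance
```

Keeping the hop limit small matters for context budgets: each extra hop can pull in many loosely related memories.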
Metadata
| Field | Values | Purpose |
|---|---|---|
| confidence | low / medium / high | Trust level |
| scope | global, repo:X, product:X | Applicability |
| tags | prefix:value format | Categorization |
| source | URL, commit, description | Provenance |
| review_after | ISO date | Staleness trigger |
| expires_at | ISO date | Auto-expire date |
| created_at | ISO date | Capture timestamp |
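The two date fields drive staleness detection. A rough sketch of how such a check could work is shown below. It assumes the fields hold ISO dates (YYYY-MM-DD, optionally followed by a time) or are empty. Whether the boundary day itself counts as stale is a guess, and `staleness` is a hypothetical helper.

```python
from datetime import date


def staleness(need: dict, today: date) -> str:
    """Classify a memory as 'expired', 'review_due', or 'fresh'."""
    expires_at = need.get("expires_at") or ""
    review_after = need.get("review_after") or ""
    # Only the first 10 characters (the date part) are parsed.
    if expires_at and date.fromisoformat(expires_at[:10]) <= today:
        return "expired"
    if review_after and date.fromisoformat(review_after[:10]) <= today:
        return "review_due"
    return "fresh"
```

Expiry is checked before review so that a memory past both dates is reported once, under the stronger condition.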
Tagging Conventions
Tags use prefix:value format for consistent discovery:
- topic: — Subject area (topic:gateway, topic:auth)
- repo: — Repository (repo:backend, repo:web-ui)
- domain: — Knowledge domain (domain:robotics, domain:web)
- tier: — Importance level (tier:core, tier:detail)
- intent: — Purpose (intent:decision, intent:coding-style)
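The prefix:value convention makes tags easy to process mechanically. As a small illustration, the sketch below groups a tag list by prefix. `group_tags` is a hypothetical helper, and placing tags without a colon under an empty prefix is an arbitrary choice made for this example.

```python
from collections import defaultdict


def group_tags(tags: list) -> dict:
    """Group 'prefix:value' tags by prefix; tags without ':' go under ''."""
    grouped = defaultdict(list)
    for tag in tags:
        prefix, sep, value = tag.partition(":")
        if sep:
            grouped[prefix].append(value)
        else:
            grouped[""].append(tag)
    return {prefix: sorted(values) for prefix, values in grouped.items()}
```

For example, `group_tags(["topic:gateway", "repo:backend", "topic:auth"])` yields the topics `auth` and `gateway` under `topic` and `backend` under `repo`.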
AI Agent Integration
Recommended Workflow
1. READ — Peek then Drill (two-phase recall)
Always use a two-phase approach. Never go straight to body text on broad queries.
Phase A — Peek (scan titles, zero body text):
memory recall --tag topic:gateway --format brief --expand 0
Returns [ID] Title (confidence) one-liners. Minimal tokens. Do this FIRST.
Phase B — Drill (read full body of specific memories):
memory get DEC_handler_context_pattern
Only after peeking — pick the 2-3 most relevant IDs and get them individually.
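To make the token saving concrete, here is a sketch of what a brief formatter could look like. It follows the `[ID] Title (confidence)` shape described above. The field names `id`, `title`, and `confidence` and the `unknown` fallback are assumptions, and `format_brief` is a hypothetical function rather than the package's own formatter.

```python
from typing import Optional


def format_brief(needs: list, limit: Optional[int] = None) -> str:
    """Render '[ID] Title (confidence)' one-liners, never the body text."""
    lines = []
    for need in needs[:limit]:  # a limit of None keeps every entry
        confidence = need.get("confidence", "unknown")
        lines.append(f"[{need['id']}] {need['title']} ({confidence})")
    return "\n".join(lines)
```

Each memory costs one short line in the peek phase regardless of how long its body is, which is why drilling with `memory get` is deferred to the few IDs that matter.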
When to recall — recall is NOT just a session-start ritual. Recall at each of these moments:
| Trigger | What to recall |
|---|---|
| Session start | recall --format brief --limit 20 --sort newest |
| New task or topic | recall --tag topic:<X> --format brief |
| Entering unfamiliar code | recall --tag repo:<X> --type fact --format brief |
| Before a design decision | recall --tag topic:<X> --type dec |
| Encountering an error or failure | recall <error message keywords> — FIRST reaction before debugging; check if this problem was already solved |
| Stuck after initial attempts | recall --tag topic:<X> --type mem,fact — broaden search to related areas and past solutions |
| Before implementing a pattern | recall --tag intent:coding-style --type pref |
2. WRITE — Record at specific trigger points
Recording memories is NOT optional. Write at these concrete moments:
| Trigger | Type | Example |
|---|---|---|
| Chose approach A over B | dec | "Use tl::expected over exceptions" |
| Fixed a non-obvious bug | mem | "EntityCache race condition fix" |
| Discovered undocumented API | fact | "Routes match in registration order" |
| User stated a preference | pref | "Prefer Zustand over Redux" |
| Identified a risk | risk | "JWT secret hardcoded in tests" |
| Question remains unanswered | q | "Should synthetic components expose operations?" |
End-of-task writes: summarize architecture learned (fact), record conventions (pref), note anything a future agent needs (mem), capture unfinished goals (goal).
Write quality rules:
- --tags is mandatory — without tags, the memory is unfindable
- --body must be self-contained with file paths and concrete details
- Use --rebuild flag to make new memories immediately searchable
3. SUPERSEDE, don't edit
When knowledge changes, add a new entry with --supersedes OLD_ID and deprecate the old one.
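Superseding instead of editing leaves a chain from old entries to their replacements. The sketch below resolves any ID to the newest entry in its chain. It assumes each memory exposes a `supersedes_back` list of IDs that replace it (the field name appears in the Build-as-Guardian constraints). `latest_version` is a hypothetical helper.

```python
def latest_version(needs: dict, need_id: str) -> str:
    """Follow supersedes_back links until nothing supersedes the memory."""
    seen = {need_id}  # guards against accidental cycles
    current = need_id
    while True:
        successors = needs.get(current, {}).get("supersedes_back", [])
        nxt = next((s for s in successors if s not in seen), None)
        if nxt is None:
            return current
        seen.add(nxt)
        current = nxt
```

An agent that finds a deprecated memory during recall can use this kind of lookup to jump straight to the current decision.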
4. CHECK STALENESS periodically
Run memory stale at the start of long sessions to keep the graph accurate.
Context Window Optimization
- recall omits body by default — this is intentional, not a limitation
- Peek with --format brief → drill with get <ID> — this is the core pattern
- Use --limit 10 and --expand 0 when exploring broad topics
- Use --tag filters to narrow results instead of free-text
- Use memory tags to discover available tag prefixes before filtering
Project Structure
ai_memory_protocol/
├── pyproject.toml # Package definition, CLI + MCP entry points
├── README.md
├── LICENSE # Apache 2.0
├── CONTRIBUTING.md
├── .pre-commit-config.yaml
├── .github/workflows/ci.yml
└── src/
└── ai_memory_protocol/
├── __init__.py
├── cli.py # CLI (argparse, 12 subcommands)
├── mcp_server.py # MCP server (8 tools, stdio transport)
├── config.py # Type definitions, constants
├── engine.py # Workspace detection, search, graph walk
├── formatter.py # Output formatting (brief/compact/context/json)
├── rst.py # RST generation, editing, file splitting
└── scaffold.py # Workspace scaffolding (init command)
Memory data lives in a separate workspace (e.g., .memories/), created with memory init.
Build-as-Guardian
The Sphinx build acts as a quality gate for the memory graph. needs_warnings in conf.py define constraints that fire during memory rebuild:
needs_warnings = {
"missing_topic_tag": "type in ['mem','dec','fact',...] and not any(t.startswith('topic:') for t in tags)",
"empty_body": "description == '' or description == 'TODO: Add description.'",
"deprecated_without_supersede": "status == 'deprecated' and len(supersedes_back) == 0",
}
With sphinx-build -W (warnings as errors), the build fails if any memory violates these constraints. This means:
- Every memory must have at least one topic: tag
- No empty placeholders survive to the index
- Deprecated memories must be superseded by a replacement
Agents learn to self-correct: if rebuild fails, they read the warning, fix the offending memory, and retry.
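The same three constraints can be expressed in plain Python, which can be handy for checking a memory before triggering a rebuild. This is only a sketch. The field names are taken from the filter strings above, the type restriction is left out because its list is elided in the original, and `check_memory` is a hypothetical helper.

```python
def check_memory(need: dict) -> list:
    """Return the names of the constraints a memory violates.

    An empty list means the memory passes all three checks.
    """
    violations = []
    if not any(t.startswith("topic:") for t in need.get("tags", [])):
        violations.append("missing_topic_tag")
    if need.get("description", "") in ("", "TODO: Add description."):
        violations.append("empty_body")
    if (need.get("status") == "deprecated"
            and len(need.get("supersedes_back", [])) == 0):
        violations.append("deprecated_without_supersede")
    return violations
```

The Sphinx build remains the authoritative gate. A pre-check like this only shortens the fix-and-retry loop.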
Human Role
Humans are observers and editors, not gatekeepers:
- Dashboards — memory/dashboards.rst contains needtable, needlist, and needflow directives rendering the live state of the memory graph as HTML
- RST editing — memories are plain RST, editable in any text editor or IDE with full diff/blame in Git
- Override — humans can update status, confidence, or tags on any memory via CLI or direct RST edit
- Review — memory review surfaces memories whose review_after date has passed, prompting human validation
The protocol is designed so that agents maintain knowledge autonomously while humans retain full visibility and override capability.
Contributing
See CONTRIBUTING.md for guidelines on how to contribute.