AI Memory Protocol

February 23, 2026

Versioned, graph-based persistent memory for AI coding agents — powered by Sphinx-Needs.

AI agents lose context between sessions. This protocol gives them a structured way to remember, recall, and evolve knowledge — with full Git history, typed entries, graph links, and machine-readable output.

Features

  • Typed memories — observations, decisions, facts, preferences, risks, goals, open questions
  • Graph links — relates, supports, depends, supersedes, contradicts, example_of
  • Tag-based discovery: topic:api, repo:backend, tier:core
  • Context-optimized output — brief / compact / context / JSON formats with body toggling
  • Stale detection — auto-expire, review reminders, staleness checks
  • Auto-scaling — RST files split at 50 entries, transparent to queries
  • Git-native — every memory is an RST directive, fully diffable and versioned
  • MCP server — expose memory as tools for Claude Desktop, VS Code Copilot, and other MCP clients
  • Build-as-guardian: needs_warnings quality gates enforce tagging, linking, and body quality at build time
  • CLI-first — 12 subcommands for full lifecycle management

Installation

git clone https://github.com/bburda/ai_memory_protocol.git
pipx install -e ai_memory_protocol/

# With MCP server support
pipx install -e 'ai_memory_protocol/[mcp]'

This installs the memory CLI command (and optionally memory-mcp-stdio) globally on PATH.

Quick Start

# 1. Create a memory workspace
memory init .memories --name "My Project" --install

# 2. Add your first memory
memory add fact "API runs on port 8080" \
  --tags "topic:api,repo:backend" \
  --confidence high \
  --body "Gateway listens on 0.0.0.0:8080 by default" \
  --rebuild

# 3. Search
memory recall api port
memory recall --tag topic:api --format brief

# 4. Get full details
memory get FACT_api_runs_on_port_8080
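
The ID in step 4 looks like the type prefix joined to a slug of the title. A minimal sketch of that mapping, assuming the rule is "lowercase the title and collapse every non-alphanumeric run into one underscore". The helper `make_id` and the `PREFIXES` table are hypothetical names for illustration; the package's actual rule (length limits, collision handling) may differ.

```python
import re

# Prefixes as listed in the Memory Types table of this README.
PREFIXES = {
    "mem": "MEM", "dec": "DEC", "fact": "FACT", "pref": "PREF",
    "risk": "RISK", "goal": "GOAL", "q": "Q",
}


def make_id(mem_type: str, title: str) -> str:
    """Hypothetical helper: derive a memory ID from its type and title."""
    # Assumption: non-alphanumeric runs become a single underscore.
    slug = re.sub(r"[^a-z0-9]+", "_", title.lower()).strip("_")
    return f"{PREFIXES[mem_type]}_{slug}"


print(make_id("fact", "API runs on port 8080"))
```

With the Quick Start title, this reproduces the ID used by `memory get` above.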

How It Works

RST files (memory/*.rst)              ← Human + AI editable, Git-tracked
    │
    ▼ memory rebuild (sphinx-build)
needs.json (_build/html/needs.json)   ← Machine-readable index
    │
    ▼ memory recall / get / list
Formatted output                      ← Optimized for LLM context windows

Memories are stored as Sphinx-Needs directives in RST files. A memory rebuild command runs Sphinx to produce needs.json — the single query layer for all search operations. This means memories are simultaneously human-readable documentation and machine-queryable data.
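
To make the "single query layer" idea concrete, here is a minimal sketch of reading a needs.json document and filtering by tag. It assumes the usual Sphinx-Needs export layout (`current_version`, then `versions[...]["needs"]`) and that tags are stored as a list of strings; both may vary between Sphinx-Needs versions. The function names `load_needs` and `ids_with_tag` and the sample data are illustrative, not part of the package.

```python
import json

# Tiny stand-in for _build/html/needs.json (layout is an assumption).
SAMPLE = json.dumps({
    "current_version": "1.0",
    "versions": {
        "1.0": {
            "needs": {
                "FACT_api_runs_on_port_8080": {
                    "title": "API runs on port 8080",
                    "tags": ["topic:api", "repo:backend"],
                },
                "PREF_prefer_zustand": {
                    "title": "Prefer Zustand over Redux",
                    "tags": ["topic:state", "repo:web-ui"],
                },
            }
        }
    },
})


def load_needs(raw: str) -> dict:
    """Return the {id: need} mapping for the current version."""
    data = json.loads(raw)
    return data["versions"][data["current_version"]]["needs"]


def ids_with_tag(needs: dict, tag: str) -> list:
    """IDs of all needs carrying the exact tag, in sorted order."""
    return sorted(nid for nid, need in needs.items() if tag in need.get("tags", []))


needs = load_needs(SAMPLE)
print(ids_with_tag(needs, "topic:api"))
```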

CLI Reference

memory init <dir>                       # Create a new workspace
memory add <type> "<title>" [options]   # Record a memory
memory recall [query] [--tag ...] [--format brief|compact|context|json]
memory get <ID>                         # Full details of one memory
memory related <ID> [--hops N]          # Graph walk from a memory
memory list [--type TYPE] [--status S]  # Browse all memories
memory update <ID> [--confidence ...] [--add-tags ...] [--body ...] [--title ...]
memory deprecate <ID> [--by NEW_ID]     # Mark as deprecated
memory tags [--prefix PREFIX]           # Discover tags in use
memory stale                            # Find expired/overdue memories
memory review                           # Show memories needing review
memory rebuild                          # Rebuild needs.json

Key flags for recall:

  • --format brief — ultra-compact, minimal tokens
  • --body — include body text (off by default)
  • --sort newest|oldest|confidence|updated
  • --limit N — cap results
  • --expand 0 — disable graph expansion
  • --stale — only expired/review-overdue
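
The --sort and --limit flags can be pictured as an ordering step followed by a slice. This is a sketch under assumptions, not the package's code: it assumes `created_at` is an ISO date string (so string order equals date order), that `confidence` sorts high first, and it leaves out `updated` because the metadata table does not name the field it would use. `sort_and_limit` and `ENTRIES` are hypothetical.

```python
CONFIDENCE_RANK = {"low": 0, "medium": 1, "high": 2}

# Illustrative entries only.
ENTRIES = [
    {"id": "MEM_a", "created_at": "2026-01-10", "confidence": "low"},
    {"id": "FACT_b", "created_at": "2026-02-01", "confidence": "high"},
    {"id": "DEC_c", "created_at": "2025-12-05", "confidence": "medium"},
]


def sort_and_limit(needs, sort="newest", limit=None):
    """Order need dicts by the chosen key, then keep at most `limit` of them."""
    if sort == "newest":
        ordered = sorted(needs, key=lambda n: n["created_at"], reverse=True)
    elif sort == "oldest":
        ordered = sorted(needs, key=lambda n: n["created_at"])
    elif sort == "confidence":
        # Assumption: highest confidence first.
        ordered = sorted(needs, key=lambda n: CONFIDENCE_RANK[n["confidence"]], reverse=True)
    else:
        raise ValueError(f"unsupported sort: {sort}")
    return ordered if limit is None else ordered[:limit]


print([n["id"] for n in sort_and_limit(ENTRIES, sort="confidence", limit=2)])
```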

MCP Server

Expose memory tools to LLM clients via the Model Context Protocol.

Setup

Install with MCP extras:

pipx install -e 'ai_memory_protocol/[mcp]'

Claude Code

claude mcp add --transport stdio --env MEMORY_DIR=/path/to/.memories memory -- memory-mcp-stdio

Or add to .mcp.json in your project root (project scope):

{
  "mcpServers": {
    "memory": {
      "type": "stdio",
      "command": "memory-mcp-stdio",
      "env": {
        "MEMORY_DIR": "/path/to/.memories"
      }
    }
  }
}

VS Code (GitHub Copilot)

Add to .vscode/mcp.json:

{
  "servers": {
    "memory": {
      "command": "memory-mcp-stdio",
      "env": {
        "MEMORY_DIR": "${workspaceFolder}/.memories"
      }
    }
  }
}

Available MCP Tools

| Tool | Description |
| --- | --- |
| memory_recall | Search memories by text/tags with formatting options |
| memory_get | Get full details of a specific memory |
| memory_add | Record a new memory with tags and metadata |
| memory_update | Update content or metadata (title, body, status, confidence, tags, etc.) |
| memory_deprecate | Mark a memory as deprecated |
| memory_tags | List all tags with counts |
| memory_stale | Find expired/overdue memories |
| memory_rebuild | Rebuild needs.json index |

Memory Types

| Type | Prefix | Use Case |
| --- | --- | --- |
| mem | MEM_ | Observation, note, or finding |
| dec | DEC_ | Design or architectural decision |
| fact | FACT_ | Verified, stable knowledge |
| pref | PREF_ | Coding style or convention |
| risk | RISK_ | Uncertainty or assumption |
| goal | GOAL_ | Objective or target |
| q | Q_ | Open question needing resolution |

Link Types

| Link | Meaning |
| --- | --- |
| relates | General association |
| supports | Evidence or justification |
| depends | Hard dependency |
| supersedes | Replaces older memory |
| contradicts | Conflict or tension |
| example_of | Concrete instance of concept |
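
A graph walk such as `memory related <ID> --hops N` can be sketched as a breadth-first search over these link fields. The sketch assumes each need stores outgoing links as lists under the link name and incoming links under `<link>_back` (the `supersedes_back` field used in the needs_warnings example suggests this, but it is an assumption for the other link types). The function `related` and the sample graph are hypothetical.

```python
from collections import deque

LINK_TYPES = ("relates", "supports", "depends", "supersedes", "contradicts", "example_of")


def related(needs: dict, start: str, hops: int = 1) -> dict:
    """Return {id: distance} for every need within `hops` links of `start`."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if seen[current] == hops:
            continue  # do not expand beyond the hop budget
        need = needs.get(current, {})
        for link in LINK_TYPES:
            # Follow links in both directions.
            for neighbor in need.get(link, []) + need.get(f"{link}_back", []):
                if neighbor in needs and neighbor not in seen:
                    seen[neighbor] = seen[current] + 1
                    queue.append(neighbor)
    del seen[start]
    return seen


# Illustrative graph: A supports B, B depends on C.
GRAPH = {
    "A": {"supports": ["B"]},
    "B": {"depends": ["C"], "supports_back": ["A"]},
    "C": {"depends_back": ["B"]},
}
print(related(GRAPH, "A", hops=2))
```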

Metadata

| Field | Values | Purpose |
| --- | --- | --- |
| confidence | low / medium / high | Trust level |
| scope | global, repo:X, product:X | Applicability |
| tags | prefix:value format | Categorization |
| source | URL, commit, description | Provenance |
| review_after | ISO date | Staleness trigger |
| expires_at | ISO date | Auto-expire date |
| created_at | ISO date | Capture timestamp |
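
The two date fields drive staleness. A minimal sketch of how such a check could work, assuming both fields hold `YYYY-MM-DD` strings and that a memory counts as stale only once the date has strictly passed; the exact boundary rule in the package is not documented here. `staleness` is a hypothetical helper.

```python
from datetime import date


def staleness(need: dict, today: date) -> str:
    """Classify a need as 'expired', 'review_due', or 'fresh'."""
    expires = need.get("expires_at")
    review = need.get("review_after")
    # Assumption: a date is "passed" only when it is strictly before today.
    if expires and date.fromisoformat(expires) < today:
        return "expired"
    if review and date.fromisoformat(review) < today:
        return "review_due"
    return "fresh"


today = date(2026, 2, 23)
print(staleness({"expires_at": "2026-01-01"}, today))
print(staleness({"review_after": "2026-02-01"}, today))
print(staleness({"review_after": "2026-12-31"}, today))
```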

Tagging Conventions

Tags use prefix:value format for consistent discovery:

  • topic: — Subject area (topic:gateway, topic:auth)
  • repo: — Repository (repo:backend, repo:web-ui)
  • domain: — Knowledge domain (domain:robotics, domain:web)
  • tier: — Importance level (tier:core, tier:detail)
  • intent: — Purpose (intent:decision, intent:coding-style)
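
The prefix:value convention is easy to validate and group mechanically. A short sketch, with `parse_tag` and `group_tags` as hypothetical helper names (the package may expose something different):

```python
from collections import defaultdict


def parse_tag(tag: str) -> tuple:
    """Split 'prefix:value' and reject tags that lack either part."""
    prefix, sep, value = tag.partition(":")
    if not sep or not prefix or not value:
        raise ValueError(f"tag must be prefix:value, got {tag!r}")
    return prefix, value


def group_tags(tags: list) -> dict:
    """Group tag values by prefix, preserving input order."""
    grouped = defaultdict(list)
    for tag in tags:
        prefix, value = parse_tag(tag)
        grouped[prefix].append(value)
    return dict(grouped)


print(group_tags(["topic:gateway", "topic:auth", "repo:backend", "tier:core"]))
```

Grouping by prefix is roughly what `memory tags --prefix PREFIX` lets an agent explore before filtering.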

AI Agent Integration

1. READ — Peek then Drill (two-phase recall)

Always use a two-phase approach. Never go straight to body text on broad queries.

Phase A — Peek (scan titles, zero body text):

memory recall --tag topic:gateway --format brief --expand 0

Returns [ID] Title (confidence) one-liners. Minimal tokens. Do this FIRST.

Phase B — Drill (read full body of specific memories):

memory get DEC_handler_context_pattern

Only after peeking — pick the 2-3 most relevant IDs and get them individually.
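
The peek/drill split can be sketched over an in-memory index. This is an illustration under assumptions: the brief line follows the `[ID] Title (confidence)` shape described above, the body is assumed to live in a `description` field (the name used in the needs_warnings example), and `peek`, `drill`, and the sample titles are hypothetical.

```python
# Illustrative index; titles and bodies are made up for the example.
NEEDS = {
    "DEC_handler_context_pattern": {
        "title": "Handler context pattern",
        "confidence": "high",
        "tags": ["topic:gateway"],
        "description": "Handlers receive a context object instead of globals.",
    },
    "FACT_api_runs_on_port_8080": {
        "title": "API runs on port 8080",
        "confidence": "high",
        "tags": ["topic:api", "repo:backend"],
        "description": "Gateway listens on 0.0.0.0:8080 by default",
    },
}


def peek(needs: dict, tag: str, limit: int = 10) -> list:
    """Phase A: one-line summaries only, no body text."""
    hits = [(nid, n) for nid, n in sorted(needs.items()) if tag in n.get("tags", [])]
    return [f"[{nid}] {n['title']} ({n['confidence']})" for nid, n in hits[:limit]]


def drill(needs: dict, need_id: str) -> dict:
    """Phase B: the full record, body included, for one chosen ID."""
    return needs[need_id]


print(peek(NEEDS, "topic:gateway"))
print(drill(NEEDS, "DEC_handler_context_pattern")["description"])
```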

When to recall — recall is NOT just a session-start ritual. Recall at each of these moments:

| Trigger | What to recall |
| --- | --- |
| Session start | recall --format brief --limit 20 --sort newest |
| New task or topic | recall --tag topic:<X> --format brief |
| Entering unfamiliar code | recall --tag repo:<X> --type fact --format brief |
| Before a design decision | recall --tag topic:<X> --type dec |
| Encountering an error or failure | recall <error message keywords> (FIRST reaction before debugging; check if this problem was already solved) |
| Stuck after initial attempts | recall --tag topic:<X> --type mem,fact (broaden search to related areas and past solutions) |
| Before implementing a pattern | recall --tag intent:coding-style --type pref |

2. WRITE — Record at specific trigger points

Recording memories is NOT optional. Write at these concrete moments:

| Trigger | Type | Example |
| --- | --- | --- |
| Chose approach A over B | dec | "Use tl::expected over exceptions" |
| Fixed a non-obvious bug | mem | "EntityCache race condition fix" |
| Discovered undocumented API | fact | "Routes match in registration order" |
| User stated a preference | pref | "Prefer Zustand over Redux" |
| Identified a risk | risk | "JWT secret hardcoded in tests" |
| Question remains unanswered | q | "Should synthetic components expose operations?" |

End-of-task writes: summarize architecture learned (fact), record conventions (pref), note anything a future agent needs (mem), capture unfinished goals (goal).

Write quality rules:

  • --tags is mandatory — without tags, the memory is unfindable
  • --body must be self-contained with file paths and concrete details
  • Use --rebuild flag to make new memories immediately searchable
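
An agent that shells out to the CLI can enforce these rules before it ever runs a command. A minimal sketch that only builds the argument list; it uses the flags shown in the Quick Start (`--tags`, `--body`, `--confidence`, `--rebuild`) and does not execute anything. `build_add_command` is a hypothetical helper, not part of the package.

```python
import shlex


def build_add_command(mem_type, title, tags, body, confidence=None, rebuild=True):
    """Build argv for `memory add`, refusing untagged or empty-bodied entries."""
    if not tags:
        raise ValueError("--tags is mandatory: untagged memories are unfindable")
    if not body.strip():
        raise ValueError("--body must be self-contained, not empty")
    argv = ["memory", "add", mem_type, title, "--tags", ",".join(tags), "--body", body]
    if confidence:
        argv += ["--confidence", confidence]
    if rebuild:
        argv.append("--rebuild")  # make the new memory searchable right away
    return argv


cmd = build_add_command(
    "fact",
    "API runs on port 8080",
    ["topic:api", "repo:backend"],
    "Gateway listens on 0.0.0.0:8080 by default",
    confidence="high",
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run` as is, which avoids shell quoting problems in titles and bodies.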

3. SUPERSEDE, don't edit

When knowledge changes, add a new entry with --supersedes OLD_ID and deprecate the old one.
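
The effect of superseding can be modeled on a plain dictionary. This sketch assumes the new entry carries a `supersedes` list, the old entry gains the matching `supersedes_back` entry, and the old entry's `status` becomes `deprecated`; the field names follow the needs_warnings example, while `supersede` and the sample IDs are hypothetical.

```python
def supersede(needs: dict, old_id: str, new_id: str, new_need: dict) -> dict:
    """Return a copy of `needs` where `new_id` replaces `old_id`."""
    if old_id not in needs:
        raise KeyError(old_id)
    updated = {nid: dict(need) for nid, need in needs.items()}  # leave the input untouched
    updated[new_id] = {**new_need, "supersedes": [old_id]}
    old = updated[old_id]
    old["status"] = "deprecated"
    old["supersedes_back"] = old.get("supersedes_back", []) + [new_id]
    return updated


before = {"FACT_api_runs_on_port_8080": {"title": "API runs on port 8080"}}
after = supersede(
    before,
    "FACT_api_runs_on_port_8080",
    "FACT_api_runs_on_port_9090",  # illustrative replacement
    {"title": "API runs on port 9090"},
)
print(after["FACT_api_runs_on_port_8080"]["status"])
```

The old entry stays in the graph with its history, which is the point of superseding rather than editing.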

4. CHECK STALENESS periodically

Run memory stale at the start of long sessions to keep the graph accurate.

Context Window Optimization

  • recall omits body by default — this is intentional, not a limitation
  • Peek with --format brief, then drill with get <ID>; this is the core pattern
  • Use --limit 10 and --expand 0 when exploring broad topics
  • Use --tag filters to narrow results instead of free-text
  • Use memory tags to discover available tag prefixes before filtering

Project Structure

ai_memory_protocol/
├── pyproject.toml           # Package definition, CLI + MCP entry points
├── README.md
├── LICENSE                  # Apache 2.0
├── CONTRIBUTING.md
├── .pre-commit-config.yaml
├── .github/workflows/ci.yml
└── src/
    └── ai_memory_protocol/
        ├── __init__.py
        ├── cli.py           # CLI (argparse, 12 subcommands)
        ├── mcp_server.py    # MCP server (8 tools, stdio transport)
        ├── config.py        # Type definitions, constants
        ├── engine.py        # Workspace detection, search, graph walk
        ├── formatter.py     # Output formatting (brief/compact/context/json)
        ├── rst.py           # RST generation, editing, file splitting
        └── scaffold.py      # Workspace scaffolding (init command)

Memory data lives in a separate workspace (e.g., .memories/), created with memory init.

Build-as-Guardian

The Sphinx build acts as a quality gate for the memory graph. The needs_warnings setting in conf.py defines constraints that are checked during memory rebuild:

needs_warnings = {
    "missing_topic_tag": "type in ['mem','dec','fact',...] and not any(t.startswith('topic:') for t in tags)",
    "empty_body": "description == '' or description == 'TODO: Add description.'",
    "deprecated_without_supersede": "status == 'deprecated' and len(supersedes_back) == 0",
}

With sphinx-build -W (warnings as errors), the build fails if any memory violates these constraints. This means:

  • Every memory must have at least one topic: tag
  • No empty placeholders survive to the index
  • Deprecated memories must be superseded by a replacement

Agents learn to self-correct: if rebuild fails, they read the warning, fix the offending memory, and retry.
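
For intuition, the three constraints can be restated as plain Python over a need dictionary. This is a sketch, not how Sphinx-Needs evaluates filter strings: it applies the topic-tag rule to every type (the type list in the config above is elided) and assumes missing fields behave like empty ones. `check_need` is a hypothetical helper.

```python
def check_need(need: dict) -> list:
    """Return the names of the quality gates this need would trip."""
    warnings = []
    if not any(tag.startswith("topic:") for tag in need.get("tags", [])):
        warnings.append("missing_topic_tag")
    if need.get("description", "") in ("", "TODO: Add description."):
        warnings.append("empty_body")
    if need.get("status") == "deprecated" and len(need.get("supersedes_back", [])) == 0:
        warnings.append("deprecated_without_supersede")
    return warnings


good = {"tags": ["topic:api"], "description": "Gateway listens on 0.0.0.0:8080 by default"}
bad = {"tags": ["repo:backend"], "description": "", "status": "deprecated"}
print(check_need(good))
print(check_need(bad))
```

An agent could run a check like this before `memory rebuild` to catch violations without waiting for the build to fail.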

Human Role

Humans are observers and editors, not gatekeepers:

  • Dashboards: memory/dashboards.rst contains needtable, needlist, and needflow directives rendering the live state of the memory graph as HTML
  • RST editing — memories are plain RST, editable in any text editor or IDE with full diff/blame in Git
  • Override — humans can update status, confidence, or tags on any memory via CLI or direct RST edit
  • Review: memory review surfaces memories whose review_after date has passed, prompting human validation

The protocol is designed so that agents maintain knowledge autonomously while humans retain full visibility and override capability.

Contributing

See CONTRIBUTING.md for guidelines on how to contribute.

License

Apache 2.0