Context Lifecycle Contract

June 27, 2026

Publicly, AgentOps sells bookkeeping, validation, primitives, and flows. This page explains the internal proof gaps those promises have to close: validation risk, bookkeeping decay, and loop closure.

Internal Proof Contract

Most coding-agent tooling is strong at prompt construction and agent routing. The failure modes come after that:

  1. Validation is missing. Internally, this gap is tracked as judgment validation: the agent chooses an approach without loading the risk context that would challenge it before or after implementation.
  2. Bookkeeping is missing. Internally, this gap is tracked as durable learning: solved problems come back as if they were never solved.
  3. Flows are present, but they do not compound. Internally, this gap is tracked as loop closure: completed work does not reliably produce better next work, better rules, or better future context.

AgentOps treats those three gaps as one lifecycle contract. Skills and CLI primitives are the operator surface; the proof obligation is that each cycle actually closes these gaps.

Gap 1: Validation

Problem. Compile/test checks are not enough. An agent can ship the happy path while missing architecture fit, edge cases, or risk context.

Observable symptoms:

  • A plan looks coherent but silently picks the wrong middleware stack, abstraction, or integration point
  • The implementation passes basic checks but fails on error paths, compatibility edges, or workflow constraints
  • Validation happens only after the work is already expensive to unwind

AgentOps mechanisms:

| Mechanism | Source | Role |
| --- | --- | --- |
| /pre-mortem | skills/pre-mortem/SKILL.md | Loads plan-review validation before code exists |
| /validate | skills/validate/SKILL.md | Runs post-implementation validation instead of stopping at build/test (absorbs the retired /vibe) |
| /council | skills/council/SKILL.md | Supplies multi-judge review for plans and code |
| Pre-mortem gate | CI gate (.github/workflows/validate.yml) + the /pre-mortem skill | Prevents large implementation work from skipping plan validation |
| Task-validation constraint check | /validate skill + ao constraint reading .agents/constraints/index.json | Executes active compiled constraints for mechanically detectable findings |
| Product-aware review context | PRODUCT.md | Injects product and DX perspectives into validation flows |

Supporting failure modes addressed inside this gap:

  • context contamination inside long sessions
  • architecture drift from choosing the wrong existing pattern
  • review culture that depends on a human noticing problems after the fact
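
The task-validation constraint check listed in the mechanisms table can be pictured with a short sketch. This is an illustration under stated assumptions, not the AgentOps implementation: the index layout (a `constraints` list whose entries carry `id`, `status`, `pattern`, and `message`) and the function names `load_index` and `check_constraints` are hypothetical. The real schema of .agents/constraints/index.json may differ.

```python
import json
import re
from pathlib import Path


def load_index(path=".agents/constraints/index.json"):
    """Load the constraint index from disk (the schema used here is hypothetical)."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def check_constraints(index, changed_files):
    """Run every active constraint against the changed files.

    index: dict with a hypothetical "constraints" list of
           {"id", "status", "pattern", "message"} entries.
    changed_files: mapping of file path -> file text.
    Returns one finding per matching line.
    """
    findings = []
    for constraint in index.get("constraints", []):
        if constraint.get("status") != "active":
            continue  # only active compiled constraints are enforced
        pattern = re.compile(constraint["pattern"])
        for path, text in changed_files.items():
            for lineno, line in enumerate(text.splitlines(), start=1):
                if pattern.search(line):
                    findings.append(
                        {
                            "constraint": constraint["id"],
                            "file": path,
                            "line": lineno,
                            "message": constraint["message"],
                        }
                    )
    return findings
```

In this sketch, a non-empty result would block task completion. That is what lets a mechanically detectable finding surface before the work becomes expensive to unwind.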

Gap 2: Bookkeeping

Problem. Notes are not learning. If solved work is not extracted, scored, retrieved, and re-used, the same repo keeps paying for the same lesson.

Observable symptoms:

  • An auth bug fixed on Monday comes back on Wednesday
  • The agent re-runs the same dead-end investigation in a new session
  • The repo accumulates artifacts, but not reusable bookkeeping

AgentOps mechanisms:

| Mechanism | Source | Role |
| --- | --- | --- |
| .agents/ ledger | Knowledge Ledger | Stores plans, learnings, patterns, council outputs, and next-work artifacts on disk |
| Finding registry | docs/contracts/finding-registry.md | Stores reusable structured findings that planning and validation can load before rediscovering the same failure |
| ao lookup / injection | Knowledge Ledger and ao CLI | Retrieves repo-specific context at session start and task boundaries |
| /retro and /post-mortem extraction | skills/post-mortem/SKILL.md | Turns completed work into reusable learnings and patterns |
| Freshness / maturity controls | ao maturity, ao dedup, ao contradict | Keeps retrieval focused on useful, current knowledge |
| Compile cycle | GOALS.md directive 5 | Mines missed signal, defrags stale knowledge, and flags oscillation |

Supporting failure modes addressed inside this gap:

  • session amnesia between independent runs
  • stale or contradictory learnings swamping retrieval
  • bookkeeping systems that store notes without curation or reinforcement
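
To make the curation idea concrete, here is a minimal sketch of duplicate control over extracted learnings. It does not describe how `ao dedup` actually works. The record keys `text` and `utility` and both function names are assumptions chosen for illustration.

```python
import re


def normalize(text):
    """Lowercase, drop punctuation, and collapse whitespace for comparison."""
    without_punctuation = re.sub(r"[^\w\s]", "", text.lower())
    return re.sub(r"\s+", " ", without_punctuation).strip()


def dedup_learnings(learnings):
    """Keep one learning per normalized text, preferring the highest utility.

    Each learning is a dict with hypothetical keys "text" and "utility".
    """
    best = {}
    for item in learnings:
        key = normalize(item["text"])
        if key not in best or item["utility"] > best[key]["utility"]:
            best[key] = item
    return list(best.values())
```

Exact-match deduplication after normalization is the simplest possible policy. A real curation pass would likely also need near-duplicate and contradiction detection, which this sketch leaves out.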

Gap 3: Closure

Problem. A session is not complete when code exists. It is complete when the work has been judged, the learning has been harvested, and the system knows what to do next.

Observable symptoms:

  • Work ends with a code diff but no extracted lesson
  • The next session starts without knowing what the last one changed
  • Teams still perform the refinement loop by hand: inspect, restate, retry

AgentOps mechanisms:

| Mechanism | Source | Role |
| --- | --- | --- |
| /post-mortem | skills/post-mortem/SKILL.md | Validates shipped work, extracts learnings, and harvests next work |
| Finding registry + compiler path | docs/contracts/finding-registry.md, docs/contracts/finding-compiler.md, ao findings / ao constraint | Promotes reusable findings into advisory artifacts and active constraint index entries |
| Task-validation constraint execution | /validate skill + ao constraint reading .agents/constraints/index.json | Turns mechanically detectable findings into enforced validation checks before task completion |
| Flywheel close | /post-mortem skill + docs/how-it-works.md | Closes the feedback loop at session end |
| GOALS + /evolve | GOALS.md and /evolve flows | Turns findings into measurable next work instead of leaving them as loose notes |
| Ratchet + run registry | ao ratchet, .agents/rpi/next-work.jsonl | Records what passed, what remains, and what should be worked next |
| Phase chaining | README.md full pipeline | Makes research -> plan -> pre-mortem -> crank -> post-mortem the normal operating shape |

Supporting failure modes addressed inside this gap:

  • knowledge decay after extraction because nothing reuses it
  • repeated human triage to decide "what did this teach us?"
  • completed work that never becomes better context or better constraints
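
One way to picture the run registry is as an append-only JSONL file that one session writes and the next session reads. The sketch below assumes a record shape (`title`, `source`, `consumed`) that is purely illustrative. Only the path .agents/rpi/next-work.jsonl comes from this page, and the helper names are hypothetical.

```python
import json


def to_jsonl_line(item):
    """Serialize one next-work item as a single JSONL line."""
    return json.dumps(item, sort_keys=True)


def append_next_work(path, item):
    """Append one harvested item to the registry file (one JSON object per line)."""
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(to_jsonl_line(item) + "\n")


def pending_items(lines):
    """Return records not yet marked consumed, skipping blank lines.

    The "consumed" flag is a hypothetical field used for this illustration.
    """
    items = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        if not record.get("consumed", False):
            items.append(record)
    return items
```

A session-end step would call `append_next_work(".agents/rpi/next-work.jsonl", {...})`. The following session would open the same file and pass its lines to `pending_items` to learn what remains.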

Evidence Map

| Gap | Mechanism | Durable Artifact / Contract | Proof Surface |
| --- | --- | --- | --- |
| Validation | /pre-mortem | skills/pre-mortem/SKILL.md | Plan review before implementation |
| Validation | /validate | skills/validate/SKILL.md | Code review before commit/merge (absorbs the retired /vibe) |
| Validation | pre-mortem gate | .github/workflows/validate.yml, /pre-mortem skill | CI gate enforcement |
| Bookkeeping | extraction + retrieval | .agents/, ao lookup, ao forge, finding registry, finding artifacts | Repo-specific context and reusable structured findings loaded into later sessions |
| Bookkeeping | curation | ao maturity, ao dedup, ao contradict | Freshness, contradiction, and duplication control |
| Bookkeeping | Compile | GOALS.md, Compile checks | Daily maintenance of learning quality |
| Closure | /post-mortem + finding compiler | skills/post-mortem/SKILL.md, docs/contracts/finding-registry.md, docs/contracts/finding-compiler.md | Learnings + next work harvested from completed work; reusable findings re-enter planning/review and compile into preventive artifacts |
| Closure | task-validation compiled enforcement | /validate skill, ao constraint, .agents/constraints/index.json | Task-validation executes active compiled constraints before completion is accepted |
| Closure | flywheel close | /post-mortem skill | Session-end closure of the feedback loop |
| Closure | goals / evolve | GOALS.md, flywheel-proof gate | Proof that the system compounds across sessions |

What AgentOps Does Not Claim

  • It does not claim that prompt engineering or routing are unimportant.
  • It does not claim that every loop-closing behavior must be fully autonomous.
  • It does not claim that raw recall alone is enough; the contract depends on validation, curation, and re-use.
  • It does not claim that new runtime machinery should be invented when an existing command, skill, or CI gate already covers the gap.

The Knowledge Ledger — Session-to-Session Flow

Session N ends
    → ao forge: mine transcript for learnings, decisions, patterns
    → ao notebook update: merge insights into MEMORY.md
    → ao memory sync: sync to repo-root MEMORY.md (cross-runtime)
    → ao maturity --expire: mark stale artifacts (freshness decay ~17%/week)
    → ao maturity --evict: archive what's decayed past threshold
    → ao feedback-loop: citation-to-utility feedback (MemRL)

Session N+1 starts
    → ao lookup (on demand): score artifacts by recency + utility
      ├── Local .agents/ learnings & patterns (1.0x weight)
      ├── Global ~/.agents/ cross-repo knowledge (0.8x weight)
      ├── Work-scoped boost: active issue gets 1.5x (--bead)
      ├── Predecessor handoff: what the last session was doing (--predecessor)
      └── Trim to ~1000 tokens — lightweight, not encyclopedic
    → Agent starts where the last one left off

Three tiers, descending priority: local .agents/ → global ~/.agents/ → legacy ~/.claude/patterns/. Each session starts with a small, curated packet — not a data dump. If the task needs deeper context, the agent searches .agents/ on demand.
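
The scoring and trimming steps in the flow above can be sketched as follows. The tier weights (1.0 and 0.8), the 1.5x work-scoped boost, the roughly 17% weekly decay, and the ~1000-token budget come from this page. Everything else is an assumption: that decay compounds weekly, that recency and utility are multiplied rather than combined some other way, that packing is greedy, and all names (`Artifact`, `freshness`, `score`, `build_packet`). It is not the `ao lookup` implementation.

```python
from dataclasses import dataclass
from typing import Optional

WEEKLY_DECAY = 0.17                            # freshness decay ~17%/week
TIER_WEIGHT = {"local": 1.0, "global": 0.8}    # .agents/ vs ~/.agents/
BEAD_BOOST = 1.5                               # active issue boost (--bead)
TOKEN_BUDGET = 1000                            # packet is trimmed to ~1000 tokens


@dataclass
class Artifact:
    name: str
    tier: str                  # "local" or "global"
    age_weeks: float
    utility: float             # hypothetical 0..1 value from citation feedback
    tokens: int
    bead: Optional[str] = None


def freshness(age_weeks):
    """Assumption: the ~17% decay compounds once per week."""
    return (1.0 - WEEKLY_DECAY) ** age_weeks


def score(artifact, active_bead=None):
    """Assumption: recency and utility combine multiplicatively."""
    value = freshness(artifact.age_weeks) * artifact.utility
    value *= TIER_WEIGHT[artifact.tier]
    if active_bead is not None and artifact.bead == active_bead:
        value *= BEAD_BOOST
    return value


def build_packet(artifacts, active_bead=None, budget=TOKEN_BUDGET):
    """Greedily pack the highest-scoring artifacts that fit the token budget."""
    ranked = sorted(artifacts, key=lambda a: score(a, active_bead), reverse=True)
    packet, used = [], 0
    for artifact in ranked:
        if used + artifact.tokens > budget:
            continue  # skip anything that would overflow the packet
        packet.append(artifact)
        used += artifact.tokens
    return packet
```

Under the compounding assumption, an artifact untouched for four weeks keeps a little under half of its original freshness, so stale material drops out of the packet well before it is evicted.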

See Also