Memory System
June 6, 2026
The memory layer in Agenvoy has three tiers for conversation memory, a cross-session error memory tier, plus an optional tier — KuraDB — for external document RAG.
| Tier | Backed by | Scope |
|---|---|---|
| 1. Context window (16 messages + summary) | history.json + summary.json | session |
| 2. Semantic search (recent) | ToriiDB DBSessionHist (vector) | session |
| 3. Full-text archive (all history) | SQLite FTS5 via go-sqlite | session |
| Error memory | ToriiDB error_memory (90d TTL) | cross-session |
| External document RAG | KuraDB (in-process child) | cross-session, user-curated databases |
Three-tier conversation memory
1. Context window (max_history_messages, default 16)
Each session keeps the most recent N messages in full and feeds them into the LLM context window. Anything older remains in history.json but is not sent to the LLM.
A rolling summary (summary.json) condenses older conversations and is injected into the system prompt at the top of each turn so older context survives beyond the N-message window.
Incremental cursor: summary.meta.json (per-session) holds last_message_time (format YYYY-MM-DD HH:MM:SS, extracted from the opening 當前時間: ... of the message content). On each summary.Generate invocation:
- filterAfterTime(histories, cursor) keeps only t > cursor messages
- Each chunk runs one generatePass LLM call (the system prompt already includes {{.Summary}} = old summary, so the merge happens during generation; there is no separate mergePass, which avoids 2× cost)
- On success, the cursor advances to that chunk's max timestamp, and SaveSummary triggers the mtime gate
- generatePass failure → return (don't bill subsequent chunks; the next cron tick retries)
2. Semantic search — ToriiDB (recent conversations)
The search_chat_history tool with mode=semantic runs vector similarity search via ToriiDB db.VSearch. Each hit triggers a context window expansion: 2 entries before + 1 entry after.
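The expansion around each hit amounts to a clamped index range. The helper below is a hypothetical sketch of that arithmetic, not the ToriiDB or Agenvoy implementation.

```go
package main

import "fmt"

// expandHit returns the half-open range [start, end) covering the hit plus
// `before` entries ahead of it and `after` entries behind it, clamped to [0, n).
func expandHit(n, hit, before, after int) (start, end int) {
	start = hit - before
	if start < 0 {
		start = 0
	}
	end = hit + after + 1
	if end > n {
		end = n
	}
	return start, end
}

func main() {
	entries := []string{"m0", "m1", "m2", "m3", "m4", "m5"}
	start, end := expandHit(len(entries), 3, 2, 1) // 2 before + 1 after
	fmt.Println(entries[start:end])                // prints [m1 m2 m3 m4]
}
```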
ToriiDB entries are cleaned during history.json compaction — entries older than the compact cutoff are removed, keeping ToriiDB focused on recent conversations. Older data lives in SQLite (tier 3).
3. Full-text archive — SQLite FTS5 (all history)
Every message written to history.json is dual-written to SQLite (~/.config/agenvoy/.store/history.db) via pardnchiu/go-sqlite. SQLite always holds the complete conversation history, even after history.json is compacted.
The search_chat_history tool with mode=keyword runs FTS5 full-text search on the SQLite archive + ToriiDB substring match on recent entries, combining results.
Compaction: when history.json exceeds max_history_bytes (default 5 MiB), the oldest messages are trimmed to 80% on a complete user+assistant pair boundary. The cutoff timestamp is recorded in SQLite session_meta.start_at so that keyword search excludes entries already present in history.json (avoiding duplicates). ToriiDB entries older than the cutoff are also removed.
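A minimal sketch of pair-boundary trimming follows. It assumes that "80%" means 80% of max_history_bytes, that size is measured as the sum of content bytes, and that a safe cut point is any index where the next kept message has the user role; the Message type and the compact function are hypothetical.

```go
package main

import "fmt"

// Message is a hypothetical history entry.
type Message struct {
	Role    string // "user" or "assistant"
	Content string
}

// compact drops the oldest messages once the total exceeds maxBytes, until the
// remainder fits in 80% of maxBytes and starts with a user message, so no
// assistant reply is kept without its user turn. It returns the kept messages
// and the number of messages removed.
func compact(msgs []Message, maxBytes int) ([]Message, int) {
	total := 0
	for _, m := range msgs {
		total += len(m.Content)
	}
	if total <= maxBytes {
		return msgs, 0
	}
	target := maxBytes * 8 / 10
	cut := 0
	for cut < len(msgs) {
		if total <= target && msgs[cut].Role == "user" {
			break
		}
		total -= len(msgs[cut].Content)
		cut++
	}
	return msgs[cut:], cut
}

func main() {
	var msgs []Message
	for i := 0; i < 3; i++ {
		msgs = append(msgs,
			Message{Role: "user", Content: "0123456789"},
			Message{Role: "assistant", Content: "0123456789"},
		)
	}
	kept, cut := compact(msgs, 50) // 60 bytes stored, limit 50, target 40
	fmt.Println(len(kept), cut)
}
```

The timestamp of the first kept message would be the natural value to record as the cutoff.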
Backfill: on first encounter (SQLite has no data for a session but history.json has content), the entire existing history is backfilled into SQLite.
Timestamps: stored as UTC unix nanoseconds. 當前時間: in message content is parsed via time.ParseInLocation (local timezone) and converted to UTC for storage. Search queries use time.Now().UnixNano() (already UTC).
Search routing
| mode parameter | Source | Use case |
|---|---|---|
| semantic (default) | ToriiDB VSearch | "What did we discuss about X?" (meaning-based) |
| keyword | SQLite FTS5 (archive) + ToriiDB substring (recent) | "Find messages containing 'sandbox'" (exact text) |
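
For keyword mode, combining the two sources might look like the sketch below. The Hit type and mergeKeywordHits are hypothetical, and treating start_at as an exclusive upper bound on archive hits is an assumption; the document only says that the cutoff is used to avoid duplicates.

```go
package main

import (
	"fmt"
	"sort"
)

// Hit is a hypothetical search result carrying a UTC unix-nanosecond timestamp.
type Hit struct {
	TimeNano int64
	Text     string
}

// mergeKeywordHits keeps archive hits older than startAt (the compact cutoff),
// appends the recent hits, and orders the result by time.
func mergeKeywordHits(archive, recent []Hit, startAt int64) []Hit {
	merged := make([]Hit, 0, len(archive)+len(recent))
	for _, h := range archive {
		if h.TimeNano < startAt {
			merged = append(merged, h)
		}
	}
	merged = append(merged, recent...)
	sort.Slice(merged, func(i, j int) bool { return merged[i].TimeNano < merged[j].TimeNano })
	return merged
}

func main() {
	archive := []Hit{{1, "old sandbox note"}, {2, "older sandbox fix"}, {5, "sandbox today"}}
	recent := []Hit{{5, "sandbox today"}, {4, "sandbox yesterday"}}
	for _, h := range mergeKeywordHits(archive, recent, 4) {
		fmt.Println(h.TimeNano, h.Text)
	}
}
```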
Cross-session error memory
Tool failures, resolution paths, and abandoned strategies persist across sessions in error_memory with 90-day TTL. On hit (either via keyword Contains or db.VSearch), the entry's TTL is refreshed via db.Expire.
When the same tool name fails in a future session, toolCall.go automatically queries error_memory and injects relevant entries as hints into the next assistant turn:
| Outcome on record | Hint behavior |
|---|---|
| resolved | Agent must apply the recorded resolution |
| failed / abandoned | Agent must avoid the recorded strategy |
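
The mapping from recorded outcome to hint can be sketched as a small switch. The Record fields, the hint wording, and hintFor are hypothetical; only the three outcome names and their intended effect come from the table above.

```go
package main

import "fmt"

// Record is a hypothetical error_memory entry.
type Record struct {
	Tool    string
	Outcome string // "resolved", "failed", or "abandoned"
	Detail  string // recorded resolution or strategy
}

// hintFor turns a record into a hint for the next assistant turn.
// It returns false when the outcome is not recognized.
func hintFor(r Record) (string, bool) {
	switch r.Outcome {
	case "resolved":
		return fmt.Sprintf("%s failed before and was resolved; apply: %s", r.Tool, r.Detail), true
	case "failed", "abandoned":
		return fmt.Sprintf("%s failed before; avoid: %s", r.Tool, r.Detail), true
	default:
		return "", false
	}
}

func main() {
	records := []Record{
		{Tool: "fetch_page", Outcome: "resolved", Detail: "retry with a longer timeout"},
		{Tool: "fetch_page", Outcome: "abandoned", Detail: "scraping the cached copy"},
	}
	for _, r := range records {
		if hint, ok := hintFor(r); ok {
			fmt.Println(hint)
		}
	}
}
```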
Storage layout
| Store | Content | Lifecycle |
|---|---|---|
| history.json | Recent messages (hot, LLM reads every turn) | Auto-compacted at 5 MiB |
| ToriiDB DBSessionHist | Recent messages with embeddings | Cleaned on compact (entries < cutoff removed) |
| SQLite messages | All messages ever written (dual-write) | Cleared on reset / remove-session |
| SQLite session_meta | start_at (compact cutoff timestamp) | Cleared on reset / remove-session |
| summary.json | Rolling summary blob | Survives reset |
| ToriiDB error_memory | Tool error records with resolution metadata | 90d TTL (refresh on hit) |
Reset / remove behavior
| Operation | history.json | ToriiDB DBSessionHist | SQLite (messages + meta) | summary.json |
|---|---|---|---|---|
| Compact (auto) | Trimmed to 80% | Entries < cutoff removed | Untouched (already has all data) | Untouched |
| Reset (/reset) | Deleted | Cleared | Cleared | Preserved |
| Remove session | Directory deleted | Cleared | Cleared | Directory deleted |
External document RAG (KuraDB)
The three tiers above all serve conversation memory — past chats, summaries, error records. For querying user-curated document collections (notes, inbox, code repos, …), Agenvoy delegates to KuraDB, an in-process child process spawned by the daemon when kuradb_enabled=true.
KuraDB exposes two tools to the agent (list_rag / search_rag). Both are excluded dynamically on each turn when the endpoint file is missing. When they are loaded, the system prompt forces them to fire first for any information query, and external web tools become a gap-filling secondary source.
This split is deliberate: ToriiDB + SQLite are integrated runtime memory (you can't disable them); KuraDB is an opt-in indexed knowledge base (enable via /kuradb TUI command).
Migration note
Sessions and error memory used to live under per-session JSON files. Since ToriiDB v0.5.0 they are inside the embedded store. Do not reintroduce JSON paths.
Note
This document was auto-generated by Claude after reading the full source code.