R-Mem

April 13, 2026

Long-term memory for AI agents — in Rust

A lightweight study of mem0's memory architecture.
Single binary. SQLite-backed. No Python.

3.6 MB binary · 2,826 lines of Rust · < 10 MB RAM · SQLite only · MCP ready · LongMemEval 48.2%

Quick Start · Integration · How It Works · Usage · MCP · Performance · Architecture · Roadmap

Note

This project reimplements mem0's elegant memory architecture in Rust as a learning exercise. Full credit to the mem0 team for the original design. This is not a replacement — it's a study of their approach using a different language. Discussions, ideas, and contributions are welcome!


Why R-Mem?

mem0 is a well-designed memory system with a rich plugin ecosystem. R-Mem asks a narrower question: what if we rewrite just the core memory logic in Rust, backed entirely by SQLite?

The result is the same three-tier architecture — vector memory, graph memory, history — plus a tiered archive system, in 2,826 lines of Rust. No external services. One binary. The trade-off is clear: far fewer integrations, but near-zero operational overhead.

R-Mem was born out of RustClaw — our minimalist Rust AI agent framework. RustClaw needed a memory layer that matched its philosophy: single binary, zero external services. So we studied mem0's architecture and rebuilt it in Rust.

| | R-Mem | mem0 |
| --- | --- | --- |
| 📦 Binary | 3.6 MB static | Python + pip (rich ecosystem) |
| 💾 Idle RSS | < 10 MB | 200 MB+ (more features loaded) |
| 📝 Code | 2,826 lines | ~91,500 lines (26+ store drivers) |
| 🔍 Vector | SQLite + FTS5 | Qdrant, Chroma, Pinecone, … |
| 🕸️ Graph | SQLite only | Neo4j / Memgraph |
| 🤖 LLM | OpenAI, Anthropic, Ollama | OpenAI, Anthropic, and more |
| 🗄️ Archive | Tiered memory with fallback | |

mem0's numbers reflect its richer ecosystem — more stores, more integrations, more flexibility. R-Mem intentionally trades that for a minimal footprint.

What R-Mem adds beyond mem0

| Feature | R-Mem | mem0 |
| --- | --- | --- |
| Tiered Archive | Deleted/updated memories preserved + fallback search | Gone when deleted |
| FTS5 Pre-filter | Two-stage search: keyword → vector (19x faster) | Vector-only |
| MCP Server | Built-in, rustmem mcp for Claude Code / Cursor | Not available |
| Zero-dependency deploy | Single binary, SQLite, no Docker | Python + pip + vector DB + graph DB |
| Anthropic native | Direct Claude API support | Via OpenAI-compatible proxy |
| Configurable pipeline | [memory] section: thresholds, limits, all tunable | Hardcoded defaults |
| Memory categories | Auto-classified: preference, personal, plan, professional, health | Unstructured |

🔍 How It Works

Input text

├─ 📦 Vector Memory ──────────────────────────────────
│    │
│    ├─ LLM extracts facts
│    │    → ["Name is Alice", "Works at Google"]
│    │
│    ├─ Embedding → cosine similarity search
│    │    (FTS5 pre-filter + vector ranking)
│    │
│    ├─ Integer ID mapping
│    │    (prevents LLM UUID hallucination)
│    │
│    ├─ LLM decides per fact:
│    │    ├─ ADD       new information
│    │    ├─ UPDATE    more specific → old version archived
│    │    ├─ DELETE    contradiction → old version archived
│    │    └─ NONE      duplicate — skip
│    │
│    └─ Execute actions + write history

├─ 🕸️ Graph Memory ──────────────────────────────────
│    │
│    ├─ LLM extracts entities + relations
│    ├─ Conflict detection (soft-delete old, add new)
│    └─ Multi-value vs single-value handling

└─ 🗄️ Archive ───────────────────────────────────────

     ├─ Deleted/superseded memories preserved with embeddings
     ├─ Fallback search when active results are weak
     └─ Auto-compaction when archive exceeds threshold
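
The per-fact decision step in the diagram can be sketched in a few lines of Rust. This is a minimal illustration, not R-Mem's actual code: the `Action` enum, the `Store` struct, and the `mem-0001` ID format are all assumptions made for the example. It shows the integer ID mapping idea (the LLM only sees list positions 0, 1, 2, ..., which are mapped back to real IDs) and how updated or deleted memories move to an archive instead of disappearing.

```rust
use std::collections::HashMap;

/// Decision returned by the LLM for one extracted fact (hypothetical shape).
/// `id` is a position in the candidate list shown to the LLM, never a real ID.
#[derive(Debug, Clone, PartialEq)]
enum Action {
    Add { text: String },
    Update { id: usize, text: String },
    Delete { id: usize },
    None,
}

#[derive(Debug, Default)]
struct Store {
    active: HashMap<String, String>, // real ID -> memory text
    archive: Vec<(String, String)>,  // (real ID, superseded or deleted text)
    next_id: u64,
}

impl Store {
    fn new_id(&mut self) -> String {
        self.next_id += 1;
        format!("mem-{:04}", self.next_id)
    }

    /// Apply the decisions. `candidates` holds the real IDs in the order they
    /// were shown to the LLM, so integer positions map back to real IDs.
    fn apply(&mut self, candidates: &[String], actions: Vec<Action>) {
        for action in actions {
            match action {
                Action::Add { text } => {
                    let id = self.new_id();
                    self.active.insert(id, text);
                }
                Action::Update { id, text } => {
                    // An out-of-range position is ignored rather than trusted.
                    if let Some(real) = candidates.get(id) {
                        if let Some(slot) = self.active.get_mut(real) {
                            let old = std::mem::replace(slot, text);
                            self.archive.push((real.clone(), old));
                        }
                    }
                }
                Action::Delete { id } => {
                    if let Some(real) = candidates.get(id) {
                        if let Some(old) = self.active.remove(real) {
                            self.archive.push((real.clone(), old));
                        }
                    }
                }
                Action::None => {} // duplicate, nothing to do
            }
        }
    }
}
```

With one candidate in the list, an `Update { id: 7, .. }` from the LLM changes nothing, while `Update { id: 0, .. }` replaces the text and pushes the old version into `archive`.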

🚀 Quick Start

Prerequisites

  • Rust 1.75+: curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
  • LLM backend: Ollama, OpenAI, or Anthropic

Install

cargo install rustmem

Or build from source:

git clone https://github.com/Adaimade/R-Mem.git && cd R-Mem
cargo build --release
# → target/release/rustmem (3.6 MB)

Configure

Create rustmem.toml in the project root:

Ollama (local):

[llm]
provider = "openai"
base_url = "http://127.0.0.1:11434"
model = "qwen2.5:32b"

[embedding]
provider = "openai"
base_url = "http://127.0.0.1:11434"
model = "nomic-embed-text"

OpenAI:

[llm]
provider = "openai"
api_key = "sk-..."
model = "gpt-4o"

[embedding]
provider = "openai"
api_key = "sk-..."
model = "text-embedding-3-small"

Anthropic:

[llm]
provider = "anthropic"
api_key = "sk-ant-..."
model = "claude-sonnet-4-6"

[embedding]
provider = "openai"
api_key = "sk-..."
model = "text-embedding-3-small"

Note: Anthropic does not provide embedding models, so [embedding] uses OpenAI or Ollama even when [llm] uses Anthropic.

Security: R-Mem binds to 127.0.0.1 by default (localhost only). Never put API keys in code — use rustmem.toml (gitignored) or environment variables (RUSTMEM__LLM__API_KEY).


🔗 Integration Guide

⚠️ Building a MemoryManager is not enough

The most common integration mistake: you initialize MemoryManager, but never call add() or search() in your conversation loop. The memory system exists but is never used — nothing the user says gets remembered.

The correct conversation loop

Every turn must include two memory operations:

  1. Before LLM call: RECALL (search relevant memories)
  2. After LLM call: LEARN (extract and store new facts)

loop {
    user_message = receive()

    // 1. RECALL — before calling the LLM
    memories = rmem.search(user_id, user_message, limit=10)
    context = format_as_context(memories)

    // 2. Call LLM with memory context
    response = llm.chat(system_prompt + context + user_message)

    // 3. LEARN — after responding
    rmem.add(user_id, user_message)

    send(response)
}
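
The same loop body can be written as a small Rust function. This is a sketch under stated assumptions: the `Memory` trait, `handle_turn`, and `ToyMemory` are hypothetical names that mirror the pseudocode above, not R-Mem's actual Rust API, and the LLM is passed in as a plain closure so the example stays self-contained.

```rust
/// Hypothetical memory interface; the method names mirror the pseudocode,
/// not R-Mem's real API.
trait Memory {
    fn search(&self, user_id: &str, query: &str, limit: usize) -> Vec<String>;
    fn add(&mut self, user_id: &str, text: &str);
}

/// Run one turn: RECALL, call the LLM, then LEARN.
fn handle_turn<M, F>(mem: &mut M, llm: F, user_id: &str, user_message: &str) -> String
where
    M: Memory,
    F: Fn(&str) -> String,
{
    // 1. RECALL before calling the LLM.
    let memories = mem.search(user_id, user_message, 10);
    let mut prompt = String::from("[Memory]\nKnown facts about this user:\n");
    for fact in &memories {
        prompt.push_str("- ");
        prompt.push_str(fact);
        prompt.push('\n');
    }
    prompt.push_str("\nUser: ");
    prompt.push_str(user_message);

    // 2. Call the LLM with the memory context.
    let response = llm(&prompt);

    // 3. LEARN after responding, so the current message is not
    //    recalled as an already-known fact in this same turn.
    mem.add(user_id, user_message);
    response
}

/// Toy in-memory store used only to exercise the loop.
#[derive(Default)]
struct ToyMemory {
    facts: Vec<(String, String)>, // (user_id, text)
}

impl Memory for ToyMemory {
    fn search(&self, user_id: &str, _query: &str, limit: usize) -> Vec<String> {
        self.facts
            .iter()
            .filter(|(u, _)| u == user_id)
            .map(|(_, t)| t.clone())
            .take(limit)
            .collect()
    }

    fn add(&mut self, user_id: &str, text: &str) {
        self.facts.push((user_id.to_string(), text.to_string()));
    }
}
```

Calling `handle_turn` twice for the same user shows the ordering: the first message only appears in the memory context of the second turn.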

Memory context format

Format search() results as context the LLM can understand:

[Memory]
Known facts about this user:
- User's name is Alice
- User prefers dark mode
- User is working on a Rust project

Place this in the system prompt or before the user message so the LLM can reference it.
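
A formatter for this layout is only a few lines. The sketch below assumes search results have already been reduced to plain strings; `format_as_context` is the hypothetical helper named in the pseudocode loop, not a function R-Mem is known to export.

```rust
/// Render search results in the "[Memory]" layout shown above.
/// Returns an empty string when there is nothing to recall, so the
/// caller can skip the block entirely.
fn format_as_context(memories: &[&str]) -> String {
    if memories.is_empty() {
        return String::new();
    }
    let mut out = String::from("[Memory]\nKnown facts about this user:\n");
    for fact in memories {
        out.push_str("- ");
        out.push_str(fact.trim());
        out.push('\n');
    }
    out
}
```

Returning an empty string for an empty result set avoids sending the LLM a heading with no facts under it.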

Multi-scope pattern

If your app serves multiple channels (e.g. Telegram + Discord), use three scope layers:

| Scope | Purpose | Example ID |
| --- | --- | --- |
| local | Single conversation / group | telegram:group_123 |
| user | Cross-channel personal memory | user:456 |
| global | Shared across all users | global:system |

Merge results at recall time:

local_results  = search("telegram:group_123", query)
user_results   = search("user:456", query)
global_results = search("global:system", query)
all = deduplicate(local + user + global)
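
One way to implement the merge step is sketched below. The `Hit` struct and `merge_scopes` function are assumptions for illustration. Deduplication here is a simple case-insensitive text match, which is a deliberately naive stand-in for whatever similarity rule your app needs.

```rust
use std::collections::HashSet;

/// One search hit. The field names are assumptions for this sketch.
#[derive(Debug, Clone, PartialEq)]
struct Hit {
    text: String,
    score: f32,
}

/// Merge hits from several scopes. The first occurrence of each distinct
/// text wins (compared trimmed and lowercased), so pass the most specific
/// scope first. The result is sorted by score, best first, and truncated.
fn merge_scopes(scopes: Vec<Vec<Hit>>, limit: usize) -> Vec<Hit> {
    let mut seen = HashSet::new();
    let mut merged: Vec<Hit> = Vec::new();
    for hit in scopes.into_iter().flatten() {
        let key = hit.text.trim().to_lowercase();
        if seen.insert(key) {
            merged.push(hit);
        }
    }
    merged.sort_by(|a, b| b.score.total_cmp(&a.score));
    merged.truncate(limit);
    merged
}
```

Passing `vec![local, user, global]` keeps the local copy when the same fact exists in more than one scope.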

Common mistakes

  • ❌ Initialize MemoryManager but never call search() / add() in the loop
  • ❌ Only LEARN without RECALL (memories stored but never retrieved)
  • ❌ Only RECALL without LEARN (reads old memories but never learns new ones)
  • ❌ Put add() before the LLM call (current message gets treated as known fact)

📖 Usage

CLI

# Add memories
rustmem add -u alice "My name is Alice and I work at Google. I love sushi."

# Semantic search
rustmem search -u alice "What does Alice eat?"

# List all memories for a user
rustmem list -u alice

# Show graph relations
rustmem graph -u alice

# Start REST API server
rustmem server

REST API

Start with rustmem server, then:

# ➕ Add memory
curl -X POST http://localhost:8019/memories/add \
  -H 'Content-Type: application/json' \
  -d '{"user_id": "alice", "text": "I moved to Tokyo last month"}'

# 🔍 Search
curl -X POST http://localhost:8019/memories/search \
  -H 'Content-Type: application/json' \
  -d '{"user_id": "alice", "query": "where does she live", "limit": 5}'

# 📋 List all
curl 'http://localhost:8019/memories?user_id=alice'

# 🏷️ Filter by category (preference, personal, plan, professional, health, misc)
# Quote the URL: an unquoted & would send curl to the background in most shells
curl 'http://localhost:8019/memories?user_id=alice&category=preference'

# 🗑️ Delete
curl -X DELETE http://localhost:8019/memories/{id}

# 📜 History
curl http://localhost:8019/memories/{id}/history

# 🗄️ View archived memories
curl 'http://localhost:8019/archive?user_id=alice'

# 🕸️ View graph relations
curl 'http://localhost:8019/graph?user_id=alice'

Drop-in for AI Agents

# mem0 (before)
from mem0 import Memory
m = Memory()
m.add("Alice loves sushi", user_id="alice")

# R-Mem (after — just switch to HTTP)
import httpx
httpx.post("http://localhost:8019/memories/add",
    json={"user_id": "alice", "text": "Alice loves sushi"})

🔌 MCP Server

R-Mem works as an MCP server — give Claude Code or Cursor long-term memory with one command:

# Claude Code
claude mcp add rustmem -- /path/to/rustmem mcp

# Cursor (.cursor/mcp.json)
{
  "mcpServers": {
    "rustmem": {
      "command": "/path/to/rustmem",
      "args": ["mcp"]
    }
  }
}

7 tools available: add_memory, search_memory, list_memories, get_memory, delete_memory, get_graph, reset_memories


⚡ Performance

Benchmarked on Apple Silicon with 10,000 memories (768-dim embeddings):

| Operation | Time | Notes |
| --- | --- | --- |
| Write | 36 µs/record | 10K records in 360 ms |
| Brute-force search | 35.8 ms | Scans all 10K embeddings |
| FTS5 + vector search | 1.9 ms | 19x faster: pre-filters, then re-ranks |
| Concurrent reads | 2.4 ms/thread | 10 threads, WAL mode, no blocking |
| Storage | 4.2 KB/memory | 10K memories ≈ 40 MB |

Run the benchmark yourself:

cargo bench --bench store_bench
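
The speedup comes from scoring far fewer embeddings. The sketch below illustrates the two-stage idea in plain Rust: a keyword filter narrows the candidate set, then cosine similarity ranks the survivors. It is not R-Mem's implementation. The substring match is a stand-in for a real SQLite FTS5 query, and the `Record` struct, the function names, and the full-scan fallback when no keyword matches are all assumptions.

```rust
/// Cosine similarity of two equal-length vectors; 0.0 if either is all zeros.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 {
        0.0
    } else {
        dot / (na * nb)
    }
}

struct Record {
    text: String,
    embedding: Vec<f32>,
}

/// Stage 1: keep records containing at least one query keyword.
/// Stage 2: rank the survivors by cosine similarity to `query_vec`.
/// If stage 1 matches nothing, every record is scored (brute force).
fn two_stage_search<'a>(
    records: &'a [Record],
    query: &str,
    query_vec: &[f32],
    limit: usize,
) -> Vec<(&'a str, f32)> {
    let keywords: Vec<String> = query
        .split_whitespace()
        .map(|w| w.to_lowercase())
        .collect();

    let mut pool: Vec<&Record> = records
        .iter()
        .filter(|r| {
            let text = r.text.to_lowercase();
            keywords.iter().any(|k| text.contains(k.as_str()))
        })
        .collect();
    if pool.is_empty() {
        pool = records.iter().collect();
    }

    let mut scored: Vec<(&str, f32)> = pool
        .into_iter()
        .map(|r| (r.text.as_str(), cosine(&r.embedding, query_vec)))
        .collect();
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(limit);
    scored
}
```

The cost of stage 2 scales with the size of the filtered pool rather than the whole table, which is the effect the benchmark table measures.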

LongMemEval

LongMemEval (ICLR 2025) — 500 questions testing long-term memory across 5 capabilities:

| System | Score | Notes |
| --- | --- | --- |
| agentmemory | 96.2% | RAG (stores raw text) |
| MemLayer | 94.4% | RAG (layered index) |
| Zep | 63.8% | RAG + summary |
| mem0 | ~49% | Fact extraction (gpt-4o) |
| R-Mem | 48.2% | Fact extraction (gpt-4o-mini) |

R-Mem comes within about one point of mem0 (48.2% vs ~49%) while using gpt-4o-mini, a roughly 20x cheaper model than gpt-4o. The gap vs RAG systems is architectural: R-Mem extracts and deduplicates facts rather than storing raw text, which trades verbatim recall for efficient long-term knowledge management.


🏗️ Architecture

src/
├── main.rs          CLI entry point (clap)
├── config.rs        TOML + env var config
├── server.rs        REST API (axum)
├── mcp.rs           MCP server (rmcp) — 7 tools over stdio
├── memory.rs        Core orchestrator — tiered memory pipeline
├── extract.rs       LLM calls: OpenAI + Anthropic native
├── embedding.rs     OpenAI-compatible embedding client
├── store.rs         SQLite vector store (WAL + FTS5 + archive)
└── graph.rs         SQLite graph store (soft-delete, multi-value)

9 files. 2,826 lines. 3.6 MB binary. Zero external services.
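
To make the "soft-delete, multi-value" note on graph.rs concrete, here is a minimal in-memory sketch of that behavior. It is an illustration only: the `Edge` struct, the `SINGLE_VALUE` list, and both function names are assumptions, and the real store keeps its rows in SQLite.

```rust
/// A directed relation between two entities.
/// `deleted` marks a soft-deleted edge that is kept but no longer live.
#[derive(Debug, Clone, PartialEq)]
struct Edge {
    source: String,
    relation: String,
    target: String,
    deleted: bool,
}

/// Relations treated as single-value in this sketch (an assumed list).
const SINGLE_VALUE: &[&str] = &["lives_in", "works_at"];

/// Insert a relation. For single-value relations, older live edges with the
/// same source and relation are soft-deleted first; multi-value relations
/// simply accumulate. Re-adding an existing live edge is a no-op.
fn upsert_edge(edges: &mut Vec<Edge>, source: &str, relation: &str, target: &str) {
    let exists = edges.iter().any(|e| {
        !e.deleted && e.source == source && e.relation == relation && e.target == target
    });
    if exists {
        return;
    }
    if SINGLE_VALUE.contains(&relation) {
        for e in edges.iter_mut() {
            if !e.deleted && e.source == source && e.relation == relation {
                e.deleted = true; // keep the row, hide it from live queries
            }
        }
    }
    edges.push(Edge {
        source: source.to_string(),
        relation: relation.to_string(),
        target: target.to_string(),
        deleted: false,
    });
}

/// Live (not soft-deleted) targets for a source and relation.
fn live_targets<'a>(edges: &'a [Edge], source: &str, relation: &str) -> Vec<&'a str> {
    edges
        .iter()
        .filter(|e| !e.deleted && e.source == source && e.relation == relation)
        .map(|e| e.target.as_str())
        .collect()
}
```

Adding "lives_in Tokyo" after "lives_in Taipei" leaves only Tokyo live, while two "likes" edges coexist.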


🗺️ Roadmap

| Status | Feature | Description |
| --- | --- | --- |
| ✅ | Published on crates.io | cargo install rustmem: one-line install |
| ✅ | MCP Server | rustmem mcp: 7 tools over stdio for Claude Code / Cursor |
| ✅ | Tiered Archive | Deleted/updated memories preserved + fallback search |
| ✅ | FTS5 Two-Stage Search | Keyword pre-filter + vector re-rank, 19x faster |
| ✅ | Memory Categories | Auto-classified: preference, personal, plan, professional, health |
| ✅ | Anthropic Native | Direct Claude API support (no proxy needed) |
| ✅ | Agent SDK (lib crate) | Use rustmem::{memory, store, graph} directly in your Rust code |
| ✅ | LongMemEval Benchmark | 48.2% with gpt-4o-mini, nearly matching mem0 (~49%) |
| ✅ | Production Audit | 11 security/stability fixes, 25 unit tests, cargo bench |
| 🔲 | Episodic Memory | Task execution history (tool calls, params, results) |
| 🔲 | User Preference Model | Cross-session user style and behavior modeling |
| 🔲 | Skill Abstraction | Auto-extract repeated successful patterns into skills |
| 🔲 | Batch Import | Load existing mem0 exports |
| 🔲 | Multi-modal | Image / audio memory support |
| 🔲 | Dashboard | Lightweight web UI for memory inspection |

R-Mem v0.3.0 is feature-complete as a learning project. The core architecture is stable and production-hardened. Community contributions, forks, and explorations are welcome — open an issue or PR.


MIT License · v0.3.0

Created by Ad Huang with Claude Code