R-Mem

April 13, 2026 · View on GitHub

R-Mem

Long-term memory for AI agents — in Rust

A lightweight study of mem0's memory architecture.
Single binary. SQLite-backed. No Python.

3.6 MB binary · 2,826 lines of Rust · < 10 MB RAM · SQLite only · MCP ready · LongMemEval 48.2%

Quick Start · Integration · How It Works · Usage · MCP · Performance · Architecture · Roadmap

Note

This project reimplements mem0's elegant memory architecture in Rust as a learning exercise. Full credit to the mem0 team for the original design. This is not a replacement — it's a study of their approach using a different language. Discussions, ideas, and contributions are welcome!

Why R-Mem?

mem0 is a well-designed memory system with a rich plugin ecosystem. R-Mem asks a narrower question: what if we rewrite just the core memory logic in Rust, backed entirely by SQLite?

The result is the same three-tier architecture — vector memory, graph memory, history — plus a tiered archive system, in 2,826 lines of Rust. No external services. One binary. The trade-off is clear: far fewer integrations, but near-zero operational overhead.

R-Mem was born out of RustClaw — our minimalist Rust AI agent framework. RustClaw needed a memory layer that matched its philosophy: single binary, zero external services. So we studied mem0's architecture and rebuilt it in Rust.

	R-Mem	mem0
📦 Binary	3.6 MB static	Python + pip (rich ecosystem)
💾 Idle RSS	< 10 MB	200 MB+ (more features loaded)
📝 Code	2,826 lines	~91,500 lines (26+ store drivers)
🔍 Vector	SQLite + FTS5	Qdrant, Chroma, Pinecone, …
🕸️ Graph	SQLite only	Neo4j / Memgraph
🤖 LLM	OpenAI, Anthropic, Ollama	OpenAI, Anthropic, and more
🗄️ Archive	Tiered memory with fallback	—

mem0's numbers reflect its richer ecosystem — more stores, more integrations, more flexibility. R-Mem intentionally trades that for a minimal footprint.

What R-Mem adds beyond mem0

Feature	R-Mem	mem0
Tiered Archive	Deleted/updated memories preserved + fallback search	Gone when deleted
FTS5 Pre-filter	Two-stage search: keyword → vector (19x faster)	Vector-only
MCP Server	Built-in, `rustmem mcp` for Claude Code / Cursor	Not available
Zero-dependency deploy	Single binary, SQLite, no Docker	Python + pip + vector DB + graph DB
Anthropic native	Direct Claude API support	Via OpenAI-compatible proxy
Configurable pipeline	`[memory]` section: thresholds, limits, all tunable	Hardcoded defaults
Memory categories	Auto-classified: preference, personal, plan, professional, health	Unstructured

🔍 How It Works

Input text
│
├─ 📦 Vector Memory ──────────────────────────────────
│    │
│    ├─ LLM extracts facts
│    │    → ["Name is Alice", "Works at Google"]
│    │
│    ├─ Embedding → cosine similarity search
│    │    (FTS5 pre-filter + vector ranking)
│    │
│    ├─ Integer ID mapping
│    │    (prevents LLM UUID hallucination)
│    │
│    ├─ LLM decides per fact:
│    │    ├─ ADD       new information
│    │    ├─ UPDATE    more specific → old version archived
│    │    ├─ DELETE    contradiction → old version archived
│    │    └─ NONE      duplicate — skip
│    │
│    └─ Execute actions + write history
│
├─ 🕸️ Graph Memory ──────────────────────────────────
│    │
│    ├─ LLM extracts entities + relations
│    ├─ Conflict detection (soft-delete old, add new)
│    └─ Multi-value vs single-value handling
│
└─ 🗄️ Archive ───────────────────────────────────────
     │
     ├─ Deleted/superseded memories preserved with embeddings
     ├─ Fallback search when active results are weak
     └─ Auto-compaction when archive exceeds threshold

🚀 Quick Start

Prerequisites

Requirement	Install
Rust 1.75+	`curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs \| sh`
LLM backend	Ollama, OpenAI, or Anthropic

Install

cargo install rustmem

Or build from source:

git clone https://github.com/Adaimade/R-Mem.git && cd R-Mem
cargo build --release
# → target/release/rustmem (3.6 MB)

Configure

Create rustmem.toml in the project root:

Ollama (local) OpenAI Anthropic

[llm]
provider = "openai"
base_url = "http://127.0.0.1:11434"
model = "qwen2.5:32b"

[embedding]
provider = "openai"
base_url = "http://127.0.0.1:11434"
model = "nomic-embed-text"

[llm]
provider = "openai"
api_key = "sk-..."
model = "gpt-4o"

[embedding]
provider = "openai"
api_key = "sk-..."
model = "text-embedding-3-small"

[llm]
provider = "anthropic"
api_key = "sk-ant-..."
model = "claude-sonnet-4-6"

[embedding]
provider = "openai"
api_key = "sk-..."
model = "text-embedding-3-small"

Note: Anthropic does not provide embedding models, so [embedding] uses OpenAI or Ollama even when [llm] uses Anthropic.

Security: R-Mem binds to 127.0.0.1 by default (localhost only). Never put API keys in code — use rustmem.toml (gitignored) or environment variables (RUSTMEM__LLM__API_KEY).

🔗 Integration Guide

⚠️ Building a MemoryManager is not enough

The most common integration mistake: you initialize MemoryManager, but never call add() or search() in your conversation loop. The memory system exists but is never used — nothing the user says gets remembered.

The correct conversation loop

Every turn must include two memory operations:

Before LLM call — RECALL (search relevant memories)
After LLM call — LEARN (extract and store new facts)

loop {
    user_message = receive()

    // 1. RECALL — before calling the LLM
    memories = rmem.search(user_id, user_message, limit=10)
    context = format_as_context(memories)

    // 2. Call LLM with memory context
    response = llm.chat(system_prompt + context + user_message)

    // 3. LEARN — after responding
    rmem.add(user_id, user_message)

    send(response)
}

Memory context format

Format search() results as context the LLM can understand:

[Memory]
Known facts about this user:
- User's name is Alice
- User prefers dark mode
- User is working on a Rust project

Place this in the system prompt or before the user message so the LLM can reference it.

Multi-scope pattern

If your app serves multiple channels (e.g. Telegram + Discord), use three scope layers:

Scope	Purpose	Example ID
local	Single conversation / group	`telegram:group_123`
user	Cross-channel personal memory	`user:456`
global	Shared across all users	`global:system`

Merge results at recall time:

local_results  = search("telegram:group_123", query)
user_results   = search("user:456", query)
global_results = search("global:system", query)
all = deduplicate(local + user + global)

Common mistakes

❌ Initialize MemoryManager but never call search() / add() in the loop
❌ Only LEARN without RECALL (memories stored but never retrieved)
❌ Only RECALL without LEARN (reads old memories but never learns new ones)
❌ Put add() before the LLM call (current message gets treated as known fact)

📖 Usage

CLI

# Add memories
rustmem add -u alice "My name is Alice and I work at Google. I love sushi."

# Semantic search
rustmem search -u alice "What does Alice eat?"

# List all memories for a user
rustmem list -u alice

# Show graph relations
rustmem graph -u alice

# Start REST API server
rustmem server

REST API

Start with rustmem server, then:

# ➕ Add memory
curl -X POST http://localhost:8019/memories/add \
  -H 'Content-Type: application/json' \
  -d '{"user_id": "alice", "text": "I moved to Tokyo last month"}'

# 🔍 Search
curl -X POST http://localhost:8019/memories/search \
  -H 'Content-Type: application/json' \
  -d '{"user_id": "alice", "query": "where does she live", "limit": 5}'

# 📋 List all
curl http://localhost:8019/memories?user_id=alice

# 🏷️ Filter by category (preference, personal, plan, professional, health, misc)
curl http://localhost:8019/memories?user_id=alice&category=preference

# 🗑️ Delete
curl -X DELETE http://localhost:8019/memories/{id}

# 📜 History
curl http://localhost:8019/memories/{id}/history

# 🗄️ View archived memories
curl http://localhost:8019/archive?user_id=alice

# 🕸️ View graph relations
curl http://localhost:8019/graph?user_id=alice

Drop-in for AI Agents

# mem0 (before)
from mem0 import Memory
m = Memory()
m.add("Alice loves sushi", user_id="alice")

# R-Mem (after — just switch to HTTP)
import httpx
httpx.post("http://localhost:8019/memories/add",
    json={"user_id": "alice", "text": "Alice loves sushi"})

🔌 MCP Server

R-Mem works as an MCP server — give Claude Code or Cursor long-term memory with one command:

# Claude Code
claude mcp add rustmem -- /path/to/rustmem mcp

# Cursor (.cursor/mcp.json)
{
  "mcpServers": {
    "rustmem": {
      "command": "/path/to/rustmem",
      "args": ["mcp"]
    }
  }
}

7 tools available: add_memory, search_memory, list_memories, get_memory, delete_memory, get_graph, reset_memories

⚡ Performance

Benchmarked on Apple Silicon with 10,000 memories (768-dim embeddings):

Operation	Time	Notes
Write	36 µs/record	10K records in 360ms
Brute-force search	35.8 ms	Scans all 10K embeddings
FTS5 + vector search	1.9 ms	19x faster — pre-filters then re-ranks
Concurrent reads	2.4 ms/thread	10 threads, WAL mode, no blocking
Storage	4.2 KB/memory	10K memories = 40 MB

Run the benchmark yourself:

cargo bench --bench store_bench

LongMemEval

LongMemEval (ICLR 2025) — 500 questions testing long-term memory across 5 capabilities:

System	Score	Notes
agentmemory	96.2%	RAG (stores raw text)
MemLayer	94.4%	RAG (layered index)
Zep	63.8%	RAG + summary
mem0	~49%	Fact extraction (gpt-4o)
R-Mem	48.2%	Fact extraction (gpt-4o-mini)

R-Mem nearly matches mem0 using a 20x cheaper model. The gap vs RAG systems is architectural — R-Mem extracts and deduplicates facts rather than storing raw text, which trades verbatim recall for efficient long-term knowledge management.

🏗️ Architecture

src/
├── main.rs          CLI entry point (clap)
├── config.rs        TOML + env var config
├── server.rs        REST API (axum)
├── mcp.rs           MCP server (rmcp) — 7 tools over stdio
├── memory.rs        Core orchestrator — tiered memory pipeline
├── extract.rs       LLM calls: OpenAI + Anthropic native
├── embedding.rs     OpenAI-compatible embedding client
├── store.rs         SQLite vector store (WAL + FTS5 + archive)
└── graph.rs         SQLite graph store (soft-delete, multi-value)

9 files. 2,826 lines. 3.6 MB binary. Zero external services.

🗺️ Roadmap

Status	Feature	Description
✅	Published on crates.io	`cargo install rustmem` — one-line install
✅	MCP Server	`rustmem mcp` — 7 tools over stdio for Claude Code / Cursor
✅	Tiered Archive	Deleted/updated memories preserved + fallback search
✅	FTS5 Two-Stage Search	Keyword pre-filter + vector re-rank — 19x faster
✅	Memory Categories	Auto-classified: preference, personal, plan, professional, health
✅	Anthropic Native	Direct Claude API support (no proxy needed)
✅	Agent SDK (lib crate)	Use `rustmem::{memory, store, graph}` directly in your Rust code
✅	LongMemEval Benchmark	48.2% with gpt-4o-mini, nearly matching mem0 (~49%)
✅	Production Audit	11 security/stability fixes, 25 unit tests, cargo bench
🔲	Episodic Memory	Task execution history (tool calls, params, results)
🔲	User Preference Model	Cross-session user style and behavior modeling
🔲	Skill Abstraction	Auto-extract repeated successful patterns into skills
🔲	Batch Import	Load existing mem0 exports
🔲	Multi-modal	Image / audio memory support
🔲	Dashboard	Lightweight web UI for memory inspection

R-Mem v0.3.0 is feature-complete as a learning project. The core architecture is stable and production-hardened. Community contributions, forks, and explorations are welcome — open an issue or PR.

MIT License · v0.3.0

Created by Ad Huang with Claude Code