Privacy & Data Handling

June 1, 2026

TL;DR — Memtrace runs entirely on your machine. Your source code never leaves it.

What Memtrace Does Locally

Memtrace builds a structural knowledge graph from your codebase's AST. Every step happens on your machine:

Step	Where it runs	What it processes
AST parsing	Local (Tree-sitter, compiled into the binary)	Source files → symbol nodes
Graph construction	Local (MemDB, embedded or self-hosted)	Nodes + edges (CALLS, IMPLEMENTS, IMPORTS)
Vector embeddings	Local (ONNX Runtime via fastembed — CoreML on Apple Silicon, CPU elsewhere)	Symbol signatures → vectors stored in local MemDB
Full-text search	Local (Tantivy BM25 index on disk)	Symbol names + signatures
Git history analysis	Local (libgit2, vendored)	Commit history → bi-temporal graph
MCP tool queries	Local (graph traversal + search)	Results returned to your local MCP client

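By default, no source code, file contents, symbol names, embeddings, file paths, or AST data is transmitted to any external server. The single opt-in exception, the Weekly Memtrace Receipt, is described under Usage Heartbeat.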
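To make the "nodes + edges" model concrete, here is a minimal in-memory sketch of a symbol graph with the three edge kinds named in the table. It is an illustration only: the `SymbolGraph` class and its methods are hypothetical and do not reflect MemDB's actual storage or API. The `counts()` method shows how aggregate totals can be derived without exposing any symbol names.

```python
from dataclasses import dataclass, field

# Edge kinds taken from the table above.
EDGE_KINDS = {"CALLS", "IMPLEMENTS", "IMPORTS"}


@dataclass
class SymbolGraph:
    """Hypothetical stand-in for a local symbol graph (not MemDB's real API)."""

    # Adjacency list: source symbol -> list of (edge kind, target symbol).
    edges: dict = field(default_factory=dict)

    def add_edge(self, src, kind, dst):
        if kind not in EDGE_KINDS:
            raise ValueError(f"unknown edge kind: {kind}")
        self.edges.setdefault(src, []).append((kind, dst))

    def callees(self, src):
        # Follow only CALLS edges out of `src`.
        return [dst for kind, dst in self.edges.get(src, []) if kind == "CALLS"]

    def counts(self):
        # Aggregate totals only; no symbol names appear in the result.
        nodes = set(self.edges)
        for targets in self.edges.values():
            nodes.update(dst for _, dst in targets)
        total_edges = sum(len(targets) for targets in self.edges.values())
        return {"totalNodes": len(nodes), "totalEdges": total_edges}


graph = SymbolGraph()
graph.add_edge("main", "CALLS", "parse")
graph.add_edge("parse", "CALLS", "tokenize")
graph.add_edge("Parser", "IMPLEMENTS", "Visitor")
```

Everything in this sketch stays in process memory, which mirrors the claim that graph construction and traversal are local operations.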

What Leaves Your Machine

Memtrace makes exactly four types of network calls:

1. License Authentication


Endpoint	`POST https://www.memtrace.io/api/device/auth`
Data sent	License key (`MTC-COM-...`) + machine hostname
Purpose	Validate your license and obtain a session token
Frequency	On startup; refresh when session nears expiry
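As a rough illustration of how small this request is, the sketch below builds a body with just the two documented pieces of data. The JSON field names `licenseKey` and `hostname` are assumptions, since the actual wire format is not documented here, and the key shown is a placeholder. The sketch only constructs the body and does not send anything.

```python
import json
import socket

# Endpoint copied from the table above.
AUTH_URL = "https://www.memtrace.io/api/device/auth"


def build_auth_payload(license_key, hostname=None):
    """Build an auth body holding only the license key and machine hostname.

    The field names are hypothetical; only the two data items are documented.
    """
    return {
        "licenseKey": license_key,
        "hostname": hostname or socket.gethostname(),
    }


# Placeholder key and hostname, for illustration only.
payload = build_auth_payload("MTC-COM-XXXX", hostname="dev-laptop")
body = json.dumps(payload, sort_keys=True)
```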

2. Usage Heartbeat


Endpoint	`POST https://www.memtrace.io/api/device/heartbeat`
Data sent	Aggregate integer counts only: total nodes, edges, episodes, repositories
Purpose	Usage metering and entitlement checks
Frequency	Every 15 minutes while running

By default the heartbeat payload contains no symbol names, no file paths, no code, and no embeddings, only integer totals such as `{ "totalNodes": 4022, "totalEdges": 18441 }`.
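If you capture outbound traffic for a security review, a check along these lines can confirm that a default heartbeat body holds nothing but counters. This is a sketch under assumptions: `totalNodes` and `totalEdges` come from the example above, while `totalEpisodes` and `totalRepositories` are guessed key names for the other two documented counts.

```python
# totalNodes/totalEdges appear in the documented example; the other two
# key names are assumptions based on the "episodes, repositories" counts.
ALLOWED_COUNTERS = {"totalNodes", "totalEdges", "totalEpisodes", "totalRepositories"}


def is_counts_only(payload):
    """Return True if every key is a known counter with a non-negative int value."""
    return all(
        key in ALLOWED_COUNTERS and type(value) is int and value >= 0
        for key, value in payload.items()
    )


default_heartbeat = {"totalNodes": 4022, "totalEdges": 18441}
suspicious = {"totalNodes": 4022, "symbols": ["main"]}
```

Here `is_counts_only(default_heartbeat)` is true and `is_counts_only(suspicious)` is false, because the second payload carries a key outside the counter set.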

The one exception is the Weekly Memtrace Receipt feature (off by default, opt-in via the memtrace.io account dashboard). When that toggle is on, the heartbeat additionally carries a small symbol-name surface that powers the weekly summary email. Set `MEMTRACE_NO_REMOTE_RECEIPT=1` on a specific machine to keep the receipt feature off regardless of the account-level toggle. Full breakdown: `docs/telemetry-compliance-datasheet.md` §6.4.
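The precedence described above (the machine-level variable overrides the account-level toggle) can be sketched as a small decision function. The function name is hypothetical, and treating only the literal value `1` as "set" is an assumption, since that is the only value documented here.

```python
def receipt_enabled(account_toggle_on, env):
    """Decide whether the Weekly Memtrace Receipt data may be sent.

    `env` is a mapping such as os.environ. Only the value "1" is
    documented, so other values are treated as unset (an assumption).
    """
    if env.get("MEMTRACE_NO_REMOTE_RECEIPT") == "1":
        return False  # the machine-level override always wins
    return account_toggle_on  # otherwise follow the account dashboard toggle
```

For example, `receipt_enabled(True, {"MEMTRACE_NO_REMOTE_RECEIPT": "1"})` is false even though the account toggle is on.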

3. Embedding Model Download (One-Time)


Source	HuggingFace Hub (via the `fastembed` library)
Data sent	Nothing — this is an inbound download only
What's downloaded	ONNX model weights (e.g., BGE-small-en-v1.5)
Frequency	Once on first run; cached at `~/.cache/fastembed/`
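To check whether the one-time download has already happened on a machine, you can look for content in the cache directory named above. This sketch assumes that any non-empty `~/.cache/fastembed/` directory means a model is cached. It does not inspect fastembed's internal file layout, which is not documented here.

```python
from pathlib import Path


def fastembed_cache_dir(home=None):
    """Return the cache location from the table above, under the given home directory."""
    base = Path(home) if home else Path.home()
    return base / ".cache" / "fastembed"


def model_cached(cache_dir):
    """Assumption: a non-empty cache directory means a model was already downloaded."""
    return cache_dir.is_dir() and any(cache_dir.iterdir())
```

Calling `model_cached(fastembed_cache_dir())` before first launch on an air-gapped machine tells you whether Memtrace will need network access to fetch the weights.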

4. Product Telemetry (since v0.3.17)


Endpoint	`POST https://memtrace.io/api/telemetry/ingest`
Data sent	App-start events, indexing/embedding durations, aggregate PR review/watch counters, panic reports, and `WARN`/`ERROR` log lines from Memtrace's own crates, all sanitised to strip home-dir paths, token-shaped strings, and email addresses. Also content-free Rail routing-quality buckets (mode, pattern shape, hit/miss, a bucketed score, and a local relevance yes/no), never the search text or which files matched. The background daemon measures the Rail buckets asynchronously, so they never add latency to a search
Purpose	Catch crashes and regressions across the user base, such as the M3-Air "stuck on Loading embedding model" hang or Windows MSVC build failures. For Rail, measure whether graph-backed search results are relevant, so that the decision to make Rail active by default rests on real evidence
Frequency	Batched flush every 60 seconds while running
Opt-out	`MEMTRACE_TELEMETRY=off` disables all of it (also `0`/`false`/`disabled`/`no`); `MEMTRACE_RAIL_SHADOW=off` disables just the Rail buckets; `MEMTRACE_RAIL_SHADOW_SAMPLE=0..1` bounds the background measurement rate
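The opt-out rules in the table can be read as the following decision logic. This is a sketch, not Memtrace's actual parser: case-insensitive matching, the default sample rate of 1.0, clamping of out-of-range values, and the fallback for unparsable input are all assumptions.

```python
# Values documented as disabling telemetry.
OFF_VALUES = {"off", "0", "false", "disabled", "no"}


def telemetry_enabled(env):
    """Telemetry is on unless MEMTRACE_TELEMETRY holds a documented off value."""
    return env.get("MEMTRACE_TELEMETRY", "").strip().lower() not in OFF_VALUES


def rail_shadow_enabled(env):
    """Rail buckets need telemetry on and MEMTRACE_RAIL_SHADOW not set to off."""
    if not telemetry_enabled(env):
        return False
    return env.get("MEMTRACE_RAIL_SHADOW", "").strip().lower() != "off"


def rail_sample_rate(env, default=1.0):
    """Read MEMTRACE_RAIL_SHADOW_SAMPLE and clamp it to the range 0..1.

    The default and the fallback for unparsable input are assumptions.
    """
    raw = env.get("MEMTRACE_RAIL_SHADOW_SAMPLE")
    if raw is None:
        return default
    try:
        value = float(raw)
    except ValueError:
        return default
    return min(1.0, max(0.0, value))
```

Passing `os.environ` as `env` evaluates the settings for the current shell. The functions take a plain mapping so they are easy to test.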

The telemetry payload never contains source code, file contents, symbol names, embeddings, repository paths, the text of your search commands, which files or symbols a search matched, GitHub PR URLs, PR discussion text, reviewer identities, branch names, or commit data. The schema on the receiving end has no column to hold any of those — we'd have to ship a new release to even start collecting them, and we'd announce it here first. Full breakdown: TELEMETRY.md.
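The kind of sanitisation described for log lines (stripping home-dir paths, token-shaped strings, and email addresses) can be approximated with a few regular expressions. The patterns below are illustrative assumptions, not Memtrace's actual rules. In particular, "token-shaped" is taken here to mean any run of 32 or more URL-safe characters.

```python
import re

# Illustrative patterns only; the real sanitiser's rules are not documented here.
HOME_DIR = re.compile(r"(?:/home/|/Users/|[A-Za-z]:\\Users\\)[^/\\\s]+")
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
TOKEN = re.compile(r"\b[A-Za-z0-9_-]{32,}\b")  # assumed meaning of "token-shaped"


def sanitise(line):
    """Replace home directories, email addresses, and long token-like strings."""
    line = HOME_DIR.sub("<home>", line)
    line = EMAIL.sub("<email>", line)
    line = TOKEN.sub("<token>", line)
    return line


raw = "WARN cannot read /home/alice/.cache/fastembed/model.onnx (owner bob@example.com)"
clean = sanitise(raw)
# clean == "WARN cannot read <home>/.cache/fastembed/model.onnx (owner <email>)"
```

Redaction with regular expressions is best-effort by nature. That is why the schema-level guarantee described above (no column exists to hold such data) is the stronger of the two protections.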

What We Don't Do

❌ We do not send source code to any server
❌ We do not use cloud-based embedding APIs (OpenAI, Cohere, etc.)
❌ We do not transmit symbol names, file paths, or any structural data outside the sanitised crash/error/event payloads documented above
❌ We do not store or share IP addresses (standard request logs are kept 7 days for abuse mitigation only)
❌ We do not sell, share, or publish anonymised aggregates of telemetry data without notice

Questions?

If you have questions about data handling or need a security review for your organization, please open an issue or contact us at support@syncable.dev.