Privacy & Data Handling

June 1, 2026

TL;DR — Memtrace runs entirely on your machine. Your source code never leaves it.

What Memtrace Does Locally

Memtrace builds a structural knowledge graph from your codebase's AST. Every step happens on your machine:

| Step | Where it runs | What it processes |
| --- | --- | --- |
| AST parsing | Local (Tree-sitter, compiled into the binary) | Source files → symbol nodes |
| Graph construction | Local (MemDB, embedded or self-hosted) | Nodes + edges (CALLS, IMPLEMENTS, IMPORTS) |
| Vector embeddings | Local (ONNX Runtime via fastembed; CoreML on Apple Silicon, CPU elsewhere) | Symbol signatures → vectors stored in local MemDB |
| Full-text search | Local (Tantivy BM25 index on disk) | Symbol names + signatures |
| Git history analysis | Local (libgit2, vendored) | Commit history → bi-temporal graph |
| MCP tool queries | Local (graph traversal + search) | Results returned to your local MCP client |

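By default, no source code, file contents, symbol names, embeddings, file paths, or AST data is transmitted to any external server. The one opt-in exception, the Weekly Memtrace Receipt, is covered under Usage Heartbeat below.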

What Leaves Your Machine

Memtrace makes exactly four types of network calls:

1. License Authentication

| Endpoint | POST https://www.memtrace.io/api/device/auth |
| --- | --- |
| Data sent | License key (MTC-COM-...) + machine hostname |
| Purpose | Validate your license and obtain a session token |
| Frequency | On startup; refresh when session nears expiry |

2. Usage Heartbeat

| Endpoint | POST https://www.memtrace.io/api/device/heartbeat |
| --- | --- |
| Data sent | Aggregate integer counts only: total nodes, edges, episodes, repositories |
| Purpose | Usage metering and entitlement checks |
| Frequency | Every 15 minutes while running |

By default the heartbeat payload contains no symbol names, no file paths, no code, and no embeddings — only integer totals like { "totalNodes": 4022, "totalEdges": 18441 }.

The one exception is the Weekly Memtrace Receipt feature (off by default, opt-in via the memtrace.io account dashboard). When that toggle is on, the heartbeat additionally carries a small symbol-name surface that powers the weekly summary email. Set MEMTRACE_NO_REMOTE_RECEIPT=1 on a specific machine to keep the receipt feature off regardless of the account-level toggle. Full breakdown: docs/telemetry-compliance-datasheet.md §6.4.
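For anyone auditing captured traffic, the default heartbeat shape described above can be checked mechanically. The following is a minimal Python sketch, not Memtrace's own code: `is_aggregate_only` is a hypothetical helper, only the `totalNodes` and `totalEdges` keys come from the example above, and the `topSymbols` key is invented purely to show a body that would fail the check.

```python
import json

def is_aggregate_only(raw_body: str) -> bool:
    """Return True if a JSON body is a flat object whose values are all integers."""
    payload = json.loads(raw_body)
    if not isinstance(payload, dict):
        return False
    # bool is a subclass of int in Python, so exclude it explicitly.
    return all(
        isinstance(value, int) and not isinstance(value, bool)
        for value in payload.values()
    )

# Body matching the documented default shape.
default_body = '{"totalNodes": 4022, "totalEdges": 18441}'
# Hypothetical body carrying a symbol name; the key name is illustrative only.
receipt_like_body = '{"totalNodes": 4022, "topSymbols": ["parse_file"]}'

print(is_aggregate_only(default_body))
print(is_aggregate_only(receipt_like_body))
```

A check like this only confirms the shape of a body you have already captured; it says nothing about how the payload is assembled internally.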

3. Embedding Model Download (One-Time)

| Source | HuggingFace Hub (via the fastembed library) |
| --- | --- |
| Data sent | Nothing; this is an inbound download only |
| What's downloaded | ONNX model weights (e.g., BGE-small-en-v1.5) |
| Frequency | Once on first run; cached at ~/.cache/fastembed/ |
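To see whether the one-time download has already happened on a machine, you can inspect the cache directory named above. This is a small Python sketch under stated assumptions: `cached_model_files` and `default_cache_dir` are hypothetical helper names, and the layout of files inside the cache is not documented here, so the function simply lists whatever it finds.

```python
from pathlib import Path

def default_cache_dir() -> Path:
    """Cache location taken from the table above."""
    return Path.home() / ".cache" / "fastembed"

def cached_model_files(cache_dir: Path) -> list[Path]:
    """List files under the cache directory, or an empty list if it is absent."""
    if not cache_dir.is_dir():
        return []
    return sorted(p for p in cache_dir.rglob("*") if p.is_file())

if __name__ == "__main__":
    files = cached_model_files(default_cache_dir())
    print(f"{len(files)} cached file(s) found")
```

An empty result suggests the first-run download has not yet occurred on that machine.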

4. Product Telemetry (since v0.3.17)

| Endpoint | POST https://memtrace.io/api/telemetry/ingest |
| --- | --- |
| Data sent | App-start events, indexing/embedding durations, aggregate PR review/watch counters, panic reports, and WARN/ERROR log lines from Memtrace's own crates, all sanitised to strip home-dir paths, token-shaped strings, and email addresses. Also content-free Rail routing-quality buckets (mode, pattern shape, hit/miss, a bucketed score, and a local relevance yes/no), never the search text or which files matched. The Rail buckets are measured asynchronously by the background daemon, so they never add latency to a search |
| Purpose | Catch crashes and regressions across the user base (for example, the M3-Air "stuck on Loading embedding model" hang and Windows MSVC build failures); and, for Rail, measure whether graph-backed search results are relevant, so that the decision to make Rail active by default is backed by real evidence |
| Frequency | Batched flush every 60 seconds while running |
| Opt-out | MEMTRACE_TELEMETRY=off disables all of it (0, false, disabled, and no are also accepted); MEMTRACE_RAIL_SHADOW=off disables just the Rail buckets; MEMTRACE_RAIL_SHADOW_SAMPLE=0..1 bounds the background measurement rate |
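The opt-out rules above can be read as a small decision procedure. The Python sketch below models it for illustration; it is not Memtrace's implementation. The function names are hypothetical, and three behaviours are assumptions not stated in this document: case-insensitive matching, a default sample rate of 1.0 when the variable is unset or unparsable, and clamping out-of-range values into 0..1.

```python
from typing import Mapping

# Values the opt-out row lists as disabling telemetry.
_OFF_VALUES = {"off", "0", "false", "disabled", "no"}

def telemetry_enabled(env: Mapping[str, str]) -> bool:
    """Telemetry is on unless MEMTRACE_TELEMETRY holds one of the off values."""
    # Assumption: comparison ignores case and surrounding whitespace.
    value = env.get("MEMTRACE_TELEMETRY", "").strip().lower()
    return value not in _OFF_VALUES

def rail_shadow_enabled(env: Mapping[str, str]) -> bool:
    """Rail buckets need telemetry on and MEMTRACE_RAIL_SHADOW not set to off."""
    if not telemetry_enabled(env):
        return False
    return env.get("MEMTRACE_RAIL_SHADOW", "").strip().lower() != "off"

def rail_sample_rate(env: Mapping[str, str], default: float = 1.0) -> float:
    """Parse MEMTRACE_RAIL_SHADOW_SAMPLE and clamp it into the 0..1 range."""
    raw = env.get("MEMTRACE_RAIL_SHADOW_SAMPLE")
    if raw is None:
        return default  # Assumed default; not documented above.
    try:
        rate = float(raw)
    except ValueError:
        return default
    return min(1.0, max(0.0, rate))
```

For example, `telemetry_enabled({"MEMTRACE_TELEMETRY": "off"})` is `False`, and with that setting `rail_shadow_enabled` is also `False`, because the global switch covers the Rail buckets.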

The telemetry payload never contains source code, file contents, symbol names, embeddings, repository paths, the text of your search commands, which files or symbols a search matched, GitHub PR URLs, PR discussion text, reviewer identities, branch names, or commit data. The schema on the receiving end has no column to hold any of those — we'd have to ship a new release to even start collecting them, and we'd announce it here first. Full breakdown: TELEMETRY.md.
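To make the sanitisation step concrete, here is a rough Python sketch of the kind of scrubbing described for log lines: home-directory paths, email addresses, and token-shaped strings are replaced with placeholders. The patterns, placeholder names, and the `sanitise` function are all assumptions made for illustration; the actual rules Memtrace applies are not specified in this document, and TELEMETRY.md remains the reference.

```python
import re

# Illustrative patterns only; the real sanitiser's rules are not documented here.
_HOME_DIR = re.compile(r"(/home/|/Users/|[A-Za-z]:\\Users\\)[^/\\\s]+")
_EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Assumption: a "token-shaped" string is a run of 32+ key-like characters.
_TOKEN = re.compile(r"\b[A-Za-z0-9_\-]{32,}\b")

def sanitise(line: str) -> str:
    """Replace home-dir prefixes, email addresses, and long tokens in a log line."""
    line = _HOME_DIR.sub("<home>", line)
    line = _EMAIL.sub("<email>", line)
    line = _TOKEN.sub("<token>", line)
    return line

print(sanitise("WARN failed to open /Users/alice/project/db"))
print(sanitise("ERROR notify bob@example.com failed"))
```

Regex scrubbing of this kind is best-effort by nature, which is one reason the receiving schema's lack of columns for code, paths, or symbols matters as a second line of defence.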

What We Don't Do

  • ❌ We do not send source code to any server
  • ❌ We do not use cloud-based embedding APIs (OpenAI, Cohere, etc.)
  • ❌ We do not transmit symbol names, file paths, or any structural data outside the sanitised crash/error/event payloads documented above
  • ❌ We do not store or share IP addresses (standard request logs are kept 7 days for abuse mitigation only)
  • ❌ We do not sell, share, or publish anonymised aggregates of telemetry data without notice

Questions?

If you have questions about data handling or need a security review for your organization, please open an issue or contact us at support@syncable.dev.