Internals

May 9, 2026

Technical details on how FileScopeMCP works under the hood.

Dependency Detection

Import patterns detected per language:

| Language | Patterns |
|---|---|
| Python | import, from ... import |
| JavaScript / TypeScript | import, require(), dynamic import() |
| C / C++ | #include |
| Rust | use, mod |
| Go | import with go.mod module resolution |
| Ruby | require, require_relative with .rb probing |
| Lua | require |
| Zig | @import |
| PHP | require, require_once, include, include_once, use |
| C# | using |
| Java | import |
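As an illustration of the table above, here is a minimal sketch of pattern-based import detection for three of the listed languages. It assumes plain regular-expression matching; the source does not say whether FileScopeMCP uses regexes or AST queries for this step, and the names `IMPORT_PATTERNS` and `detectImports` are hypothetical.

```typescript
// Hypothetical sketch: regex-based import detection for a few languages.
// The pattern table and function name are illustrative, not FileScopeMCP's API.
type Language = "python" | "javascript" | "c";

const IMPORT_PATTERNS: Record<Language, RegExp[]> = {
  python: [
    /^\s*import\s+([\w.]+)/gm,          // import pkg.mod (first module only)
    /^\s*from\s+([\w.]+)\s+import\b/gm, // from pkg.mod import name
  ],
  javascript: [
    /\bimport\s+(?:[^'"]*?\s+from\s+)?['"]([^'"]+)['"]/g, // import x from "m" / import "m"
    /\brequire\(\s*['"]([^'"]+)['"]\s*\)/g,               // require("m")
    /\bimport\(\s*['"]([^'"]+)['"]\s*\)/g,                // dynamic import("m")
  ],
  c: [/^\s*#\s*include\s*[<"]([^>"]+)[>"]/gm],            // #include <x> or "x"
};

// Returns the distinct import targets found in `source`, in match order per pattern.
function detectImports(source: string, language: Language): string[] {
  const found: string[] = [];
  IMPORT_PATTERNS[language].forEach((pattern) => {
    pattern.lastIndex = 0; // global regexes keep state between calls
    let match: RegExpExecArray | null;
    while ((match = pattern.exec(source)) !== null) {
      if (found.indexOf(match[1]) === -1) found.push(match[1]);
    }
  });
  return found;
}
```

A regex pass like this will also match import-like text inside comments and strings, which is one reason a real implementation might prefer a parser.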

Importance Calculation

Each file receives an importance score from 0 to 10, computed from a weighted formula. The algorithm is versioned via importance_algorithm_version in kv_state. When the formula changes (new file types, weight tweaks, etc.), the coordinator invalidates every file's cached score on the next init, so reads never return values computed by an older version.

| Factor | Max contribution |
|---|---|
| Incoming dependents (files that import this file) | +3 |
| Outgoing dependencies (files this file imports) | +2 |
| Package dependencies imported | +1 |
| Location (src/, app/, lib/, core/ weighted higher) | varies |
| Significant names (index, main, server, app, config, types, Makefile, CMakeLists, etc., case-insensitive) | varies |

File-type base scores (additive, current as of the latest algorithm version):

| Extension(s) | Base score |
|---|---|
| .ts, .tsx | +3 |
| .js, .jsx, .mjs, .cjs | +2 |
| .py, .go, .rb, .php | +2 |
| .c, .cpp, .cc, .cxx, .h, .hpp, .hh, .hxx | +2 |
| .sh, .sql | +1 |
| package.json, tsconfig.json, etc. (config JSON) | +3; other .json +1 |
| README.md (any case) | +2; other .md +1 |
| go.mod | +3 |
| Gemfile | +3 |
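The factor and base-score tables can be combined into an additive, capped score. The sketch below is only one plausible reading: it assumes each dependent or dependency contributes one point up to the listed maximum, assumes a +1 bonus for the "varies" location and name factors, and covers only a subset of the file types. The source does not give the actual weights or function names, so `baseScore`, `importance`, and `FileFacts` are hypothetical.

```typescript
// Hypothetical sketch of an additive importance score clamped to 0..10.
interface FileFacts {
  path: string;
  dependents: number;          // files that import this file
  dependencies: number;        // files this file imports
  packageDependencies: number; // package imports
}

// Subset of the base-score table (JSON and Markdown rules omitted for brevity).
const BASE_SCORES: Record<string, number> = {
  ".ts": 3, ".tsx": 3,
  ".js": 2, ".jsx": 2, ".mjs": 2, ".cjs": 2,
  ".py": 2, ".go": 2, ".rb": 2, ".php": 2,
  ".sh": 1, ".sql": 1,
};

function baseScore(path: string): number {
  const name = path.slice(path.lastIndexOf("/") + 1);
  if (name === "go.mod" || name === "Gemfile") return 3;
  const dot = name.lastIndexOf(".");
  const ext = dot >= 0 ? name.slice(dot) : "";
  return BASE_SCORES[ext] ?? 0;
}

function importance(f: FileFacts): number {
  let score = baseScore(f.path);
  score += Math.min(f.dependents, 3);          // capped at +3
  score += Math.min(f.dependencies, 2);        // capped at +2
  score += Math.min(f.packageDependencies, 1); // capped at +1
  // Assumed +1 bonuses; the real values are listed only as "varies".
  if (/(^|\/)(src|app|lib|core)\//.test(f.path)) score += 1;
  if (/(^|\/)(index|main|server|app|config|types)\.[^/]*$/i.test(f.path)) score += 1;
  return Math.max(0, Math.min(10, score));
}
```

Under these assumptions, a heavily imported `src/index.ts` saturates at 10, while an isolated shell script keeps only its base score of 1.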

Autonomous Update Pipeline

When a file event fires:

  1. Debounce — events coalesced per filePath:eventType key (default 2s)
  2. Mutex — all mutations serialized through AsyncMutex
  3. Semantic change detection — tree-sitter AST diff (TS/JS) or LLM-powered diff (all other languages) classifies the change
  4. Incremental update — re-parses the changed file, diffs dependency lists, patches reverse-dependency map, recalculates importance
  5. Cascade engine — BFS propagates staleness to transitive dependents if exports/types changed; body-only changes affect only the changed file
  6. LLM broker — picks up stale files and regenerates summaries, concepts, and change impact in priority order
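Step 5 of the pipeline can be sketched as a breadth-first walk over the reverse-dependency map. This is a minimal illustration, assuming the map is available in memory as file → importing files; `cascadeStale` and `ChangeKind` are hypothetical names, and the real cascade engine may track more state per file.

```typescript
// Hypothetical sketch of BFS staleness propagation.
type ChangeKind = "exports-changed" | "body-only";

function cascadeStale(
  changedFile: string,
  change: ChangeKind,
  dependentsOf: Map<string, string[]>, // reverse-dependency map: file -> files importing it
): string[] {
  const stale = [changedFile];
  // Body-only changes affect only the changed file.
  if (change === "body-only") return stale;

  const seen = new Set<string>(stale);
  const queue = [changedFile];
  while (queue.length > 0) {
    const current = queue.shift() as string;
    for (const dependent of dependentsOf.get(current) ?? []) {
      if (!seen.has(dependent)) {
        seen.add(dependent); // the seen set also guards against import cycles
        stale.push(dependent);
        queue.push(dependent);
      }
    }
  }
  return stale; // changed file first, then dependents in BFS order
}
```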

Freshness Validation

Two complementary strategies:

  • Startup sweep — runs once at initialization. Compares every tracked file against the filesystem to detect adds, deletes, and modifications that occurred while the server was offline.
  • Per-file mtime check — when you call get_file_summary, the system compares current mtime against the last recorded value. If changed, the file is immediately flagged stale and queued for re-analysis.
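Both strategies reduce to comparing recorded modification times against current ones. The sketch below assumes the tracked state and a filesystem snapshot are each available as a path → mtime map; `startupSweep`, `isStale`, and `SweepResult` are hypothetical names, and a real sweep would obtain the on-disk map by walking the project directory.

```typescript
// Hypothetical sketch of the two freshness checks using plain maps.
interface SweepResult {
  added: string[];
  deleted: string[];
  modified: string[];
}

// Startup sweep: diff the tracked files against a filesystem snapshot.
function startupSweep(
  tracked: Map<string, number>, // path -> last recorded mtime (ms)
  onDisk: Map<string, number>,  // path -> current mtime (ms)
): SweepResult {
  const result: SweepResult = { added: [], deleted: [], modified: [] };
  onDisk.forEach((mtimeMs, path) => {
    const recorded = tracked.get(path);
    if (recorded === undefined) result.added.push(path);
    else if (recorded !== mtimeMs) result.modified.push(path);
  });
  tracked.forEach((_mtimeMs, path) => {
    if (!onDisk.has(path)) result.deleted.push(path);
  });
  return result;
}

// Per-file check: any difference from the recorded mtime marks the file stale.
function isStale(recordedMtimeMs: number, currentMtimeMs: number): boolean {
  return currentMtimeMs !== recordedMtimeMs;
}
```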

Symbol Extraction

Tree-sitter AST parsing extracts top-level symbols (functions, classes, interfaces, types, enums, consts, modules, structs) from source files. Symbols are stored in the symbols table with name, kind, start/end line, export status, and owning file path.

Extraction runs per-language:

| Language | Kinds extracted | Export rule |
|---|---|---|
| TypeScript / JavaScript | function, class, interface, type, enum, const | export keyword |
| Python | function, class (top-level only, decorator-aware) | !name.startsWith('_') |
| Go | function, method, struct, interface, type, const | Uppercase first char |
| Ruby | function, class, module, const | Always exported (no keyword) |

Ruby attr_accessor / attr_reader / attr_writer are not indexed (synthesized at runtime, not in AST). Reopened Ruby classes produce multiple symbol rows with the same name.
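The export rules in the table can be expressed as a single predicate. This is a sketch under the assumption that the extractor already knows whether a TypeScript/JavaScript declaration carries the export keyword; `isExported` and `SymbolLanguage` are hypothetical names.

```typescript
// Hypothetical sketch of the per-language export rules.
type SymbolLanguage = "typescript" | "python" | "go" | "ruby";

function isExported(
  language: SymbolLanguage,
  name: string,
  hasExportKeyword = false, // only meaningful for TypeScript / JavaScript
): boolean {
  if (language === "typescript") return hasExportKeyword;
  if (language === "python") return !name.startsWith("_");
  if (language === "go") {
    // Exported when the first character is an uppercase letter.
    const first = name.charAt(0);
    return first !== "" && first === first.toUpperCase() && first !== first.toLowerCase();
  }
  return true; // Ruby: always exported
}
```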

Call-Site Edges (TS/JS)

For TypeScript and JavaScript files, a second AST pass over the already-parsed tree extracts call expressions and resolves them to symbol-level edges in the symbol_dependencies table:

  1. Local resolution — callee name matches a symbol defined in the same file (confidence 1.0)
  2. Imported resolution — callee name matches a symbol imported from another file, verified against the DB (confidence 0.8)
  3. Unresolvable — silently discarded (no edge created)

Barrel files (index.ts etc.) are excluded to prevent over-matching. Ambiguous names (same name imported from multiple files) are discarded. Self-calls (recursion) are filtered from query results.
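The resolution order above can be sketched as a small function. It assumes the caller has already collected the file's local symbol names and its imported names, and it uses a callback as a stand-in for the database check; `resolveCall`, `CallEdge`, and `symbolExists` are hypothetical names, and barrel-file exclusion is left out of the sketch.

```typescript
// Hypothetical sketch of call-site resolution with confidence levels.
interface CallEdge {
  callee: string;
  targetFile: string;
  confidence: number;
}

function resolveCall(
  calleeName: string,
  currentFile: string,
  localSymbols: Set<string>,
  importedFrom: Map<string, string[]>, // imported name -> files it was imported from
  symbolExists: (file: string, name: string) => boolean, // stand-in for the DB lookup
): CallEdge | null {
  // 1. Local resolution: the symbol is defined in the same file.
  if (localSymbols.has(calleeName)) {
    return { callee: calleeName, targetFile: currentFile, confidence: 1.0 };
  }
  // 2. Imported resolution: exactly one source file, verified against the DB.
  const sources = importedFrom.get(calleeName) ?? [];
  if (sources.length !== 1) return null; // unknown name, or ambiguous import
  const source = sources[0];
  if (!symbolExists(source, calleeName)) return null;
  return { callee: calleeName, targetFile: source, confidence: 0.8 };
  // 3. Everything else is unresolvable: no edge is created.
}
```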

Call-site edges for Python, Go, and Ruby are not yet implemented.

Community Detection

Louvain clustering on the local import graph groups tightly-coupled files into communities. Each community is represented by its highest-importance member. Communities are lazily recomputed only when the dependency graph changes (dirty flag tracked in DB).
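The lazy recomputation and representative selection can be sketched independently of the clustering itself. In the sketch below the Louvain step is injected as an opaque function and the dirty flag lives in memory rather than in the database; `CommunityCache` and `pickRepresentatives` are hypothetical names.

```typescript
// Hypothetical sketch: lazy community recompute guarded by a dirty flag.
function pickRepresentatives(
  assignments: Map<string, number>, // file -> community id
  importance: Map<string, number>,  // file -> importance score
): Map<number, string> {
  const best = new Map<number, string>();
  assignments.forEach((community, file) => {
    const current = best.get(community);
    if (current === undefined || (importance.get(file) ?? 0) > (importance.get(current) ?? 0)) {
      best.set(community, file);
    }
  });
  return best;
}

class CommunityCache {
  private dirty = true;
  private cached = new Map<number, string>();
  private readonly cluster: () => Map<string, number>; // stand-in for Louvain clustering
  private readonly importance: Map<string, number>;

  constructor(cluster: () => Map<string, number>, importance: Map<string, number>) {
    this.cluster = cluster;
    this.importance = importance;
  }

  // Call whenever the dependency graph changes.
  markDirty(): void {
    this.dirty = true;
  }

  // Re-clusters only if the graph changed since the last call.
  get(): Map<number, string> {
    if (this.dirty) {
      this.cached = pickRepresentatives(this.cluster(), this.importance);
      this.dirty = false;
    }
    return this.cached;
  }
}
```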

Cycle Detection

  1. Loads all local import edges from SQLite in a single batch query
  2. Runs iterative Tarjan's SCC algorithm on the directed dependency graph
  3. Filters out trivial SCCs (single files with no self-loop)
  4. Returns cycle groups listing all participating files
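Steps 2 through 4 can be sketched as an iterative Tarjan SCC pass over an edge list. The sketch assumes the edges have already been loaded into memory as [from, to] pairs; `findCycles` is a hypothetical name, and the explicit work stack is one common way to avoid recursion, not necessarily the one FileScopeMCP uses.

```typescript
// Hypothetical sketch: iterative Tarjan SCC, keeping only non-trivial components.
function findCycles(edges: Array<[string, string]>): string[][] {
  const adjacency = new Map<string, string[]>();
  const selfLoops = new Set<string>();
  for (const [from, to] of edges) {
    if (!adjacency.has(from)) adjacency.set(from, []);
    if (!adjacency.has(to)) adjacency.set(to, []);
    (adjacency.get(from) as string[]).push(to);
    if (from === to) selfLoops.add(from);
  }

  const index = new Map<string, number>();
  const lowlink = new Map<string, number>();
  const onStack = new Set<string>();
  const stack: string[] = [];
  const cycles: string[][] = [];
  let counter = 0;

  for (const root of Array.from(adjacency.keys())) {
    if (index.has(root)) continue;
    // Frames of (node, next child position) replace the recursive call stack.
    const work: Array<{ node: string; next: number }> = [{ node: root, next: 0 }];
    while (work.length > 0) {
      const frame = work[work.length - 1];
      const node = frame.node;
      if (!index.has(node)) {
        // First visit: assign discovery index and push onto the SCC stack.
        index.set(node, counter);
        lowlink.set(node, counter);
        counter++;
        stack.push(node);
        onStack.add(node);
      }
      const children = adjacency.get(node) as string[];
      if (frame.next < children.length) {
        const child = children[frame.next++];
        if (!index.has(child)) {
          work.push({ node: child, next: 0 });
        } else if (onStack.has(child)) {
          lowlink.set(node, Math.min(lowlink.get(node) as number, index.get(child) as number));
        }
        continue;
      }
      // All children processed: propagate lowlink to the parent frame.
      work.pop();
      if (work.length > 0) {
        const parent = work[work.length - 1].node;
        lowlink.set(parent, Math.min(lowlink.get(parent) as number, lowlink.get(node) as number));
      }
      if (lowlink.get(node) === index.get(node)) {
        const component: string[] = [];
        let member: string;
        do {
          member = stack.pop() as string;
          onStack.delete(member);
          component.push(member);
        } while (member !== node);
        // Drop trivial SCCs: single files with no self-loop.
        if (component.length > 1 || selfLoops.has(node)) cycles.push(component.sort());
      }
    }
  }
  return cycles;
}
```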

Storage

All data is stored in .filescope/data.db (SQLite, WAL mode):

| Table | Purpose |
|---|---|
| files | Metadata, staleness flags, summary, concepts, change_impact |
| file_dependencies | Directed import edges with edge type, confidence, and weight |
| symbols | Extracted symbols (name, kind, startLine, endLine, isExport) per file |
| symbol_dependencies | Call-site edges between symbols (caller → callee with confidence) |
| file_communities | Louvain community assignments |
| kv_state | Key-value store for bulk migration gates and feature flags |
| schema_version | Migration versioning |

Auto-migration: on first run, any legacy JSON tree files are imported into SQLite automatically. Schema migrations run automatically on startup.

LLM Broker Architecture

The broker is a standalone Node.js process that owns all LLM communication (llama.cpp's llama-server or any OpenAI-compatible HTTP API):

  • IPC — Unix domain socket at ~/.filescope/broker.sock, NDJSON protocol
  • Queue — in-memory priority queue (importance DESC, created_at ASC)
  • Tiers — interactive (tier 1) > cascade (tier 2) > background (tier 3)
  • Dedup — one pending job per file+type per repo, latest content wins
  • Timeout — jobTimeoutMs, schema default 120 s; the shipped broker.default.json template overrides it to 300 s, so fresh installs run with a 5 min timeout until the user edits the field
  • Auto-spawn — first MCP instance spawns the broker if broker.sock is missing
  • Stats — per-repo token totals persisted to ~/.filescope/stats.json
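The queue, tier, and dedup rules can be sketched together. The sketch assumes tiers are compared first, then importance (descending), then creation time (ascending), and that a newer job for the same repo, file, and type replaces the pending one entirely; the source does not spell out either detail. `Job`, `compareJobs`, and `JobQueue` are hypothetical names.

```typescript
// Hypothetical sketch of the broker's in-memory priority queue with dedup.
interface Job {
  repo: string;
  file: string;
  type: string;
  tier: 1 | 2 | 3;    // 1 = interactive, 2 = cascade, 3 = background
  importance: number;
  createdAt: number;
  content: string;
}

// Negative result means `a` should run before `b`.
function compareJobs(a: Job, b: Job): number {
  if (a.tier !== b.tier) return a.tier - b.tier;                         // lower tier number first
  if (a.importance !== b.importance) return b.importance - a.importance; // importance DESC
  return a.createdAt - b.createdAt;                                      // created_at ASC
}

class JobQueue {
  private pending = new Map<string, Job>();

  // One pending job per repo + file + type; the latest submission wins.
  enqueue(job: Job): void {
    this.pending.set(JSON.stringify([job.repo, job.file, job.type]), job);
  }

  // Removes and returns the highest-priority job, or undefined when empty.
  next(): Job | undefined {
    let bestKey: string | undefined;
    let best: Job | undefined;
    for (const [key, job] of Array.from(this.pending.entries())) {
      if (best === undefined || compareJobs(job, best) < 0) {
        best = job;
        bestKey = key;
      }
    }
    if (bestKey !== undefined) this.pending.delete(bestKey);
    return best;
  }

  get size(): number {
    return this.pending.size;
  }
}
```

A linear scan keeps the sketch short; a binary heap would be the usual choice if the pending set grows large.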