Internals
May 9, 2026
Technical details on how FileScopeMCP works under the hood.
Dependency Detection
Import patterns detected per language:
| Language | Patterns |
|---|---|
| Python | import, from ... import |
| JavaScript / TypeScript | import, require(), dynamic import() |
| C / C++ | #include |
| Rust | use, mod |
| Go | import with go.mod module resolution |
| Ruby | require, require_relative with .rb probing |
| Lua | require |
| Zig | @import |
| PHP | require, require_once, include, include_once, use |
| C# | using |
| Java | import |
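As an illustration of the JavaScript / TypeScript row, the three patterns can be approximated with regular expressions. This is a minimal sketch, not FileScopeMCP's actual detector: `JS_IMPORT_PATTERNS` and `extractSpecifiers` are hypothetical names, and regex matching can be fooled by comments and string literals.

```typescript
// Hypothetical patterns for the JavaScript / TypeScript row of the table.
// Regex matching is a simplification: it can be fooled by comments and strings.
const JS_IMPORT_PATTERNS: RegExp[] = [
  /\bimport\s+(?:[^'"]*?\s+from\s+)?['"]([^'"]+)['"]/g, // import x from 'm', import 'm'
  /\brequire\(\s*['"]([^'"]+)['"]\s*\)/g,               // require('m')
  /\bimport\(\s*['"]([^'"]+)['"]\s*\)/g,                // dynamic import('m')
];

// Returns the unique module specifiers found in `source`,
// grouped by pattern and in first-seen order within each pattern.
function extractSpecifiers(source: string): string[] {
  const found: string[] = [];
  for (const pattern of JS_IMPORT_PATTERNS) {
    pattern.lastIndex = 0; // global regexes keep state between exec() calls
    let match: RegExpExecArray | null;
    while ((match = pattern.exec(source)) !== null) {
      if (found.indexOf(match[1]) === -1) found.push(match[1]);
    }
  }
  return found;
}
```

Each specifier would still need to be resolved to a file on disk or classified as a package dependency; that step is not shown here.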
Importance Calculation
Scores (0-10) from a weighted formula. The algorithm is versioned via
importance_algorithm_version in kv_state — when the formula changes
(new file types, weight tweaks, etc.), the coordinator auto-busts every
file's cached score on the next init so reads never return values
computed by an older version.
| Factor | Max contribution |
|---|---|
| Incoming dependents (files that import this file) | +3 |
| Outgoing dependencies (files this file imports) | +2 |
| Package dependencies imported | +1 |
| Location (src/, app/, lib/, core/ weighted higher) | varies |
| Significant names (index, main, server, app, config, types, Makefile, CMakeLists, etc., case-insensitive) | varies |
File-type base scores (additive, current as of the latest algorithm version):
| Extension(s) | Base score |
|---|---|
| .ts, .tsx | +3 |
| .js, .jsx, .mjs, .cjs | +2 |
| .py, .go, .rb, .php | +2 |
| .c, .cpp, .cc, .cxx, .h, .hpp, .hh, .hxx | +2 |
| .sh, .sql | +1 |
| package.json, tsconfig.json, etc. (config JSON) | +3; other .json +1 |
| README.md (any case) | +2; other .md +1 |
| go.mod | +3 |
| Gemfile | +3 |
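The two tables can be combined into a rough scoring sketch. It rests on several assumptions: each capped factor contributes `min(count, cap)`, the location and name bonuses (listed as "varies") are supplied by the caller, extensions are matched case-insensitively, and only the two config files named above get the config-JSON score. All function and field names are hypothetical.

```typescript
// Base scores copied from the file-type table; lowercased matching is an assumption.
const EXTENSION_BASE: Record<string, number> = {
  ".ts": 3, ".tsx": 3,
  ".js": 2, ".jsx": 2, ".mjs": 2, ".cjs": 2,
  ".py": 2, ".go": 2, ".rb": 2, ".php": 2,
  ".c": 2, ".cpp": 2, ".cc": 2, ".cxx": 2, ".h": 2, ".hpp": 2, ".hh": 2, ".hxx": 2,
  ".sh": 1, ".sql": 1, ".json": 1, ".md": 1,
};
// Only the config files named in the table; the "etc." entries are not guessed here.
const CONFIG_JSON = ["package.json", "tsconfig.json"];

function fileTypeBase(fileName: string): number {
  if (fileName === "go.mod" || fileName === "Gemfile") return 3;
  if (CONFIG_JSON.indexOf(fileName) !== -1) return 3;
  const lower = fileName.toLowerCase();
  if (lower === "readme.md") return 2;
  const dot = lower.lastIndexOf(".");
  const ext = dot === -1 ? "" : lower.slice(dot);
  return EXTENSION_BASE[ext] ?? 0;
}

interface ImportanceInputs {
  fileName: string;      // base name, e.g. "server.ts"
  dependents: number;    // files that import this file
  dependencies: number;  // files this file imports
  packageDeps: number;   // package dependencies imported
  locationBonus: number; // "varies" in the table, so supplied by the caller
  nameBonus: number;     // "varies" in the table, so supplied by the caller
}

// Sums the additive contributions and clamps the result to the 0-10 range.
function importanceScore(input: ImportanceInputs): number {
  const raw =
    fileTypeBase(input.fileName) +
    Math.min(input.dependents, 3) +
    Math.min(input.dependencies, 2) +
    Math.min(input.packageDeps, 1) +
    input.locationBonus +
    input.nameBonus;
  return Math.max(0, Math.min(10, raw));
}
```

Under these assumptions, a `server.ts` with five dependents, one dependency, and four package imports scores 3 + 3 + 1 + 1 = 8 before any location or name bonus.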
Autonomous Update Pipeline
When a file event fires:
- Debounce: events are coalesced per filePath:eventType key (default 2s)
- Mutex: all mutations are serialized through AsyncMutex
- Semantic change detection: tree-sitter AST diff (TS/JS) or LLM-powered diff (all other languages) classifies the change
- Incremental update: re-parses the changed file, diffs dependency lists, patches the reverse-dependency map, and recalculates importance
- Cascade engine: BFS propagates staleness to transitive dependents if exports/types changed; body-only changes affect only the changed file
- LLM broker: picks up stale files and regenerates summaries, concepts, and change impact in priority order
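The cascade step above can be sketched as a breadth-first walk over the reverse-dependency map. This is an illustration under assumptions rather than the project's code: `cascadeStale`, `ChangeKind`, and the in-memory `dependentsOf` map are hypothetical stand-ins for whatever the cascade engine reads from the database.

```typescript
// Hypothetical classification produced by the semantic change detection step.
type ChangeKind = "exports-changed" | "body-only";

// Returns every file that should be marked stale, starting with the changed file.
// `dependentsOf` maps a file to the files that import it (the reverse-dependency map).
function cascadeStale(
  changedFile: string,
  kind: ChangeKind,
  dependentsOf: Map<string, string[]>,
): string[] {
  // Body-only changes do not propagate beyond the changed file.
  if (kind === "body-only") return [changedFile];

  const stale = new Set<string>([changedFile]);
  const queue: string[] = [changedFile];
  // Breadth-first traversal; the visited set also guards against import cycles.
  for (let head = 0; head < queue.length; head++) {
    const current = queue[head];
    for (const dependent of dependentsOf.get(current) ?? []) {
      if (!stale.has(dependent)) {
        stale.add(dependent);
        queue.push(dependent);
      }
    }
  }
  return queue;
}
```

Because the queue is never shrunk, it doubles as the result list in BFS order, so direct dependents are listed before more distant ones.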
Freshness Validation
Two complementary strategies:
- Startup sweep: runs once at initialization. Compares every tracked file against the filesystem to detect adds, deletes, and modifications that occurred while the server was offline.
- Per-file mtime check: when you call get_file_summary, the system compares the current mtime against the last recorded value. If it has changed, the file is immediately flagged stale and queued for re-analysis.
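Both strategies come down to comparing recorded mtimes with current ones. The sketch below keeps the filesystem out of the picture by taking mtimes as plain numbers; in practice they could come from Node's `fs.statSync(path).mtimeMs`. The names `startupSweep`, `checkOnRead`, and `staleQueue` are hypothetical, and updating the recorded mtime on a mismatch is an assumption.

```typescript
interface SweepResult {
  added: string[];
  deleted: string[];
  modified: string[];
}

// Startup sweep: compare recorded mtimes (path -> mtimeMs) with what is on disk now.
function startupSweep(tracked: Map<string, number>, onDisk: Map<string, number>): SweepResult {
  const result: SweepResult = { added: [], deleted: [], modified: [] };
  onDisk.forEach((mtimeMs, path) => {
    const recorded = tracked.get(path);
    if (recorded === undefined) result.added.push(path);
    else if (recorded !== mtimeMs) result.modified.push(path);
  });
  tracked.forEach((_mtimeMs, path) => {
    if (!onDisk.has(path)) result.deleted.push(path);
  });
  return result;
}

// Per-file check: returns true when the recorded mtime still matches.
// On a mismatch the file is queued for re-analysis.
function checkOnRead(
  path: string,
  tracked: Map<string, number>,
  currentMtimeMs: number,
  staleQueue: string[],
): boolean {
  if (tracked.get(path) === currentMtimeMs) return true;
  // Assumption: record the new mtime so the same change is queued only once.
  tracked.set(path, currentMtimeMs);
  if (staleQueue.indexOf(path) === -1) staleQueue.push(path);
  return false;
}
```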
Symbol Extraction
Tree-sitter AST parsing extracts top-level symbols (functions, classes, interfaces, types, enums, consts, modules, structs) from source files. Symbols are stored in the symbols table with name, kind, start/end line, export status, and owning file path.
Extraction runs per-language:
| Language | Kinds extracted | Export rule |
|---|---|---|
| TypeScript / JavaScript | function, class, interface, type, enum, const | export keyword |
| Python | function, class (top-level only, decorator-aware) | !name.startsWith('_') |
| Go | function, method, struct, interface, type, const | Uppercase first char |
| Ruby | function, class, module, const | Always exported (no keyword) |
Ruby attr_accessor / attr_reader / attr_writer are not indexed (synthesized at runtime, not in AST). Reopened Ruby classes produce multiple symbol rows with the same name.
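The export rules in the table can be written as a single predicate. This is a sketch with hypothetical names (`SymbolLanguage`, `isExported`); the Go check approximates "uppercase first char" with JavaScript's case conversion, which may differ from Go's own Unicode rules for unusual identifiers.

```typescript
type SymbolLanguage = "typescript" | "javascript" | "python" | "go" | "ruby";

// Decides the export status of a top-level symbol according to the table above.
// `hasExportKeyword` only matters for TypeScript / JavaScript.
function isExported(lang: SymbolLanguage, name: string, hasExportKeyword = false): boolean {
  switch (lang) {
    case "typescript":
    case "javascript":
      return hasExportKeyword;
    case "python":
      return !name.startsWith("_");
    case "go": {
      // Uppercase first character; characters without case (such as "_") do not count.
      const first = name.charAt(0);
      return first === first.toUpperCase() && first !== first.toLowerCase();
    }
    case "ruby":
      return true; // always exported, there is no export keyword
  }
}
```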
Call-Site Edges (TS/JS)
For TypeScript and JavaScript files, a second AST pass over the already-parsed tree extracts call expressions and resolves them to symbol-level edges in the symbol_dependencies table:
- Local resolution — callee name matches a symbol defined in the same file (confidence 1.0)
- Imported resolution — callee name matches a symbol imported from another file, verified against the DB (confidence 0.8)
- Unresolvable — silently discarded (no edge created)
Barrel files (index.ts etc.) are excluded to prevent over-matching. Ambiguous names (same name imported from multiple files) are discarded. Self-calls (recursion) are filtered from query results.
Call-site edges for Python, Go, and Ruby are not yet implemented.
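The resolution rules for TS/JS can be sketched as one function. All names here are hypothetical: `imports` maps an imported name to the files it is imported from, and `symbolExists` stands in for the database verification. Barrel-file exclusion is left out of this sketch.

```typescript
interface CallEdge {
  callee: string;
  targetFile: string;
  confidence: number;
}

// Resolves one call expression to a symbol-level edge, or null when no edge is created.
function resolveCall(
  calleeName: string,
  currentFile: string,
  localSymbols: Set<string>,
  imports: Map<string, string[]>,
  symbolExists: (file: string, name: string) => boolean,
): CallEdge | null {
  // Local resolution: the callee is defined in the same file.
  if (localSymbols.has(calleeName)) {
    return { callee: calleeName, targetFile: currentFile, confidence: 1.0 };
  }
  // Imported resolution: exactly one source file, verified against the symbol store.
  const sources = imports.get(calleeName) ?? [];
  if (sources.length !== 1) return null; // not imported, or ambiguous
  const source = sources[0];
  if (!symbolExists(source, calleeName)) return null;
  return { callee: calleeName, targetFile: source, confidence: 0.8 };
}
```

Returning null for both unknown and ambiguous callees mirrors the "silently discarded" behavior: the caller simply skips the edge.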
Community Detection
Louvain clustering on the local import graph groups tightly-coupled files into communities. Each community is represented by its highest-importance member. Communities are lazily recomputed only when the dependency graph changes (dirty flag tracked in DB).
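The lazy recomputation and the choice of representative can be sketched without implementing Louvain itself. In this illustration the clustering function is injected, the dirty flag lives in memory rather than in the database, and `CommunityCache` and `representative` are hypothetical names.

```typescript
type Communities = Map<number, string[]>; // community id -> member file paths

// Recomputes communities only when the dependency graph has been marked dirty.
class CommunityCache {
  private dirty = true;
  private cached: Communities = new Map();
  private readonly cluster: () => Communities;

  constructor(cluster: () => Communities) {
    this.cluster = cluster; // stand-in for a Louvain implementation
  }

  markDirty(): void {
    this.dirty = true;
  }

  get(): Communities {
    if (this.dirty) {
      this.cached = this.cluster();
      this.dirty = false;
    }
    return this.cached;
  }
}

// Picks the highest-importance member; `members` is assumed to be non-empty.
// Ties keep the earlier member, and files without a score count as 0.
function representative(members: string[], importance: Map<string, number>): string {
  return members.reduce((best, file) =>
    (importance.get(file) ?? 0) > (importance.get(best) ?? 0) ? file : best,
  );
}
```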
Cycle Detection
- Loads all local import edges from SQLite in a single batch query
- Runs iterative Tarjan's SCC algorithm on the directed dependency graph
- Filters out trivial SCCs (single files with no self-loop)
- Returns cycle groups listing all participating files
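The steps above can be sketched with an iterative Tarjan SCC that uses an explicit work stack instead of recursion. This is a minimal illustration that takes edges as an in-memory array rather than a SQLite query result; `findCycles` is a hypothetical name.

```typescript
// Returns the strongly connected components that represent import cycles.
// Each edge is [importer, imported].
function findCycles(edges: Array<[string, string]>): string[][] {
  const adj = new Map<string, string[]>();
  for (const [from, to] of edges) {
    if (!adj.has(from)) adj.set(from, []);
    if (!adj.has(to)) adj.set(to, []);
    adj.get(from)!.push(to);
  }

  const index = new Map<string, number>();
  const low = new Map<string, number>();
  const onStack = new Set<string>();
  const stack: string[] = [];
  const cycles: string[][] = [];
  let counter = 0;

  const visit = (node: string): void => {
    index.set(node, counter);
    low.set(node, counter);
    counter++;
    stack.push(node);
    onStack.add(node);
  };

  for (const root of Array.from(adj.keys())) {
    if (index.has(root)) continue;
    // Each frame holds a node and the position of the next neighbor to explore.
    const work: Array<{ node: string; next: number }> = [{ node: root, next: 0 }];
    visit(root);

    while (work.length > 0) {
      const frame = work[work.length - 1];
      const neighbors = adj.get(frame.node)!;
      if (frame.next < neighbors.length) {
        const w = neighbors[frame.next++];
        if (!index.has(w)) {
          visit(w);
          work.push({ node: w, next: 0 });
        } else if (onStack.has(w)) {
          low.set(frame.node, Math.min(low.get(frame.node)!, index.get(w)!));
        }
        continue;
      }
      // All neighbors explored: pop the frame and propagate the low-link upward.
      work.pop();
      const v = frame.node;
      if (work.length > 0) {
        const parent = work[work.length - 1].node;
        low.set(parent, Math.min(low.get(parent)!, low.get(v)!));
      }
      if (low.get(v) === index.get(v)) {
        const scc: string[] = [];
        let w = "";
        do {
          w = stack.pop()!;
          onStack.delete(w);
          scc.push(w);
        } while (w !== v);
        // Drop trivial SCCs: a single file that does not import itself.
        const selfLoop = adj.get(v)!.indexOf(v) !== -1;
        if (scc.length > 1 || selfLoop) cycles.push(scc);
      }
    }
  }
  return cycles;
}
```

The explicit work stack avoids call-stack overflow on long import chains, which is the usual reason to prefer the iterative form.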
Storage
All data in .filescope/data.db (SQLite, WAL mode):
| Table | Purpose |
|---|---|
| files | Metadata, staleness flags, summary, concepts, change_impact |
| file_dependencies | Directed import edges with edge type, confidence, and weight |
| symbols | Extracted symbols (name, kind, startLine, endLine, isExport) per file |
| symbol_dependencies | Call-site edges between symbols (caller → callee with confidence) |
| file_communities | Louvain community assignments |
| kv_state | Key-value store for bulk migration gates and feature flags |
| schema_version | Migration versioning |
Auto-migration: on first run, any legacy JSON tree files are imported into SQLite automatically. Schema migrations run automatically on startup.
LLM Broker Architecture
The broker is a standalone Node.js process that owns all LLM communication (llama.cpp's llama-server or any OpenAI-compatible HTTP API):
- IPC: Unix domain socket at ~/.filescope/broker.sock, NDJSON protocol
- Queue: in-memory priority queue (importance DESC, created_at ASC)
- Tiers: interactive (tier 1) > cascade (tier 2) > background (tier 3)
- Dedup: one pending job per file+type per repo, latest content wins
- Timeout: jobTimeoutMs, schema default 120 s; the shipped broker.default.json template overrides this to 300 s, so fresh installs run at 5 min until the user edits the field
- Auto-spawn: the first MCP instance spawns the broker if broker.sock is missing
- Stats: per-repo token totals persisted to ~/.filescope/stats.json
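The queue ordering and dedup rules can be sketched as follows. Several details are assumptions: tiers are compared before importance, a duplicate replaces the whole pending job, and a linear scan stands in for a real priority queue. `Job`, `compareJobs`, and `BrokerQueue` are hypothetical names.

```typescript
interface Job {
  repo: string;
  filePath: string;
  jobType: string;
  tier: 1 | 2 | 3;   // 1 = interactive, 2 = cascade, 3 = background
  importance: number;
  createdAt: number; // milliseconds since epoch
  content: string;
}

// Negative when `a` should run before `b`.
// Assumption: tier first, then importance DESC, then createdAt ASC.
function compareJobs(a: Job, b: Job): number {
  if (a.tier !== b.tier) return a.tier - b.tier;
  if (a.importance !== b.importance) return b.importance - a.importance;
  return a.createdAt - b.createdAt;
}

class BrokerQueue {
  private pending = new Map<string, Job>();

  // One pending job per repo + file + type; the latest submission wins.
  enqueue(job: Job): void {
    const key = JSON.stringify([job.repo, job.filePath, job.jobType]);
    this.pending.set(key, job);
  }

  // Removes and returns the highest-priority job, or undefined when empty.
  dequeue(): Job | undefined {
    let bestKey: string | undefined;
    let best: Job | undefined;
    for (const [key, job] of Array.from(this.pending.entries())) {
      if (best === undefined || compareJobs(job, best) < 0) {
        best = job;
        bestKey = key;
      }
    }
    if (bestKey !== undefined) this.pending.delete(bestKey);
    return best;
  }

  get size(): number {
    return this.pending.size;
  }
}
```

Keying the map on repo, file, and job type gives dedup for free: a second `enqueue` for the same key overwrites the first, so a worker never processes outdated content.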