FileScopeMCP

May 10, 2026 · View on GitHub

You are in the FileScopeMCP repository — an MCP server that indexes codebases for autonomous AI agents (Hermes, OpenClaw, Claude Code, Codex, and any other MCP-capable runtime).

What This Is

FileScopeMCP watches source code, ranks every file by importance (0-10), maps all import dependencies bidirectionally, extracts symbols (functions, classes, types) via tree-sitter, and maintains LLM-generated summaries. It exposes this intelligence as MCP tools so an agent can understand a codebase without reading every file.

Supports: TypeScript, JavaScript, Python, C, C++, Rust, Go, Ruby, Lua, Zig, PHP, C#, Java.

If You Want to USE FileScopeMCP in Another Project

You don't need to be in this repo. Register it with your agent runtime, then point it at a project with set_base_directory(path) (or just launch the server from the project directory — it auto-initializes to its launch cwd).

Stdio transport (all runtimes)

command: node
args: ["/path/to/FileScopeMCP/dist/mcp-server.js"]

Hermes (~/.hermes/config.yaml)

mcp_servers:
  filescope:
    command: "node"
    args: ["/path/to/FileScopeMCP/dist/mcp-server.js"]
    timeout: 120
    connect_timeout: 60

After editing, restart Hermes so it re-reads mcp_servers. Verify the server appears in Hermes's tool list and that status() returns initialized: true.

Claude Code

Already registered if ./build.sh was run. Verify with claude mcp list.

For an opinionated install that includes a project priming CLAUDE.md and pointers to optional hook templates: npm run install-claude-code. Layered — never modifies .claude/settings.json. See ROADMAP.md Phase 1 for design notes and docs/claude-code-hooks.md for the hook snippets.

Cursor / other MCP clients

See docs/mcp-clients.md for per-client config snippets.

Connecting a Local LLM (Broker Setup)

FileScopeMCP runs a broker that queues LLM work (file summarization, change analysis). The broker talks to any OpenAI-compatible HTTP endpoint. Most agent runtimes already have a local LLM running — point the broker at the same endpoint and you avoid running two LLM servers.

The broker reads ~/.filescope/broker.json (created automatically on first run from broker.default.json). Schema:

{
  "llm": {
    "provider": "openai-compatible",
    "model": "llm-model",
    "baseURL": "http://localhost:8880/v1",
    "maxTokensPerCall": 1024
  },
  "jobTimeoutMs": 300000,
  "maxQueueSize": 1000
}

Set baseURL to wherever the LLM is listening, and model to whatever alias the server expects (check /v1/models if you're not sure). The broker handles queuing and prioritization — interactive tool calls beat background summarization, and it will not flood the LLM with hundreds of concurrent requests.

If ~/.filescope/broker.json does not exist yet, copy a template:

mkdir -p ~/.filescope
cp broker.default.json ~/.filescope/broker.json

Then edit baseURL and model to match the runtime's LLM. Verify with status() — it reports broker connection state.

Default: Linux-native (Hermes on Ubuntu)

This is the deployment FileScopeMCP is primarily designed for: agent runtime and llama-server on the same Ubuntu host, broker talking to localhost.

Typical Hermes setup already has llama-server bound to 127.0.0.1:8880 with model alias llm-model (Hermes uses it as fallback_model in config.yaml). The default broker template (broker.default.json) is configured for exactly this — copy it as-is:

cp broker.default.json ~/.filescope/broker.json

No edits needed if the runtime's llama-server is on localhost:8880 with alias llm-model. To stand up a fresh llama-server, run ./setup-llm.sh for a platform-aware install guide.

Alternative: Remote LLM on the LAN

If the LLM runs on a separate machine reachable over the LAN (dedicated GPU box, etc.), use broker.remote-lan.json and edit the IP:

cp broker.remote-lan.json ~/.filescope/broker.json
# then edit baseURL to "http://<lan-ip>:8880/v1"

Alternative: WSL2 with llama-server on the Windows host

Legacy setup for users running FileScopeMCP inside WSL2 with a GPU-equipped Windows host. WSL2 does not give native GPU access to llama.cpp, so llama-server runs on Windows and the broker bridges across the WSL2 boundary.

cp broker.windows-host.json ~/.filescope/broker.json

broker.windows-host.json uses the literal placeholder wsl-host in baseURL. At broker startup, src/broker/config.ts runs ip route show default | awk '{print \$3}' and rewrites the placeholder to the actual Windows host gateway IP in memory — no manual editing needed in 99% of cases. See docs/llm-setup.md for the full Windows-side setup (firewall rule, llama-server install, model selection).

This path is not the default. If running on a native Linux host (including Hermes on Ubuntu), use broker.default.json instead.

Skill File for Tool Usage

For tool-call patterns and workflows, install the FileScopeMCP skill from skills/filescope-mcp/SKILL.md. It documents every MCP tool with examples and lists common workflows ("what calls this function?", "where should I add this feature?", "what changed since last session?").

Hermes auto-discovers skills under ~/.hermes/skills/<category>/<name>/SKILL.md. Symlink or copy:

ln -s /home/user/dev/FileScopeMCP/skills/filescope-mcp ~/.hermes/skills/filescope-mcp/filescope-mcp

(Adjust the destination to match the host's Hermes skills layout.)

If You Are Working ON This Codebase

Architecture

src/
  mcp-server.ts        — MCP protocol handler, tool registration, stdio transport
  coordinator.ts       — central orchestrator: init, scanning, file events, change classification, cascade, broker lifecycle
  language-config.ts   — per-language extractor dispatch: AST (tree-sitter) for TS/JS, Python, C, C++, Rust; regex for Go, Ruby, Lua, Zig, PHP, C#, Java
  file-watcher.ts      — chokidar wrapper with debounced events and crash auto-restart
  change-detector/     — semantic change classification (AST diff for TS/JS via tree-sitter, LLM-powered diff for everything else)
  cascade/             — BFS staleness propagation across the dependency graph (depth-limited, payload-bounded)
  broker/              — LLM job queue (priority heap with dedup, Unix socket IPC, multi-provider HTTP via the `ai` SDK)
  nexus/               — Fastify web dashboard (localhost:1234), cross-repo aggregation, log SSE
  db/                  — Drizzle ORM over better-sqlite3 (WAL mode); schema, repository, and migrations

The broker/ module must NOT import from nexus/. (Nexus reads broker config, stats, and types to surface LLM activity in the dashboard — the dependency is one-directional.)

Build and Run

./build.sh          # npm install + tsc + register with Claude Code
./run.sh            # launch manually (build.sh generates this)

./build.sh flags relevant to agent installs:

  • --agent — skip Claude Code registration (agent runtimes register their own way via mcp_servers)
  • --skip-register — skip just the registration step
  • --prefix=/path — install into a non-cwd location
  • --check-only — verify prerequisites without installing
  • --no-wsl — suppress the "project is on /mnt/ — use WSL native FS for speed" warning
  • --legacy-npm — pass --legacy-peer-deps to npm install for older registries

LLM Backend

FileScopeMCP uses a local LLM via llama.cpp's llama-server (or any OpenAI-compatible endpoint) for background summarization. The broker connects over HTTP to /v1/chat/completions. Default: port 8880, model alias llm-model.

Setup: ./setup-llm.sh (prints platform-specific instructions including the WSL2+Windows alternative). Status: ./setup-llm.sh --status.

Path Storage (host-portable layout)

The SQLite DB stores paths relative to projectRoot, but the public API (MCP tools, repository.ts exports) deals in absolute paths. Translation happens at a single chokepoint inside repository.ts. This makes a .filescope/ directory survive being rsync'd or moved between hosts — expensive LLM-generated content (summary, concepts, change_impact) is preserved across path changes that would otherwise trigger a wipe-and-rescan.

Invariant: coordinator.init(projectRoot) calls setRepoProjectRoot(projectRoot) once. From that point, repository.ts relativizes inputs and absolutifies outputs at every SQL site. Outside repository.ts, all code dealing with paths uses absolute form.

Bypass escape hatch: two call sites issue raw SQL against path columns instead of going through repository.ts and import toStoredPath from db/repository.js to translate at the boundary: cascade/cascade-engine.ts markSelfStale and mcp-server.ts get_file_summary LLM-data lookup. Keep this surface tiny.

nexus/repo-store.ts does not need a translation helper because the DB stores paths in the same relative form the dashboard already queries with — its raw SQL is intentional and benign.

Format guard: coordinator.init() writes / verifies kv_state['paths_format'] = 'relative_v1' on first connection. If a future format bump renders an old DB incompatible, the guard throws with a clear recovery hint rather than corrupting silently. To recover from a flagged-mismatch error, delete .filescope/ and reinit.

Key Conventions

  • TypeScript strict mode, ES modules
  • All tools return structured JSON with consistent error codes (NOT_INITIALIZED, NOT_FOUND, BROKER_DISCONNECTED)
  • Tool descriptions are the contract — they are what agents read to decide when to call a tool. Write them precisely.
  • Tests: npm test
  • Database: SQLite in .filescope/ within the target project directory; paths stored relative to projectRoot (see Path Storage)