📟 agentwatch

May 26, 2026

Five AI agents on one machine, and no idea what any of them just did.

agentwatch is one local timeline for every coding agent you run: Claude Code, Codex, Gemini CLI, Cursor, Hermes, OpenClaw. What they ran, what it cost, and when they went off the rails. All local: no cloud, no telemetry, no sign-in.

Install • First 60 seconds • Features • MCP • Compare


🤖 Reading this as an AI agent? Go straight to AGENTS.md: it self-onboards you in three steps (install → verify → run), no account needed.

agentwatch: doctor detects your agents, then a live multi-agent timeline in the TUI

The TUI is the live tail; the web UI is where you drill in: projects, sessions, token charts, compaction sparklines, call graphs, diff attribution, replay, anomaly triage. Both run in one process. Press w in the TUI to open the browser.

agentwatch web UI: unified timeline of AI-agent events with per-event risk, type, and session


Why this exists

You run several AI coding agents on one laptop. Claude Code in a terminal, Codex alongside it, Cursor as your IDE, maybe Gemini CLI for a quick review, maybe an OpenClaw sub-agent churning on a long task. Every one of them has its own log file, its own permission model, its own idea of what a "session" is. None of them tells you what the others are doing.

When something goes wrong (a file rewritten unexpectedly, a spend spike, an rm you don't remember running), you're piecing it together from five JSONLs and guessing.

claude-devtools does this well for Claude Code. agentwatch does it for the whole multi-agent stack, in the terminal, with zero infrastructure and zero network.


Why this over claude-devtools if you run multiple agents?

Short, factual diff. claude-devtools is a great tool for Claude-only workflows. If you only use Claude Code, it's probably the better pick. agentwatch is the answer when you run more than one agent on the same machine and want one timeline + one cost ledger + one alerting surface across all of them.

| What | claude-devtools | agentwatch |
|---|---|---|
| Claude Code coverage | ✅ full | ✅ full |
| Codex coverage | ❌ | ✅ tokens + tools + cost + compaction |
| Gemini CLI coverage | ❌ | ✅ tokens + tools + cost |
| OpenClaw coverage | ❌ | ✅ tokens + cost |
| Hermes Agent coverage | ❌ | ✅ tokens + tools + cost (SQLite) |
| Cursor coverage | ❌ | 🟡 config level |
| Per-agent budget alarms | ❌ | ✅ session + daily caps |
| Statistical anomaly detection (loops / spikes) | rule-based only | ✅ MAD z-score + period-1-to-4 loops |
| OpenTelemetry exporter (gen_ai.*) | ❌ | ✅ Jaeger / Tempo / Grafana ready |
| MCP server: agents query their own history | ❌ | ✅ 5 tools over stdio |
| User-defined regex/threshold triggers | ❌ | ✅ live-reloaded |
| Install | Homebrew / Electron ~150 MB | npm i -g · 220 KB · TUI |
| Data boundary | local | local |

If "every agent on one pane of glass + programmatic access via MCP + pipeline-friendly OTel" matches your setup, agentwatch is the tool. If you're Claude-only and want the Electron polish, claude-devtools is still excellent.


Install

npm i -g @misha_misha/agentwatch
agentwatch

Requires:

  • Node ≥ 20 (tested on 20 + 22 in CI)
  • macOS or Linux (Windows intentionally out of scope for v0.x)

Published under the @misha_misha npm scope, because the unscoped agentwatch name was already taken by a CyberArk tool. The installed binary on your $PATH is simply agentwatch.


First 60 seconds

agentwatch doctor   # detects installed agents + readiness
agentwatch          # TUI live-tail + web UI at http://127.0.0.1:3456
agentwatch serve    # web UI only (remote boxes / server cron)
agentwatch mcp      # runs the MCP stdio server (for agents, not humans)
agentwatch --help

Flags:

  • --no-web: TUI only, don't start the web server
  • --port <n> / --host <addr>: override web server bind
  • AGENTWATCH_PORT=… AGENTWATCH_HOST=…: env equivalents

doctor output looks like:

workspace: /Users/you/IdeaProjects

agents:
  ● Claude Code        installed (events captured)
  ● Codex              installed (events captured)
  ● Gemini CLI         installed (events captured)
  ● Hermes Agent       installed (events captured)
  ● Cursor             installed (config-level only)
  ● OpenClaw           installed (events captured)
  ○ Aider              not detected
  ○ Cline (VS Code)    not detected

Launch agentwatch and every event your agents emit streams in. The TUI shows a live tail; the web UI at http://127.0.0.1:3456 is where you drill in: projects, sessions, token charts, SVG call graphs, diff attribution, prompt replay, trends. Press w in the TUI to open it.

Web UI map

| Route | What it is |
|---|---|
| / | Live timeline (SSE-streamed) with agent + type filters |
| /projects | Grid of detected projects + cost + session counts |
| /projects/:name | Sessions table for one project |
| /sessions/:id | Chronological event list · export .md / .json |
| /sessions/:id/tokens | Stacked-area token chart per turn |
| /sessions/:id/compaction | Context fill % over time + compaction markers |
| /sessions/:id/graph | Call graph (d3-hierarchy SVG), click nodes to drill |
| /sessions/:id/diffs | Writes paired with the prompt that triggered them |
| /sessions/:id/replay | Edit prompt → re-run the agent in single-turn exec |
| /search | Unified search (live / cross / semantic) |
| /agents | Grid of every supported agent + install status |
| /permissions | Per-agent permission config |
| /cron | OpenClaw cron jobs + heartbeats |
| /trends | Cost, cache-hit ratio, events per agent (30d default) |
| /settings/{budgets,anomaly,triggers} | Form editors for ~/.agentwatch/*.json |

⌘K / Ctrl+K opens the command palette. / focuses the timeline filter.


Agent coverage

What actually works per agent, as of v0.0.3. Features not listed here work across every agent (timeline, export, syntax highlighting, notifications, triggers, search, stale detection, clipboard yank).

| Feature | Claude Code | Codex | Gemini CLI | Cursor | OpenClaw | Hermes |
|---|---|---|---|---|---|---|
| Live events on timeline | ✅ | ✅ | ✅ | 🟡 | ✅ | ✅ |
| Token usage + cost | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ |
| Tool call + result pairing | ✅ | ✅ | ✅ | ❌ | 🟡 | ✅ |
| Per-turn token attribution | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ |
| Budget alarms (session + day) | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ |
| Anomaly detection (cost/loops) | ✅ | ✅ | ✅ | 🟡 | ✅ | ✅ |
| Compaction visualizer | ✅ | ✅ | ❌ | n/a | ❌ | ❌ |
| Permissions view | ✅ | ✅ | ✅ | ✅ | ✅ | n/a |
| Cross-session search | ✅ | ✅ | ✅ | ❌ | ❌ | 🟡 |
| Subagent drilldown | ✅ | n/a | 🟡 | n/a | 🟡 | 🟡 |
| Replay (agent-aware exec) | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ |
| Agent memory file overhead | CLAUDE.md | AGENTS.md | GEMINI.md | .cursorrules | OPENCLAW.md | SOUL.md |
| OTel span coverage | ✅ | ✅ | ✅ | 🟡 | ✅ | 🟡 |
| MCP server exposes history | ✅ | ✅ | ✅ (raw) | ❌ | ❌ | ❌ |

  • Cursor exposes config state (MCP servers, .cursorrules, approval mode, sandbox) but its actual AI activity lives in a SQLite database we haven't parsed yet. A thin read-only adapter is a follow-up.
  • Gemini CLI doesn't persist context-compaction markers to disk, so compaction detection is Claude + Codex only.
  • OpenClaw doesn't persist tool_result content or compaction markers to its JSONL. This is a structural limit of what's on disk, not an adapter gap.
  • Hermes Agent (by Nous Research, the OpenClaw successor with a closed learning loop) persists sessions to ~/.hermes/state.db (SQLite + FTS5). The adapter polls the DB over chokidar + 2s safety-net and emits the full session/prompt/response/tool-call stream. Replay re-runs single turns via hermes chat -q <prompt> -Q --max-turns 1.

Features

Live multi-agent timeline

Main screen. Every event your agents emit, ordered by event timestamp (not arrival order, so backfill from different sessions merges correctly). Columns: time · agent · type · [project] summary · duration · error.

09:54:01  openclaw     response       [content_agent] <think> Checked the KB…
09:52:53  claude-code  response       [example] Commit bddc363. q now exits instantly…
09:52:48  codex        shell_exec     [dataset_research] ls -la · 12ms
09:52:43  claude-code  tool_call      [example] Edit: src/ui/App.tsx · 7ms
09:51:51  gemini       file_write     [landing] write_file: public/llms.txt
09:51:51  claude-code  tool_call      [example] Agent: Competitive landscape ▸ 52 child events

Rows with an anomaly get a red ◎ prefix on the type column.
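Ordering by event timestamp rather than arrival order can be sketched as a sorted insert into a capped buffer. This is a minimal illustration, not agentwatch's actual code: the TimelineEvent shape and the insertByTimestamp helper are hypothetical, and the 500-event cap is borrowed from the live buffer size mentioned elsewhere in this README.

```typescript
// Hypothetical minimal event shape; the canonical one lives in src/schema.ts.
interface TimelineEvent {
  id: string;
  agent: string;
  ts: number; // event timestamp, epoch milliseconds
}

// Insert an event so the buffer stays sorted newest-first by event timestamp,
// regardless of the order in which events arrive (live tail vs. backfill).
function insertByTimestamp(
  buffer: TimelineEvent[],
  ev: TimelineEvent,
  cap = 500,
): TimelineEvent[] {
  let lo = 0;
  let hi = buffer.length;
  // Binary search for the first slot holding an event older than the new one.
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (buffer[mid].ts >= ev.ts) lo = mid + 1;
    else hi = mid;
  }
  const next = [...buffer.slice(0, lo), ev, ...buffer.slice(lo)];
  return next.slice(0, cap); // drop the oldest events past the cap
}
```

Returning a new array instead of mutating in place keeps the buffer easy to use from a reducer-style UI.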

Event detail pane

Press Enter on any row. Opens a full-screen pane with:

  • Metadata (time, agent, type, tool, path, cmd)
  • Tokens / cost / duration (in=6 cache_create=25508 cache_read=16827 out=353 · $0.08 (claude-opus-4-6) · 151ms)
  • Tool result (stdout for Bash, file content for Read/Write, search matches for Grep) with syntax highlighting inferred from the tool + file extension
  • Full prompt or response text
  • Extended thinking block when present
  • Tool input JSON

Scrollable with ↑↓ or j/k. esc closes.

Subagent drilldown

Parent Agent tool_use events show ▸ 52 child events. Press x to scope the timeline to only that subagent's inner tool calls. X unscopes. Applies to Claude Code (Task tool) and partially to OpenClaw (per-agent delegation) and Gemini (subagent sessions).

Project + session navigation

P → projects grid (one workspace per row, across all agents)
     ↓ enter → sessions list (grouped Today / Yesterday / 7d / Older)
             ↓ enter → scoped timeline

Projects grid aggregates across agents: per-agent session counts, total cost, last activity. esc walks back one level.

Press ? for a fuzzy-substring search across every session file on disk (~/.claude, ~/.codex, ~/.gemini). Uses ripgrep if installed, falls back to a native scan. Enter on a hit scopes the timeline to that session.

Different from in-buffer search:

  • / searches the 500-event live buffer
  • ? searches every session file ever written

Per-session cost with cache accounting

Naive token counters are 3–10× wrong on Claude because cache_read is billed at 10% of input and cache_creation at 125%. agentwatch ships a per-model rate table (Claude opus/sonnet/haiku, GPT-5 / GPT-5-mini, Gemini 2.5 Pro/Flash) and computes true USD cost per turn. Cost shows:

  • Per-agent total in the side panel
  • Per-event in the detail pane
  • Per-session in the sessions list
  • Aggregate in the session's token attribution view ([t])
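To make the cache accounting concrete, here is a minimal sketch of a per-turn cost calculation using the 10% and 125% multipliers stated above. The TurnUsage and ModelRate shapes, the function names, and the rates used in the usage note are assumptions for illustration; they are not agentwatch's actual rate table.

```typescript
// Token counts as reported in a model's usage record for one turn.
interface TurnUsage {
  input: number;       // fresh (uncached) input tokens
  cacheRead: number;   // tokens served from the prompt cache
  cacheCreate: number; // tokens written to the prompt cache
  output: number;
}

// Hypothetical rate entry in USD per million tokens; real rates vary by model.
interface ModelRate {
  inputPerMTok: number;
  outputPerMTok: number;
}

const CACHE_READ_MULTIPLIER = 0.1;    // cache reads billed at 10% of input
const CACHE_CREATE_MULTIPLIER = 1.25; // cache writes billed at 125% of input

// Cache-aware cost: weight each input-side bucket by its billing multiplier.
function turnCostUsd(u: TurnUsage, rate: ModelRate): number {
  const inputEquivalent =
    u.input +
    u.cacheRead * CACHE_READ_MULTIPLIER +
    u.cacheCreate * CACHE_CREATE_MULTIPLIER;
  return (inputEquivalent * rate.inputPerMTok + u.output * rate.outputPerMTok) / 1_000_000;
}

// What a naive counter does: bill every input-side token at the full input rate.
function naiveCostUsd(u: TurnUsage, rate: ModelRate): number {
  const allInput = u.input + u.cacheRead + u.cacheCreate;
  return (allInput * rate.inputPerMTok + u.output * rate.outputPerMTok) / 1_000_000;
}
```

With a made-up rate of 10 USD input and 50 USD output per million tokens, a turn with 1,000 fresh input, 10,000 cache-read, 2,000 cache-create, and 500 output tokens costs about 0.07 USD cache-aware versus about 0.155 USD naively. The gap widens as the cache-read share grows.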

Per-turn token attribution ([t])

Inside a scoped session, press t. Stacked bar per turn showing:

  • user: the preceding prompt (tokenized with gpt-tokenizer)
  • memory file: CLAUDE.md / AGENTS.md / GEMINI.md / .cursorrules / etc., read from the session's cwd
  • tool I/O: tool_input JSON + tool_result text
  • thinking: extended thinking block
  • input (fresh) / cache read / cache create / output: exact from the model's own usage record

Compaction visualizer ([C])

Inside a scoped session, press C. Horizontal bar of context fill % across turns, with ⋈ markers where the agent auto-compacted. Selected compaction shows before / after token counts and the dropped-token delta. Works on Claude Code (via isCompactSummary) and Codex (via event_msg/turn_truncated).
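The arithmetic behind the bar is simple enough to sketch. The helpers below are hypothetical; the 200000 default and the AGENTWATCH_CONTEXT_WINDOW variable come from the Configuration section, while the choice of which token counts make up "context tokens" is left to the caller and is an assumption here.

```typescript
const DEFAULT_CONTEXT_WINDOW = 200_000;

// Read the window size from an env-like map, falling back to the default
// when the variable is missing, empty, or not a positive number.
function contextWindow(env: Record<string, string | undefined>): number {
  const parsed = Number(env.AGENTWATCH_CONTEXT_WINDOW);
  return Number.isFinite(parsed) && parsed > 0 ? parsed : DEFAULT_CONTEXT_WINDOW;
}

// Context fill as a percentage of the window, clamped to 100.
function fillPct(contextTokens: number, window: number): number {
  return Math.min(100, (contextTokens / window) * 100);
}

// Tokens dropped by a compaction: before minus after, never negative.
function droppedTokens(before: number, after: number): number {
  return Math.max(0, before - after);
}
```

For example, 150,000 context tokens in a 200,000-token window is a 75% fill, and a compaction from 180,000 to 40,000 tokens drops 140,000.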

Budget alarms

~/.agentwatch/budgets.json:

{ "perSessionUsd": 5, "perDayUsd": 20 }

Red banner in the Header when either cap is crossed; OS notification fires once per crossing. No kill switch: we don't control agents; we just shout.
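A minimal sketch of the "fire once per crossing" behavior, assuming running totals are computed elsewhere. The BudgetWatcher class and its method names are hypothetical, and treating "crossed" as strictly greater than the cap is an assumption.

```typescript
// Mirrors the two keys shown in budgets.json above; both are optional.
interface Budgets {
  perSessionUsd?: number;
  perDayUsd?: number;
}

type Cap = "session" | "day";

class BudgetWatcher {
  private readonly budgets: Budgets;
  private readonly alreadyFired = new Set<string>();

  constructor(budgets: Budgets) {
    this.budgets = budgets;
  }

  // Returns only the caps newly crossed by these running totals, so a caller
  // can notify once per session and once per day rather than on every event.
  check(sessionId: string, day: string, sessionTotalUsd: number, dayTotalUsd: number): Cap[] {
    const fired: Cap[] = [];
    const { perSessionUsd, perDayUsd } = this.budgets;
    if (perSessionUsd !== undefined && sessionTotalUsd > perSessionUsd) {
      if (this.markOnce(`session:${sessionId}`)) fired.push("session");
    }
    if (perDayUsd !== undefined && dayTotalUsd > perDayUsd) {
      if (this.markOnce(`day:${day}`)) fired.push("day");
    }
    return fired;
  }

  // True the first time a key is seen, false afterwards.
  private markOnce(key: string): boolean {
    if (this.alreadyFired.has(key)) return false;
    this.alreadyFired.add(key);
    return true;
  }
}
```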

Anomaly detection

Three detectors, all fully local, all running on the 500-event buffer:

  • MAD z-score outliers on cost, duration, and input tokens per agent (|z| > 3.5 by default; tune in ~/.agentwatch/anomaly.json)
  • Stuck-loop detector with periods 1–4: catches A-A-A-… and A-B-A-B-… "apologize and retry" loops
  • Per-session rollup + OS notification on first flag + timeline ◎ marker
    • [D] to dismiss the banner
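The two statistical detectors can be sketched as follows. This uses the standard modified z-score (0.6745 × deviation from the median, divided by the median absolute deviation), which is commonly paired with a 3.5 threshold; whether agentwatch uses exactly this formula is an assumption. The function names and the default of 3 repeats are likewise hypothetical.

```typescript
function median(xs: number[]): number {
  const s = [...xs].sort((a, b) => a - b);
  const mid = s.length >> 1;
  return s.length % 2 ? s[mid] : (s[mid - 1] + s[mid]) / 2;
}

// Modified z-score: 0.6745 * (x - median) / MAD.
// Returns 0 when there is no data or no spread, so nothing gets flagged.
function madZScore(x: number, samples: number[]): number {
  if (samples.length === 0) return 0;
  const med = median(samples);
  const mad = median(samples.map((v) => Math.abs(v - med)));
  if (mad === 0) return 0;
  return (0.6745 * (x - med)) / mad;
}

// Returns the smallest period p in 1..maxPeriod such that the tail of `keys`
// is a p-length pattern repeated at least minRepeats times, else null.
// A key would typically be something like tool name plus normalized input.
function stuckLoopPeriod(keys: string[], maxPeriod = 4, minRepeats = 3): number | null {
  for (let p = 1; p <= maxPeriod; p++) {
    const need = p * minRepeats;
    if (keys.length < need) continue;
    const tail = keys.slice(-need);
    if (tail.every((k, i) => k === tail[i % p])) return p;
  }
  return null;
}
```

An event would be flagged when Math.abs(madZScore(value, history)) exceeds the configured threshold, or when stuckLoopPeriod returns a non-null period. MAD is used instead of a standard deviation because one expensive turn in the history does not inflate the baseline.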

User-defined notification triggers

~/.agentwatch/triggers.json, live-reloaded via chokidar:

[
  { "match": "curl .* \\| (bash|sh)", "title": "pipe-to-shell", "body": "{{agent}}: {{cmd}}" },
  { "type": "file_write", "pathMatch": "^/etc/", "title": "/etc write" },
  { "thresholdUsd": 0.5, "title": "expensive turn", "body": "cost {{cost}}" }
]

Placeholders: {{agent}} {{type}} {{cmd}} {{path}} {{tool}} {{summary}} {{cost}}.
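How a rule like the ones above might be evaluated can be sketched as below. The field names mirror the JSON example, but the semantics are assumptions: every condition present on a rule must hold, match is tested against the command (falling back to the summary), and thresholdUsd fires at or above the value. The Trigger and TriggerEvent types and both function names are hypothetical.

```typescript
interface Trigger {
  match?: string;        // regex source, tested against cmd or summary (assumed)
  type?: string;         // exact event type
  pathMatch?: string;    // regex source, tested against path
  thresholdUsd?: number; // minimum cost of the event
  title: string;
  body?: string;
}

interface TriggerEvent {
  agent: string;
  type: string;
  cmd?: string;
  path?: string;
  tool?: string;
  summary?: string;
  cost?: number;
}

// A rule fires only if every condition it declares is satisfied.
function fires(t: Trigger, ev: TriggerEvent): boolean {
  if (t.type !== undefined && t.type !== ev.type) return false;
  if (t.match !== undefined && !new RegExp(t.match).test(ev.cmd ?? ev.summary ?? "")) return false;
  if (t.pathMatch !== undefined && !new RegExp(t.pathMatch).test(ev.path ?? "")) return false;
  if (t.thresholdUsd !== undefined && (ev.cost ?? 0) < t.thresholdUsd) return false;
  return true;
}

// Replace {{name}} placeholders with event fields; unknown or missing fields
// render as an empty string.
function render(template: string, ev: TriggerEvent): string {
  return template.replace(/\{\{(\w+)\}\}/g, (_match, key: string) => {
    const value = (ev as unknown as Record<string, unknown>)[key];
    return value === undefined ? "" : String(value);
  });
}
```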

Desktop notifications

Built-in alerts fire on sensitive events: .env access, ~/.ssh / ~/.aws / ~/.gnupg paths, rm -rf, sudo, curl | sh, tool errors, budget breach, anomaly. Rate-limited (60s per rule key). Silent during backfill.
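The 60-second per-rule rate limit can be sketched as a small keyed limiter. The NotificationLimiter class is hypothetical; only the 60s window and the per-rule-key granularity come from the text above.

```typescript
class NotificationLimiter {
  private readonly lastFired = new Map<string, number>();
  private readonly windowMs: number;

  constructor(windowMs = 60_000) {
    this.windowMs = windowMs;
  }

  // Returns true (and records the time) if this rule key has not fired
  // within the window; otherwise returns false and leaves the state alone.
  shouldFire(ruleKey: string, nowMs: number): boolean {
    const last = this.lastFired.get(ruleKey);
    if (last !== undefined && nowMs - last < this.windowMs) return false;
    this.lastFired.set(ruleKey, nowMs);
    return true;
  }
}
```

Passing the clock in as nowMs (for example Date.now() at the call site) keeps the limiter deterministic and easy to test.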

Platform dispatch: osascript on macOS, notify-send on Linux, PowerShell MessageBox on Windows. Zero third-party dependencies.

Per-agent permission surface ([p])

Scrollable view showing:

  • Claude Code: allow / deny / defaultMode; flagged risks (Bash(*), missing .ssh denies, auto / bypass modes in red)
  • Codex: config.toml projects + trust_level; latest session's sandbox_policy, approval_policy, writable_roots, network_access, model
  • Gemini CLI: auth type, selected model, tool allow/block lists, trusted folders
  • Cursor: approval mode, sandbox state, MCP servers, discovered .cursorrules
  • OpenClaw: default workspace + per-sub-agent (name, emoji, model, workspace)

Session export ([e])

From a session list or scoped timeline, press e. Writes ./agentwatch-export/<agent>-<session>-<ts>.md (human-readable transcript with tool calls as fenced blocks) and .json (raw events). Path copied to clipboard.

Syntax highlighting in the detail pane

cli-highlight (tiny ANSI highlighter) applies to:

  • Tool input JSON
  • Tool result when the tool is Bash or the file extension is known (.ts, .py, .rs, .go, etc.)
  • Fenced blocks in user/assistant text

Stale-session detection

Sessions and projects idle for > 5 minutes render dimmed with a ⊘ stale badge. Un-greys on the next event.

Clipboard yank ([y])

Copies the most useful payload (tool result > full text > cmd / path / summary). Uses pbcopy, wl-copy / xclip / xsel, or clip. Confirmation flashes at the footer.


Keyboard reference

Press ? anytime to open this inside the TUI.

| Key | Action |
|---|---|
| ↑ ↓ / j k | move selection in the timeline |
| Enter | open event detail pane |
| esc | close current view / clear selection |
| P | projects grid |
| Enter on project | sessions list for that project |
| Enter on session | scoped timeline for that session |
| q / Ctrl-C | quit |

Filter & scope

| Key | Action |
|---|---|
| / | in-buffer search (last 500 events) |
| ? | cross-session search (every session file on disk) |
| f | cycle agent filter |
| a | toggle agent side panel |
| x | drill selected Agent event into its subagent run |
| X | unscope subagent |
| A | clear project filter |
| Z | clear all filters |

Actions

| Key | Action |
|---|---|
| y | yank selected event content to clipboard |
| e | export current session to .md + .json |
| space | pause / resume live event stream |
| c | clear event buffer |
| D | dismiss the current anomaly banner |

Info overlays (only in a scoped session)

| Key | Action |
|---|---|
| t | per-turn token attribution |
| C | context compaction visualizer |
| p | permissions view (works anywhere) |

Configuration

Three config files, all optional. Loaded on startup; triggers reload live.

| File | Purpose |
|---|---|
| ~/.agentwatch/triggers.json | User-defined notification rules (live-reloaded) |
| ~/.agentwatch/budgets.json | perSessionUsd / perDayUsd spend caps |
| ~/.agentwatch/anomaly.json | zScore, loopWindow, loopMinRepeats, minSamples |

Environment variables:

| Variable | Default | Purpose |
|---|---|---|
| WORKSPACE_ROOT | ~/IdeaProjects (fallback) | Where the generic filesystem watcher looks for edits |
| AGENTWATCH_CONTEXT_WINDOW | 200000 | Tokens per window, used by compaction % calculation |
| AGENTWATCH_OTLP_ENDPOINT | unset | Enables the OTel exporter when set |
| NO_COLOR | unset | Standard honoring: disables ANSI colors if set |

Workspace fallback chain (used when WORKSPACE_ROOT isn't set): ~/IdeaProjects → ~/src → ~/code → ~/Projects → ~/dev → $HOME.
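A minimal sketch of resolving that chain, assuming it picks the first directory that exists on disk (the README does not spell out the selection rule). The resolveWorkspaceRoot function is hypothetical, and the environment, home directory, and existence check are injected so the logic stays pure.

```typescript
// Directory names tried under the home directory, in order.
const FALLBACK_DIRS = ["IdeaProjects", "src", "code", "Projects", "dev"];

function resolveWorkspaceRoot(
  env: Record<string, string | undefined>,
  home: string,
  exists: (path: string) => boolean,
): string {
  // An explicit WORKSPACE_ROOT always wins.
  if (env.WORKSPACE_ROOT) return env.WORKSPACE_ROOT;
  // Otherwise take the first candidate that exists (assumed rule).
  for (const dir of FALLBACK_DIRS) {
    const candidate = `${home}/${dir}`;
    if (exists(candidate)) return candidate;
  }
  // Nothing matched: fall back to the home directory itself.
  return home;
}
```

In a Node program the arguments would typically be process.env, homedir() from node:os, and existsSync from node:fs.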


What agentwatch reads

Read-only. agentwatch writes to exactly three places: your terminal, the clipboard (on explicit y), and disk (on explicit e to export).

| Path | What |
|---|---|
| ~/.claude/projects/**/*.jsonl | Claude Code session transcripts |
| ~/.claude/projects/**/subagents/*.jsonl | Claude Code Task-spawned subagents |
| ~/.claude/settings.json | Claude permissions |
| ~/.codex/sessions/**/rollout-*.jsonl | Codex session transcripts |
| ~/.codex/config.toml | Codex permissions + trust levels |
| ~/.gemini/tmp/**/chats/*.json | Gemini CLI transcripts + tool calls |
| ~/.gemini/settings.json + trustedFolders.json | Gemini permissions |
| ~/.openclaw/agents/*/sessions/*.jsonl | OpenClaw sub-agent sessions |
| ~/.openclaw/logs/config-audit.jsonl + openclaw.json | OpenClaw config audit + agent roster |
| ~/.hermes/state.db (SQLite) | Hermes Agent sessions + messages |
| ~/.cursor/{mcp.json, cli-config.json, ide_state.json} | Cursor config state |
| Any .cursorrules / .cursor/rules/*.mdc under WORKSPACE | Cursor project rules |
| {CLAUDE,AGENTS,GEMINI,OPENCLAW}.md + .windsurfrules etc. | Per-agent memory files for token attribution |
| ~/.agentwatch/*.json | User config (triggers / budgets / anomaly) |
| $WORKSPACE_ROOT tree | Filesystem change events |

SECURITY.md carries the authoritative list and details of what is not read.


MCP server mode

Run agentwatch as an MCP server so other agents can query their own history. Install:

claude mcp add agentwatch -- npx -y @misha_misha/agentwatch mcp
# or edit ~/.claude.json / ~/.cursor/mcp.json manually

Tools exposed:

| Tool | Args | Returns |
|---|---|---|
| list_recent_sessions | limit?: 1-100 | [{agent, sessionId, project, lastActivity, sizeBytes}] |
| get_session_events | sessionId, maxBytes?: 1K-10M | Raw JSONL (tail-capped) for that session |
| search_sessions | query, limit?: 1-50 | [{session, agent, line}] substring hits |
| get_tool_usage_stats | sessionId?, limit?: 1-500 | Per-tool counts, totalDurationMs, errorCount |
| get_session_cost | sessionId | {totalCostUsd, turns, tokens, byModel} |

See docs/features/mcp-server.md.


OpenTelemetry exporter

Set AGENTWATCH_OTLP_ENDPOINT=http://localhost:4318/v1/traces to emit OTLP/HTTP spans for every agent event. Uses the OpenTelemetry GenAI semantic conventions so any consumer (Jaeger, Tempo, Honeycomb, Grafana) can interpret the data without custom dashboards.

Attributes emitted:

  • gen_ai.system (anthropic | openai | google | cursor | …)
  • gen_ai.operation.name (chat | tool_use | context_compaction | …)
  • gen_ai.request.model / gen_ai.response.model
  • gen_ai.usage.input_tokens / gen_ai.usage.output_tokens
  • gen_ai.tool.name / gen_ai.tool.call.id
  • error.type on tool errors
  • agentwatch.session.id / agentwatch.cost_usd
  • agentwatch.cache_read_tokens / agentwatch.cache_create_tokens / agentwatch.cache_hit_ratio
  • agentwatch.context.fill_pct
  • agentwatch.risk_score

OTel deps are loaded dynamically only when the env var is set, so there is zero runtime cost when disabled.
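As an illustration of how an event could map onto the attribute names listed above, here is a sketch that builds a plain attribute record without touching any OpenTelemetry SDK. The CostedEvent shape and the toSpanAttributes function are hypothetical; only the attribute keys are taken from the list.

```typescript
// Hypothetical, already-costed event; not agentwatch's canonical schema.
interface CostedEvent {
  sessionId: string;
  system: string;    // e.g. "anthropic"
  operation: string; // e.g. "chat"
  model: string;
  inputTokens: number;
  outputTokens: number;
  cacheReadTokens: number;
  cacheCreateTokens: number;
  costUsd: number;
}

type SpanAttributes = Record<string, string | number>;

// Map one event to span attributes: gen_ai.* keys for the standard fields,
// agentwatch.* keys for the cache and cost extensions.
function toSpanAttributes(ev: CostedEvent): SpanAttributes {
  return {
    "gen_ai.system": ev.system,
    "gen_ai.operation.name": ev.operation,
    "gen_ai.request.model": ev.model,
    "gen_ai.usage.input_tokens": ev.inputTokens,
    "gen_ai.usage.output_tokens": ev.outputTokens,
    "agentwatch.session.id": ev.sessionId,
    "agentwatch.cost_usd": ev.costUsd,
    "agentwatch.cache_read_tokens": ev.cacheReadTokens,
    "agentwatch.cache_create_tokens": ev.cacheCreateTokens,
  };
}
```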


How it compares

| | agentwatch | claude-devtools | Claudex | ccflare | Langfuse / Phoenix |
|---|---|---|---|---|---|
| Runs locally only | ✅ | ✅ | ✅ | ✅ | self-host possible |
| Multi-agent | ✅ Claude, Codex, Gemini, Cursor (config), OpenClaw | Claude only | Claude only | Claude only | production LLM apps |
| Real token + cost with cache | ✅ | ✅ | 🟡 | ✅ (proxy-level) | ✅ |
| Per-turn token attribution | ✅ | ✅ | ❌ | ❌ | ❌ |
| Compaction visualizer | ✅ | ✅ | ❌ | ❌ | ❌ |
| Anomaly detection | ✅ MAD + stuck-loop | rule-based only | ❌ | ❌ | ❌ |
| Budget alarms w/ OS notification | ✅ | ❌ | ❌ | ❌ | ❌ |
| User triggers (regex/threshold) | ✅ live-reload | ❌ | ❌ | ❌ | ❌ |
| OTel exporter (gen_ai.*) | ✅ | ❌ | ❌ | ❌ | ✅ (its own format) |
| MCP server (self-query) | ✅ | ❌ | ✅ | ❌ | ❌ |
| Permission surface view | ✅ 5 agents | ❌ | ❌ | ❌ | ❌ |
| Subagent drilldown | ✅ | ✅ | ❌ | ❌ | ✅ (LangChain-specific) |
| Install | npm i -g | Homebrew / Electron | npm i -g | Bun repo | Docker + Postgres |
| UI | TUI (Ink) | Electron + standalone | Web UI | Web + TUI | Web |
| Telemetry | none | none | none | none | opt-in |

Three moats are genuinely unique: anomaly detection (statistical, not rule-based), budget alarms, and OTel with gen_ai.* conventions.


Limitations

  • agentwatch is a viewer, not a daemon. It captures events only while the TUI is running. A background-capture daemon is planned.
  • Backfill is bounded. On launch we read the last ~4 MB of each active session file (roughly hundreds of events). For long gaps on very active sessions, earliest events may fall out of the backfill window. Keep agentwatch open in a tmux pane for zero gaps.
  • Cursor activity is config-level only. Cursor's AI activity lives in a SQLite database we don't parse yet. We capture config changes + .cursorrules + MCP servers + .cursor/rules/*.mdc. Full activity parsing is a follow-up.
  • Gemini and OpenClaw have data-structure gaps. Gemini CLI doesn't persist compaction markers to disk. OpenClaw doesn't persist tool_result content or compaction markers. Not fixable from our side.
  • Windsurf, Aider, Cline are detected but not instrumented yet.
  • macOS and Linux only. Windows needs more chokidar + notifier testing before we promise it.
  • The tokenizer is cl100k_base (gpt-tokenizer), which is ~5% off for Claude. Exact tokens for input / cache / output come from the model's own usage record; the ~5% approximation only affects the user / thinking / tool I/O / memory-file categories in the attribution view.

Non-goals

Hard scope boundaries so agentwatch stays small and maintainable.

  • Not cloud. Not SaaS. Not ever.
  • Not an agent itself. It watches agents; it doesn't take actions.
  • Not production LLM-app tracing. Langfuse owns that.
  • Not enterprise compliance. Anthropic's Compliance API covers that.
  • Not orchestration. Use Mission Control / Stoneforge for running agents in parallel.
  • Not memory. Use claude-mem.
  • Not governance / policy enforcement. Use DashClaw / Castra.

Architecture

TypeScript monorepo. Three-layer mental model:

┌───────────────────────────────────────────────────────────────┐
│  TUI layer  (ink / React)                                     │
│    Timeline · EventDetail · Permissions · Projects            │
│    Sessions · Tokens · Compaction · CrossSearch · Header      │
│                                                               │
│  MCP server  (stdio; programmatic, not a UI)                  │
│    list_recent_sessions · get_session_events                  │
│    search_sessions · get_tool_usage_stats · get_session_cost  │
└─────────────────────────▲─────────────────────────────────────┘
                          │  EventSink.emit / enrich
┌─────────────────────────┴─────────────────────────────────────┐
│  Adapter layer  (one per agent)                               │
│    claude-code · codex · gemini · cursor · openclaw · hermes  │
│    fs-watcher (generic)                                       │
└─────────────────────────▲─────────────────────────────────────┘
                          │  files read-only
┌─────────────────────────┴─────────────────────────────────────┐
│  OS  (log files, config files, clipboard, notifier)           │
└───────────────────────────────────────────────────────────────┘
  • Adapters read files, translate raw log lines into canonical AgentEvents, emit through an EventSink.
  • EventSink.enrich(id, patch) lets an adapter update a previously-emitted event (e.g. when a tool_result arrives late and needs to attach duration + output to the original tool_use).
  • The TUI is a pure reducer over the event buffer. Filtering, search, scope are derived views β€” no mutation.
  • The MCP server is a peer of the TUI: it reads the same session files on demand, via its own scan (no shared in-memory state with the TUI). This is a known duplication; see Linear for the refactor ticket.
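The emit / enrich contract described above can be sketched with an in-memory sink. This is an illustration only: the AgentEvent fields shown here and the InMemorySink class are hypothetical (the canonical shape is in src/schema.ts), and returning a boolean from enrich is a design choice made for the example.

```typescript
// Hypothetical subset of an event; the real schema has more fields.
interface AgentEvent {
  id: string;
  agent: string;
  type: string;
  ts: number;
  durationMs?: number;
  toolResult?: string;
}

class InMemorySink {
  private readonly events = new Map<string, AgentEvent>();

  // Record a new event as soon as the adapter sees it.
  emit(ev: AgentEvent): void {
    this.events.set(ev.id, ev);
  }

  // Patch a previously emitted event, e.g. when a tool_result arrives late.
  // Returns false if the id is unknown so the adapter can decide what to do.
  enrich(id: string, patch: Partial<Omit<AgentEvent, "id">>): boolean {
    const existing = this.events.get(id);
    if (!existing) return false;
    this.events.set(id, { ...existing, ...patch });
    return true;
  }

  // Fresh array each call; derived views filter it without mutating state.
  snapshot(): AgentEvent[] {
    return [...this.events.values()];
  }
}
```

A filtered view is then just sink.snapshot().filter((e) => e.agent === "codex"), which leaves the underlying events untouched.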

See src/schema.ts for the canonical event shape.


Development

git clone https://github.com/mishanefedov/agentwatch.git
cd agentwatch
npm install
npm run dev           # launch the TUI directly from source (tsx)
npm test              # vitest, 97 tests
npm run typecheck     # strict TypeScript
npm run build         # tsup → dist/

See CONTRIBUTING.md for the contribution workflow.

Docs

  • docs/features/: feature specs (scope, inputs, outputs, failure modes). Being extended feature-by-feature.
  • docs/testing/: manual test procedures + a pre-release walkthrough.
  • docs/use-cases/: multi-agent triage, cost-overrun investigation, security audit, stuck-loop detection, subagent post-mortem, .env leak alert.

Security

Local-first is a hard invariant.

  • Zero network calls unless you explicitly set AGENTWATCH_OTLP_ENDPOINT (to a host you chose, OTel output only).
  • Zero telemetry. Not opt-in, not opt-out, simply not there.
  • All files read-only except the clipboard (on y) and ./agentwatch-export/ (on e).
  • Every path agentwatch reads is documented in SECURITY.md.

Report vulnerabilities privately via a Security Advisory.


License

MIT © Misha Nefedov. See LICENSE.


If agentwatch saves you a debugging hour, a ⭐ on the repo makes the effort worth it.