June 15, 2026
Hawkeye
The flight recorder for AI agents
Open-source observability, guardrails, and post-mortems for Claude Code, Codex, Cline, and custom agent CLIs.
Why try Hawkeye?
Most agent tooling helps you launch an agent. Hawkeye helps you answer what happened after it started drifting, overspending, touching the wrong files, or failing in a way nobody can explain.
Use it when you want to:
- record exactly what an agent did across terminal, files, network, and LLM calls
- replay a bad run instead of guessing
- detect objective drift before a session goes off the rails
- enforce guardrails around files, commands, directories, cost, and review gates
- compare runs, inspect memory, and generate useful post-mortems
- monitor live tasks, spawned agents, and multi-agent work in one place
Hawkeye is especially useful once your workflow stops being "one CLI prompt in one terminal" and becomes "multiple agents, long-running sessions, real cost, real risk".
Quick Start
Install
Requires Node.js 20+.
npm install -g hawkeye-ai
Or run without installing:
npx hawkeye-ai
Homebrew is also supported:
brew install MLaminekane/hawkeye/hawkeye-ai
If an old install fails with workspace:* or a native SQLite error, reinstall hawkeye-ai@latest.
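Because the install requires Node.js 20+, a small preflight check can catch an outdated runtime before npm fails. The following is a minimal sketch in POSIX shell; `node_version_ok` is a hypothetical helper name, not something Hawkeye ships.

```shell
# Hypothetical helper (not part of Hawkeye): succeeds when a Node.js
# version string such as "v20.11.1" has a major version of 20 or higher.
node_version_ok() {
  version="${1#v}"        # strip the leading "v"
  major="${version%%.*}"  # keep everything before the first dot
  case "$major" in
    ''|*[!0-9]*) return 1 ;;  # reject empty or non-numeric input
  esac
  [ "$major" -ge 20 ]
}

# Usage (commented out so that sourcing this file has no side effects):
# node_version_ok "$(node --version)" && npm install -g hawkeye-ai
```

The check only looks at the major version, which is all the stated requirement depends on.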
First 5 minutes
# 1. Initialize Hawkeye in your repo
hawkeye init
# 2. If you use Claude Code, install hooks once
hawkeye hooks install
# 3. Launch the interactive TUI
hawkeye
# 4. Or record a session directly
hawkeye record -o "review the auth flow" -- codex
# 5. Open the dashboard
hawkeye serve
Then try one of these:
- Start a Claude Code session and let the hooks auto-record it.
- Launch a Codex or Cline run with hawkeye record.
- Open the dashboard and inspect Sessions, Compare, Firewall, and Tasks.
What you get
After one recorded run, Hawkeye gives you:
- a session timeline with commands, file reads/writes, LLM requests, tokens, and cost
- a replayable history with drift score and risk signals
- file-level and cost-level insight into what changed
- analysis tools for "why did this fail?" instead of "I think it probably..."
- a dashboard you can actually use while an agent is running, not just after the fact
The product idea is simple:
If an agent touched your repo, spent money, or got weird, you should be able to inspect it like a real system.
Core Workflows
1. Record any agent CLI
Wrap a command with hawkeye record:
hawkeye record -o "refactor session detail page" -- codex
hawkeye record -o "audit the settings page" -- cline
hawkeye record -o "review this repo" -- my-custom-agent --arg value
Hawkeye records:
- terminal commands and exit codes
- file operations
- network and LLM activity
- tokens and cost when available
- session metadata and timing
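To see how different agents handle the same objective, the wrapping pattern above can be looped over several agent commands. This is a sketch only: `record_each` is a hypothetical helper, and it uses nothing beyond the `hawkeye record -o "..." -- <command>` form shown in this README.

```shell
# Hypothetical helper: record one objective against several agent CLIs,
# one run after another.
# Usage: record_each "<objective>" <agent-command>...
record_each() {
  objective="$1"
  shift
  for agent in "$@"; do
    # Same invocation shape as the documented examples.
    hawkeye record -o "$objective" -- "$agent"
  done
}

# Example (commented out so that sourcing this file starts no agents):
# record_each "review the auth flow" codex cline
```

Each run becomes its own session, so the results can then be inspected side by side on the Compare page.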
2. Use Claude Code with hooks
Claude Code works best with Hawkeye through hooks:
hawkeye hooks install
hawkeye
From the TUI:
- use /new
- choose Claude Code
- enter an objective
- run claude in your terminal
Hawkeye will link the Claude session automatically and record actions through the installed hooks.
Useful commands:
hawkeye hooks install
hawkeye hooks install --guardrails-only
hawkeye hooks status
hawkeye hooks uninstall
3. Monitor tasks and agents
Hawkeye is not just a recorder. It can also drive work:
- Tasks for prompt submission and remote execution
- Agents for live spawned agents, follow-ups, relaunch, and cost tracking
- Swarm for multi-agent orchestration and coordination
This is where Hawkeye starts feeling less like "logging" and more like an operations layer for agent work.
4. Catch drift and enforce guardrails
Hawkeye can score how closely a run still tracks its objective, then warn or stop the run when it drifts.
Built-in guardrail categories include:
- file protection
- command blocking
- directory scope
- cost limits
- token limits
- network lock
- review gates
You can manage guardrails from the dashboard or from policy files.
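To make the idea of a directory-scope guardrail concrete, here is an illustrative check written in POSIX shell. It is not Hawkeye's implementation or policy format, neither of which is shown in this README; `path_in_scope` and the example paths are hypothetical.

```shell
# Illustrative only: succeeds when path "$2" is the scope root "$1"
# or lies beneath it. Paths are compared as plain strings, so ".."
# segments and symlinks are NOT resolved here.
path_in_scope() {
  root="${1%/}"   # drop one trailing slash, if present
  case "$2" in
    "$root"|"$root"/*) return 0 ;;  # the root itself, or anything under it
    *) return 1 ;;
  esac
}

# Example (commented out):
# path_in_scope /repo/src /repo/src/auth/login.ts && echo "allowed"
```

Matching on `"$root"/*` rather than `"$root"*` keeps a sibling such as /repo/srcs from slipping through a scope of /repo/src.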
5. Compare, replay, analyze
The most valuable moment often comes after a bad run:
- Compare shows multiple sessions side by side
- Replay helps inspect what happened step by step
- Analyze gives a root-cause style summary
- Memory shows what the agent retained or hallucinated across runs
Dashboard
Launch it with:
hawkeye serve
Default URL:
http://localhost:4242
Main pages:
- Sessions - browse recent runs, inspect durations, costs, drift, and current activity
- Session Detail - deep timeline, changed files, cost breakdown, replay, export
- Compare - compare runs visually across cost, actions, tokens, duration, drift
- Firewall - watch live actions, blocks, reviews, and impact previews
- Tasks - queue prompts, retry failures, stream output, and monitor daemon work
- Agents - spawn and steer agents live, review outputs, relaunch failures
- Swarm - coordinate multi-agent work and see dependencies/conflicts
- Memory - inspect what an agent appears to remember across sessions
- Settings - configure providers, keys, guardrails, webhooks, autocorrect, and local runtimes
CLI Essentials
Run hawkeye with no subcommand to open the interactive TUI.
The TUI includes a slash-command picker with arrow navigation and search.
Useful commands:
| Command | What it does |
|---|---|
| hawkeye | Open the interactive TUI |
| hawkeye init | Initialize .hawkeye/ in the current repo |
| hawkeye record -o "..." -- <command> | Record a new run around any agent command |
| hawkeye serve | Start the dashboard |
| hawkeye daemon | Run the task daemon |
| hawkeye hooks install | Install Claude Code hooks |
| hawkeye analyze <session-id> | Generate a root-cause style analysis |
| hawkeye replay <session-id> | Replay a session |
| hawkeye compare <id1> <id2> | Compare runs |
| hawkeye report | Generate a morning report |
| hawkeye ci --pr 42 | Post a session report to a GitHub PR |
| hawkeye mcp | Start the MCP server |
Inside the TUI, the most useful slash commands are:
- /new
- /attach
- /sessions
- /inspect
- /compare
- /firewall
- /tasks
- /swarm
- /settings
- /watch
Agent Support
Hawkeye currently fits best with:
- Claude Code via hooks
- Codex
- Cline
- any custom command you want to wrap with hawkeye record
The product is designed to stay useful even when the underlying agent changes. The point is observability and control, not lock-in to a single runtime.
Tasks, Agents, and Swarm
Tasks
Queue a prompt and let the daemon execute it:
hawkeye daemon
hawkeye serve
Then submit tasks from the dashboard.
Good for:
- running prompts from your phone
- retrying failed jobs
- reviewing output after completion
- monitoring long-running work from one place
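The daemon and the dashboard are started by two separate commands, so one way to run them from a single terminal is to background the daemon and keep the dashboard in the foreground. The sketch below assumes both commands are long-running foreground processes, which this README does not state; `start_hawkeye_stack` is a hypothetical name.

```shell
# Hypothetical helper: run the task daemon in the background and the
# dashboard in the foreground, stopping the daemon when the shell exits.
# Note: this replaces any EXIT/INT/TERM traps the calling shell had set.
start_hawkeye_stack() {
  hawkeye daemon &
  daemon_pid=$!
  # Stop the background daemon on exit or interrupt; ignore "no such process".
  trap 'kill "$daemon_pid" 2>/dev/null' EXIT INT TERM
  hawkeye serve
}

# Example (commented out so that sourcing this file starts nothing):
# start_hawkeye_stack
```

Running the two commands in separate terminals, as listed above, works just as well; the helper only saves a window.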
Agents
Spawn a live agent from the dashboard and keep control over:
- role
- runtime
- permissions
- cost
- drift
- follow-up instructions
Swarm
Swarm is for multi-agent work where a single run is not enough.
Typical use cases:
- parallelize a big refactor
- split review vs implementation
- isolate work in separate worktrees
- coordinate merge order and detect conflicts early
DriftDetect
DriftDetect scores whether an agent still looks aligned with its stated objective.
It combines:
- local heuristics
- provider/model-based scoring
- configurable thresholds
- optional auto-pause behavior
You can configure:
- provider
- model
- check frequency
- context window
- warning threshold
- critical threshold
- auto-pause
Local backends are supported:
- Ollama
- LM Studio
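The warning and critical thresholds can be pictured as a simple classification of the drift score. The sketch below assumes an integer score from 0 to 100 where higher means better aligned, and it uses thresholds of 60 and 30 purely for illustration; Hawkeye's actual scale, defaults, and scoring logic may differ. `drift_level` is a hypothetical helper.

```shell
# Illustrative only: map a drift score to a level.
# Assumption: score is an integer 0-100, higher = better aligned.
# Usage: drift_level <score> <warning-threshold> <critical-threshold>
drift_level() {
  score="$1"
  warning="$2"
  critical="$3"
  if [ "$score" -le "$critical" ]; then
    echo critical   # at or below the critical threshold
  elif [ "$score" -le "$warning" ]; then
    echo warning    # at or below the warning threshold
  else
    echo ok
  fi
}

# Example (commented out):
# drift_level 45 60 30
```

Under these assumptions, an auto-pause setting would act on the critical level, while the warning level would only raise a signal.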
Guardrails and Policy
Hawkeye supports both dashboard editing and policy-file workflows.
You can manage rules for:
- protected files
- dangerous commands
- cost ceilings
- token ceilings
- directory scope
- network restrictions
- review gates
Example policy flow:
hawkeye policy init
hawkeye policy show
hawkeye policy check
MCP Server
Start the MCP server over stdio:
hawkeye mcp
This lets MCP-aware agents query Hawkeye for session awareness, memory, and operational context.
Useful when you want agents to become aware of:
- current session state
- correction hints
- memory snapshots
- previous failures
- drift or guardrail signals
CI and Reports
Hawkeye can report AI-generated work back to GitHub PRs.
Example:
hawkeye ci --pr 42
It can post:
- a Check Run
- a PR comment
- risk, drift, cost, and session summaries
- replay links back to the dashboard
There is also a reusable GitHub Action in this repo for CI workflows.
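In a CI script, the PR number usually arrives from the environment, so it is worth validating before it is passed on. A minimal sketch, assuming a hypothetical `PR_NUMBER` variable supplied by the CI system; `report_to_pr` is likewise a hypothetical helper, and only the `hawkeye ci --pr <number>` form comes from this README.

```shell
# Hypothetical helper: post a Hawkeye report to a PR, but only when the
# argument looks like a PR number (digits only).
report_to_pr() {
  case "$1" in
    ''|*[!0-9]*)
      echo "not a PR number: '$1'" >&2
      return 1
      ;;
  esac
  hawkeye ci --pr "$1"
}

# Example (commented out); PR_NUMBER is an assumed environment variable:
# report_to_pr "$PR_NUMBER"
```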
Configuration
Hawkeye stores project config under:
.hawkeye/config.json
Main config areas:
- drift
- guardrails
- apiKeys
- webhooks
- autocorrect
- recording
- dashboard
You can configure local providers for Ollama and LM Studio, plus API-backed providers like OpenAI, Anthropic, and DeepSeek.
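Scripts that depend on project config can first check that the file exists at the documented path. This is a small sketch; `has_hawkeye_config` is a hypothetical helper, and it checks only for the file's presence, not its contents.

```shell
# Hypothetical helper: succeeds when the given repo directory (default:
# the current directory) contains .hawkeye/config.json.
has_hawkeye_config() {
  [ -f "${1:-.}/.hawkeye/config.json" ]
}

# Example (commented out):
# has_hawkeye_config || echo "no Hawkeye config found; consider running: hawkeye init"
```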
Architecture
This repo is a monorepo:
packages/
├── cli/        CLI, daemon, server, hooks, MCP integration
├── core/       recorder, interceptors, drift engine, storage
└── dashboard/  React dashboard
High-level flow:
Agent command
-> Hawkeye recorder / hooks
-> interceptors + storage
-> drift + guardrails
-> dashboard / replay / compare / reports
Development
From source:
git clone https://github.com/MLaminekane/hawkeye.git
cd hawkeye
pnpm install
pnpm build
Useful commands:
pnpm dev
pnpm build
pnpm test
pnpm --filter hawkeye-ai build
pnpm --filter @hawkeye/dashboard build
Requirements:
- Node.js 20+
- pnpm
Why Hawkeye feels different
A lot of agent tooling stops at "launch a model and hope for the best."
Hawkeye is more opinionated:
- it assumes agent work should be inspectable
- it treats cost and drift as first-class signals
- it gives you guardrails before damage, not just logs after damage
- it works across CLI, dashboard, tasks, agents, and swarm instead of leaving those as separate tools
If you are already serious enough about agents to care about reliability, cost, and auditability, Hawkeye is worth trying.
License
MIT