agentsh

May 26, 2026 · View on GitHub

macOS note: Native macOS enforcement via ESF (Endpoint Security Framework) + NE (Network Extension) is in Alpha. It works end-to-end — file, process, and network events flow through the system extension to the Go policy engine — but expect rough edges and breaking changes between releases. For production use today, we recommend Linux.

Windows note: We are working to get the minifilter drivers signed. Until then, only Windows WSL2 mode is fully supported for production use.

Secure, policy-enforced execution gateway for AI agents.

agentsh sits under your agent/tooling—intercepting file, network, process, and signal activity (including subprocess trees), enforcing the policy you define, and emitting structured audit events.

Platform note: Linux provides full enforcement (100% security score). macOS ESF+NE (90% score) is in Alpha — functional but not production-ready. Windows WSL2 provides full Linux-equivalent enforcement (100% score); native Windows via minifilter driver + AppContainer (85% score) is pending driver signing. See the Platform Comparison Matrix for details.


What is agentsh?

  • Drop-in shell/exec endpoint that turns every command (and its subprocesses) into auditable events.
  • Per-operation policy engine: allow, deny, approve (human OK), soft_delete, or redirect.
  • Full I/O visibility:
    • file open/read/write/delete
    • network connect + DNS
    • process start/exit
    • PTY activity
    • LLM API requests with DLP and usage tracking
    • signal send/block (Linux enforced, macOS/Windows audit)
    • database queries via the embedded PostgreSQL proxy — per-statement classification and policy
    • Route outbound HTTP API calls through declared services (http_services) with per-method, per-path rules, approval gating, and fail-closed host enforcement
  • Two output modes:
    • human-friendly shell output
    • compact JSON responses for agents/tools

Why agentsh?

Agent workflows eventually run arbitrary code (pip install, make test, python script.py). Traditional "ask for approval before running a command" controls stop at the tool boundary and can't see what happens inside that command.

agentsh enforces policy at runtime, so hidden work done by subprocesses is still governed, logged, and (when required) approved.


Meaningful blocks: deny → redirect (the "steering" superpower)

Most systems can deny an action. agentsh can also redirect it.

That means when an agent tries the wrong approach (or brute-force workarounds), policy can steer it to the right path by swapping the command and returning guidance—keeping the agent on the paved road and reducing wasted retries.

Example: redirect curl to an audited wrapper

command_rules:
  - name: redirect-curl
    commands: [curl, wget]
    decision: redirect
    message: "Downloads routed through audited fetch"
    redirect_to:
      command: agentsh-fetch
      args: ["--audit"]

Example: redirect writes outside workspace back inside

file_rules:
  - name: redirect-outside-writes
    paths: ["/home/**", "/tmp/**"]
    operations: [write, create]
    decision: redirect
    redirect_to: "/workspace/.scratch"
    message: "Writes outside workspace redirected to /workspace/.scratch"

The agent sees a successful operation (not an error), but you control where things actually land.


Containers + agentsh: better together

Containers isolate the host surface; agentsh adds in-container runtime visibility and policy.

  • Per-operation audit (files, network, commands) shows what happened during installs/builds/tests.
  • Approvals and rules persist across long-lived shells and subprocess trees—not just the first command.
  • Path-level controls on mounted workspaces/caches/creds; containers don't natively give that granularity.
  • Same behavior on host and in containers, so CI and local dev see the same policy outcomes.

Quick start

Install

macOS (Homebrew)

brew tap canyonroad/tap
brew install --cask agentsh

This installs the AgentSH app bundle with the ESF+NE system extension. After installation you'll be prompted to approve the system extension in System Settings > General > Login Items & Extensions.

Linux (from a GitHub Release)

Download the .deb, .rpm, or .apk for your platform from the releases page.

# Example for Debian/Ubuntu
sudo dpkg -i agentsh_<VERSION>_linux_amd64.deb

From source (Linux)

make build
sudo install -m 0755 bin/agentsh bin/agentsh-shell-shim /usr/local/bin

From source (macOS)

# ESF+NE mode (full enforcement — Alpha, requires Xcode 15+)
make build-macos-enterprise

See macOS Build Guide for detailed macOS build instructions.


Run locally

# Start the server (optional if using autostart)
./bin/agentsh server --config configs/server-config.yaml

# Create a session and run a command (shell output)
SID=$(./bin/agentsh session create --workspace . --json | jq -r .id)
./bin/agentsh exec "$SID" -- ls -la

# Structured output for agents
./bin/agentsh exec --output json --events summary "$SID" -- curl https://example.com

Check what gets enforced

agentsh detect probes the host and reports which enforcement primitives are actually available — seccomp, Landlock, FUSE, eBPF, ptrace, cgroups — grouped into per-domain protection scores plus the selected security mode. On restricted hosts (Daytona, E2B, Firecracker-class) where the seccomp user-notify listener can't install, it reports the mode that will actually enforce rather than what the kernel merely supports.

agentsh detect              # human-readable protection report
agentsh detect config       # emit a config tuned for this host

See Security Modes for the mode matrix and tuning knobs.


Tell your agent to use it (AGENTS.md / CLAUDE.md snippet)

## Shell access

- Run commands via agentsh, not directly in bash/zsh.
- Use: `agentsh exec $SID -- <your-command-here>`
- For structured output: `agentsh exec --output json --events summary $SID -- <your-command-here>`
- Get session ID first: `SID=$(agentsh session create --workspace . --json | jq -r .id)`

Autostart (no manual daemon step)

You do not need to start agentsh server yourself.

  • The first agentsh exec (or any shimmed /bin/sh//bin/bash) will automatically launch a local server using configs/server-config.yaml (or AGENTSH_CONFIG if set).
  • That server keeps the FUSE layer and policy engine alive for the session lifetime; subsequent commands reuse it.
  • Set AGENTSH_NO_AUTO=1 if you want to manage the server lifecycle manually.

Use in Docker (with the shell shim)

See Dockerfile.example for a minimal Debian-based image.

Inside the image, install a release package (or copy your build), then activate the shim:

agentsh shim install-shell \
  --root / \
  --shim /usr/bin/agentsh-shell-shim \
  --bash \
  --i-understand-this-modifies-the-host

Point the shim at your server (sidecar or host):

ENV AGENTSH_SERVER=http://127.0.0.1:18080

Now any /bin/sh -c ... or /bin/bash -lc ... in the container routes through agentsh.

Non-interactive enforcement

By default, the shim bypasses policy when stdin is not a TTY (preserving binary data for piped commands). On platforms where commands are always non-interactive but still need enforcement (e.g., exe.dev, sandbox APIs), add --force:

agentsh shim install-shell \
  --root / \
  --shim /usr/bin/agentsh-shell-shim \
  --bash \
  --force \
  --i-understand-this-modifies-the-host

This writes /etc/agentsh/shim.conf with force=true, which the shim reads at startup. The config file works regardless of how the shell is spawned (unlike env vars or profile scripts). AGENTSH_SHIM_FORCE=1 in the process environment achieves the same effect per-process.

Recommended pattern: run agentsh as a sidecar (or PID 1) in the same pod/service and share a workspace volume; the shim ensures every shell hop stays under policy.


Policy model

Decisions

  • allow
  • deny
  • approve (human OK)
  • redirect (swap a command)
  • audit (allow + log)
  • soft_delete (quarantine deletes with restore)

Scopes

  • file operations
  • commands
  • environment vars
  • network (DNS/connect)
  • database (SQL statements via the PostgreSQL proxy)
  • PTY/session settings
  • declared HTTP services

Evaluation

  • first matching rule wins

Rules live in a named policy; sessions choose a policy.

Defaults:

  • sample config: configs/server-config.yaml
  • default policy: configs/policies/default.yaml
  • env override: set AGENTSH_POLICY_NAME to an allowed policy name (no suffix). If unset/invalid/disallowed, the default is used.
  • env policy: configure policies.env_policy (allow/deny, max_bytes, max_keys, block_iteration) and per-command env_* overrides in policy files. Empty allowlist defaults to minimal PATH/LANG/TERM/HOME with built-in secret deny list; set block_iteration to hide env iteration (requires env shim).
  • allowlist: configure policies.allowed in config.yml; empty means only the default is permitted.
  • optional integrity: set policies.manifest_path to a SHA256 manifest to verify policy files at load time.

Environment policy quick reference

  • Defaults: With no env_allow, agentsh builds a minimal env (PATH/LANG/TERM/HOME) and strips built-in secret keys.
  • Overrides: Per-command env_allow/env_deny plus env_max_keys/env_max_bytes cap and filter the child env at exec time.
  • Block iteration: env_block_iteration: true (global or per rule) hides env enumeration; set policies.env_shim_path to libenvshim.so so agentsh injects LD_PRELOAD + AGENTSH_ENV_BLOCK_ITERATION=1.
  • Limits: Errors if limits are exceeded; env builder is applied before exec for every command.
  • env_inject: Operator-trusted environment variables injected into all commands, bypassing policy filtering. Primary use: BASH_ENV to disable shell builtins that bypass seccomp. Configure in sandbox.env_inject (global) or policy-level env_inject (overrides global).
  • Examples: See config.yml and policy samples under configs/.

Example rules (trimmed)

version: 1
name: default

file_rules:
  - name: allow-workspace
    paths: ["/workspace", "/workspace/**"]
    operations: [read, open, stat, list, write, create, mkdir, chmod, rename]
    decision: allow

  - name: approve-workspace-delete
    paths: ["/workspace", "/workspace/**"]
    operations: [delete, rmdir]
    decision: approve
    message: "Delete {{.Path}}?"
    timeout: 5m

  - name: deny-ssh-keys
    paths: ["/home/**/.ssh/**", "/root/.ssh/**"]
    operations: ["*"]
    decision: deny

network_rules:
  - name: allow-api
    domains: ["api.example.com"]
    ports: [443]
    decision: allow

command_rules:
  - name: block-dangerous
    commands: ["rm", "shutdown", "reboot"]
    decision: deny

Using a policy

# Start the server with your policy
./bin/agentsh server --config configs/server-config.yaml

# Create a session pinned to a policy
SID=$(./bin/agentsh session create --workspace /workspace --policy default --json | jq -r .id)

# Exec commands; responses include decision + guidance when blocked/approved
./bin/agentsh exec "$SID" -- rm -rf /workspace/tmp

Authentication

agentsh supports multiple authentication methods:

TypeUse Case
api_keySimple deployments with static keys
oidcEnterprise SSO (Okta, Azure AD, etc.)
hybridBoth methods accepted

Approval modes for human-in-the-loop verification:

  • local_tty - Terminal prompt (default)
  • totp - Authenticator app codes
  • webauthn - Hardware security keys (YubiKey)
  • api - Remote approval via REST

See SECURITY.md for configuration details.

MCP Security

  • Tool Whitelisting: Control which MCP tools can be invoked via allowlist/denylist policies
  • Version Pinning: Detect tool definition changes (rug pull protection) with configurable responses
  • Cross-Server Detection: Block data exfiltration patterns (read from Server A → send via Server B)
  • Rate Limiting: Token bucket rate limiting for MCP servers and network domains

See SECURITY.md for full configuration options, or run the MCP Protection Demo to see these detections in action.


60-second demo

The fastest way to "get it" is to run something that spawns subprocesses and touches the filesystem/network.

# 1) Create a session in your repo/workspace
SID=$(agentsh session create --workspace . --json | jq -r .id)

# 2) Run something simple (human-friendly output)
agentsh exec "$SID" -- uname -a
# → prints system info, just like normal

# 3) Run something that hits the network (JSON output + event summary)
agentsh exec --output json --events summary "$SID" -- curl -s https://example.com
# → JSON response includes: exit_code, stdout, and events[] showing dns_query + net_connect

# 4) Trigger a policy decision - try to delete something
agentsh exec "$SID" -- rm -rf ./tmp
# → With default policy: prompts for approval or denies based on your rules

# 5) See what happened (structured audit trail)
agentsh exec --output json --events all "$SID" -- ls
# → events[] shows every file operation, even from subprocesses

What you'll see in the JSON output:

  • exit_code: the command's exit status
  • stdout / stderr: captured output
  • events[]: every file/network/process operation with policy decisions
  • policy.decision: allow, deny, approve, or redirect

Tip: keep a terminal with --output json open when testing policies—it makes it obvious what's being touched.


Session Reports

Generate markdown reports summarizing session activity:

# Quick summary
agentsh report latest --level=summary

# Detailed investigation
agentsh report <session-id> --level=detailed --output=report.md

Reports include:

  • Decision summary (allowed, blocked, redirected)
  • Automatic findings detection (violations, anomalies)
  • Activity breakdown by category
  • Full event timeline (detailed mode)

See CI/CD Integration Guide for pipeline examples.


Workspace Checkpoints

Create snapshots of workspace state for recovery from destructive operations:

# Create a checkpoint before risky operations
agentsh checkpoint create --session $SID --workspace /workspace --reason "before cleanup"

# List checkpoints for a session
agentsh checkpoint list --session $SID

# Show what changed since a checkpoint
agentsh checkpoint show <cp-id> --session $SID --workspace /workspace --diff

# Preview what rollback would restore (dry-run)
agentsh checkpoint rollback <cp-id> --session $SID --workspace /workspace --dry-run

# Restore workspace to checkpoint state
agentsh checkpoint rollback <cp-id> --session $SID --workspace /workspace

# Clean up old checkpoints
agentsh checkpoint purge --session $SID --older-than 24h --keep 5

Auto-checkpoint: When enabled, agentsh automatically creates checkpoints before risky commands (rm, mv, git reset, git checkout, etc.). Configure in sessions.checkpoints.auto_checkpoint.

See SECURITY.md for full configuration options.


LLM Proxy and DLP

agentsh includes an embedded proxy that intercepts all LLM API requests from agents:

# Check proxy status for a session
agentsh proxy status <session-id>

# View LLM-specific events
agentsh session logs <session-id> --type=llm

Features:

  • Automatic routing: Sets ANTHROPIC_BASE_URL and OPENAI_BASE_URL so agent SDKs route through the proxy
  • Custom providers: Route to LiteLLM, Azure OpenAI, vLLM, or corporate gateways
  • DLP redaction: PII (emails, phone numbers, API keys, etc.) is redacted before reaching LLM providers
  • Custom patterns: Define organization-specific patterns for sensitive data
  • Usage tracking: Token counts extracted and logged for cost attribution
  • Audit trail: All requests/responses logged to session storage

Provider configuration:

proxy:
  mode: embedded
  providers:
    anthropic: https://api.anthropic.com    # Default Anthropic API
    openai: https://api.openai.com          # Default OpenAI API

    # Or use alternative providers:
    # openai: http://localhost:8000         # LiteLLM / vLLM
    # openai: https://your-resource.openai.azure.com  # Azure OpenAI
    # anthropic: https://llm.corp.example.com         # Corporate gateway

DLP configuration:

dlp:
  mode: redact
  patterns:
    email: true
    api_keys: true
  custom_patterns:
    - name: customer_id
      display: identifier
      regex: "CUST-[0-9]{8}"

See LLM Proxy Documentation for full configuration options.

The same proxy also dispatches declared http_services entries — named API upstreams with per-method, per-path rules. See Declared HTTP Services and the HTTP Services Cookbook for details.


Policy Generation

Generate restrictive policies from observed session behavior ("profile-then-lock" workflow):

# Generate policy from latest session
agentsh policy generate latest --output=ci-policy.yaml

# Generate with custom name and threshold
agentsh policy generate abc123 --name=production-build --threshold=10

# Quick preview to stdout
agentsh policy generate latest

The generated policy:

  • Allows only operations observed during the session
  • Groups paths into globs when many files in same directory
  • Collapses subdomains into wildcards (e.g., *.github.com)
  • Flags risky commands (curl, wget, rm) with arg patterns
  • Includes blocked operations as commented-out rules for review

Use cases:

  • CI/CD lockdown: Profile a build/test run, lock future runs to that behavior
  • Agent sandboxing: Let an AI agent run a task, generate policy for future runs
  • Container profiling: Profile a workload, generate minimal policy for production

Database Access (PostgreSQL)

agentsh includes an embedded PostgreSQL proxy that makes database access agent-aware and policy-governed. It speaks the Postgres wire protocol, classifies every statement into a list of effects (reads, writes, DDL, DCL, transaction/session control, bulk COPY/export, …), and evaluates each effect against database_rules before forwarding upstream — so an UPDATE, DROP, or unscoped DELETE is governed the same way a file write or network connect is.

  • Per-effect, multi-object evaluation — rules are collect-all / any-deny-wins (the most-restrictive verb decides), not first-match.
  • Decisions: allow, deny, approve (human OK), audit, and statement-level redirect.
  • require_where guard — refuse top-level UPDATE/DELETE that lack a WHERE clause.
  • Connection-level rules (database_connection_rules) gate which sessions may reach which declared db_service.
  • Auth + statement audit events for every connection and query; statement-text logging is configurable (policies.db.log_statements: none | parameters_redacted | full).

Phase 1 covers the PostgreSQL v3 wire protocol (dialects: postgres, aurora_postgres; redshift / cockroachdb in beta). Replication and GSSAPI-encrypted connections default-deny.

database_rules:
  # normal reads + updates on the declared service
  - name: app-read-and-update
    db_service: appdb
    operations: [READ, UPDATE]
    decision: allow

  # allow UPDATE/DELETE only when scoped by a WHERE clause
  - name: app-guard-unscoped-dml
    db_service: appdb
    operations: [UPDATE, DELETE]
    require_where: true
    decision: allow

  # block schema/DDL mutations; terminate the transaction on violation
  - name: app-deny-ddl
    db_service: appdb
    operations: [CREATE, DROP, ALTER, EXPORT]
    decision: deny
    deny_mode_in_tx: terminate
    message: "appdb is read+update only. Requested: {{.Operation}}"

See the Database Access Control spec for the full operation taxonomy, effects model, connection rules, and unavoidability threat model.


Network Redirect

agentsh can transparently redirect DNS and TCP connections, enabling use cases like routing API calls through corporate proxies or switching AI providers without code changes.

DNS Redirect

Intercept DNS resolution and return configured IP addresses:

dns_redirect:
  - match: "api.anthropic.com"
    redirect_ip: "10.0.0.50"
    visibility: audit_only
    on_failure: fail_closed

  - match: ".*\\.openai\\.com"    # Regex pattern
    redirect_ip: "10.0.0.51"
    visibility: warn

Connect Redirect

Redirect TCP connections to different destinations with optional TLS handling:

connect_redirect:
  - match: "api.anthropic.com:443"
    redirect_to: "vertex-proxy.internal:8443"
    tls_mode: passthrough          # Forward encrypted traffic unchanged
    visibility: silent

  - match: "api.openai.com:443"
    redirect_to: "azure-proxy.internal:443"
    tls_mode: rewrite_sni          # Modify SNI in TLS ClientHello
    rewrite_sni: "azure-openai.example.com"
    visibility: audit_only

Options

FieldValuesDescription
visibilitysilent, audit_only, warnHow redirects are logged/shown
on_failurefail_closed, fail_open, retry_originalWhat happens if redirect fails
tls_modepassthrough, rewrite_sniTLS handling for connect redirect

Platform Support

FeatureLinuxmacOSWindows
DNS Redirect✅ eBPF✅ pf/proxy✅ WinDivert
Connect Redirect✅ eBPF✅ pf/proxy✅ WinDivert
SNI Rewrite

Use Cases

  • API Gateway Routing: Route Anthropic/OpenAI calls through corporate LLM gateway
  • Provider Switching: Redirect Claude API to GCP Vertex AI or Azure OpenAI
  • Testing: Redirect production APIs to mock servers
  • Compliance: Force all LLM traffic through audit proxies

Signal Filtering

agentsh intercepts signals (kill, SIGTERM, etc.) sent between processes, providing policy-based control over which signals can reach which targets.

Platform Support

PlatformBlockingRedirectAudit
LinuxYes (seccomp user-notify)YesYes
macOSNoNoYes (ES)
WindowsPartialNoYes (ETW)

Example Signal Rules

signal_rules:
  # Allow signals to self and children
  - name: allow-self
    signals: ["@all"]
    target:
      type: self
    decision: allow

  - name: allow-children
    signals: ["@all"]
    target:
      type: children
    decision: allow

  # Redirect SIGKILL to graceful SIGTERM
  - name: graceful-kill
    signals: ["SIGKILL"]
    target:
      type: children
    decision: redirect
    redirect_to: SIGTERM

  # Block fatal signals to external processes
  - name: deny-external-fatal
    signals: ["@fatal"]
    target:
      type: external
    decision: deny

Signal Groups

  • @all - All signals (1-31)
  • @fatal - SIGKILL, SIGTERM, SIGQUIT, SIGABRT
  • @job - SIGSTOP, SIGCONT, SIGTSTP, SIGTTIN, SIGTTOU
  • @reload - SIGHUP, SIGUSR1, SIGUSR2

Target Types

  • self - Process signaling itself
  • children - Direct child processes
  • descendants - All descendant processes
  • session - Any process in the agentsh session
  • external - PIDs outside the session
  • system - PID 1 and kernel threads

See Policy Documentation for full configuration options.


macOS File I/O Monitoring

On macOS, agentsh monitors file I/O using the Endpoint Security Framework (ESF), subscribing to both AUTH and NOTIFY events. Tracked operations include file open, create, delete, rename, write (detected via close-modified), and on macOS 26+, chmod and chown via attribute change events. Every file event is attributed to the originating session and command through PID-based resolution, providing full audit trails across subprocess trees.

ESF provides kernel-level allow/deny enforcement but does not support transparent file interception like Linux FUSE. Policy actions that require interception -- such as redirect (path rewriting) and soft_delete (quarantine) -- are implemented as deny + guidance: the operation is blocked at the ESF level and the agent receives instructions to retry with the correct path or to acknowledge that the file is protected. See the macOS ESF+NE architecture doc for event stream details and the policy documentation for per-action behavior.


Starter policy packs

You already have a default policy (configs/policies/default.yaml). These opinionated packs are available as separate files so teams can pick one:

  • policies/dev-safe.yaml: safe for local development

    • allow workspace read/write
    • approve deletes in workspace
    • deny ~/.ssh/**, /root/.ssh/**
    • restrict network to allowlisted domains/ports
  • policies/ci-strict.yaml: safe for CI runners

    • deny anything outside workspace
    • deny outbound network except artifact registries
    • deny interactive shells unless explicitly allowed
    • audit everything (summary events)
  • policies/agent-sandbox.yaml: "agent runs unknown code" mode

    • default deny + explicit allowlist
    • approve any credential/path access
    • redirect network tool usage to internal proxies/mirrors
    • soft-delete destructive operations for easy recovery

AI Assistant Integration Examples

Ready-to-use snippets for configuring AI coding assistants to use agentsh:

  • Claude Code - CLAUDE.md snippet for Claude Code integration
  • Cursor - Cursor rules for agentsh integration
  • AGENTS.md - Generic AGENTS.md snippet (works with multiple AI tools)

Note: These examples are for local development scenarios where running the AI agent inside a container isn't practical. For production or CI/CD environments, prefer running agents in containers with the shell shim installed—see Use in Docker.


References


Created with the help of agents for agents.