Integrations

June 7, 2026

Part of the AgentBox docs. Start at CLAUDE.md. Planning context: integrations_backlog.md (the four-service plan). Per-task tracker for Notion: notion_backlog.md. The user-facing page is apps/web/content/docs/integrations-notion.mdx (published at https://agent-box.sh/docs/integrations-notion).

This is the design / reference doc for the host-side integrations spine — the box-to-host bridge that lets an in-box agent read tickets/docs from Notion (and, in future, Linear / Trello / ClickUp) and make a small, prompted set of writes, without ever holding the service's credentials inside the box. The shape mirrors the existing gh and git relay flows exactly.

Why this exists

The host owns the credentials. The box is the untrusted side. A box agent should be able to read tickets/docs freely (a search, a GET) and write with the user's per-call approval (a page.create, a comment.add), but the token must never enter the box. The model is the one we already proved with gh:

  • An in-box shim (gh-shim) intercepts a strict subcommand allowlist and forwards through agentbox-ctl.
  • agentbox-ctl POSTs /rpc on the box-local relay (bearer-authed, see host-relay.md).
  • The relay classifies the op as read or write. Reads pass; writes go through askPrompt (host approval), then shell out to the host's authenticated CLI. The token stays on the host.

Integrations generalize this for any host CLI: each service is one connector descriptor in @agentbox/integrations, and the relay's integration.<service>.<op> dispatcher walks the same path.

Where the gate lives

The gate lives in the relay, not in the box. The in-box ctl is unprivileged; it sends an RPC and waits for a verdict. The relay (a host process) is the only thing that runs the host CLI, and it's the only thing that consults the per-project integrations.<svc>.enabled flag, the op's read/write classification, the op's refuseCall pre-flight, and askPrompt for writes. One check covers every caller — the shim, the notion/ntn alias, a direct agentbox-ctl integration invocation, a future host-initiated one-time token. See "gate at the host boundary" in the user feedback notes.

The connector descriptor

packages/integrations/src/types.ts defines two types:

export interface IntegrationConnector {
  service: IntegrationService;                // 'notion' (more later)
  hostBin: string;                            // 'ntn'
  detect: {                                   // T3 doctor probes
    versionArgs: readonly string[];
    authArgs?: readonly string[];
    installHint?: string;                     // shown by `agentbox doctor` when missing
    loginHint?: string;                       // shown when unauthed
  };
  env?: Readonly<Record<string, string>>;     // forced env vars; <SERVICE>_* only
  ops: Readonly<Record<string, IntegrationOp>>;
}

export interface IntegrationOp {
  write: boolean;                             // false = read, true = gated write
  buildArgv?: (args: readonly string[]) => string[];   // shape user argv → host CLI argv
  refuseCall?: (args: readonly string[]) => IntegrationOpRefusal | null;
}

Pure data + small predicates. No I/O at import time, so unit tests stay pure. The descriptor file (packages/integrations/src/connectors/notion.ts) is the single source of truth for the box surface, the relay's allowlist, and (since T3) the doctor's install/login hint strings.

A registry.ts exports getConnector(service) and ALL_CONNECTORS. Adding a service is a new descriptor file + a one-line registry add. No relay change, no ctl change.
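As a rough illustration of how a descriptor and the registry fit together, the sketch below re-declares simplified versions of the two types and registers a cut-down Notion connector. The `IntegrationOpRefusal` shape, the `--version` probe argument, the `Sketch`-suffixed names, and the `undefined` return for an unknown service are assumptions made for the example, not taken from the real files.

```typescript
// Simplified, hypothetical stand-ins for packages/integrations/src/types.ts.
type IntegrationService = "notion" | "linear";

interface IntegrationOpRefusal {
  exitCode: number; // assumed shape
  message: string;
}

interface IntegrationOp {
  write: boolean;
  buildArgv?: (args: readonly string[]) => string[];
  refuseCall?: (args: readonly string[]) => IntegrationOpRefusal | null;
}

interface IntegrationConnector {
  service: IntegrationService;
  hostBin: string;
  detect: { versionArgs: readonly string[]; installHint?: string; loginHint?: string };
  env?: Readonly<Record<string, string>>;
  ops: Readonly<Record<string, IntegrationOp>>;
}

// A cut-down descriptor: pure data plus small argv-shaping functions, no I/O.
const notionSketch: IntegrationConnector = {
  service: "notion",
  hostBin: "ntn",
  detect: { versionArgs: ["--version"] }, // assumed probe flag
  ops: {
    whoami: { write: false, buildArgv: () => ["whoami"] },
    "page.create": { write: true, buildArgv: (args) => ["pages", "create", ...args] },
  },
};

// Registry: adding a service means adding one entry here.
const ALL_CONNECTORS_SKETCH: readonly IntegrationConnector[] = [notionSketch];

function getConnectorSketch(service: string): IntegrationConnector | undefined {
  return ALL_CONNECTORS_SKETCH.find((c) => c.service === service);
}
```

With this shape, `getConnectorSketch("notion")?.ops["page.create"]` yields a write op whose `buildArgv` prefixes the user's arguments with `pages create`, and an unregistered service name simply misses the lookup.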

env-var namespace guard

packages/relay/src/integrations.ts:mergeConnectorEnv enforces that a descriptor can only set env vars in its own <SERVICE>_* namespace (so a careless descriptor can never set PATH or AGENTBOX_PROMPT). A misconfigured descriptor returns a typed exit-78 envelope rather than silently disabling the relay's gate or rewriting PATH. No connector currently declares an env override — the relay runs each host CLI with its own default auth. (ntn reads the macOS keychain after ntn login; linear reads ~/.config/linear/credentials.toml.) The field stays as an opt-in escape hatch for a future CLI that needs specific env to resolve its auth.
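A minimal sketch of what such a namespace guard could look like, assuming the rule is a simple `<SERVICE>_` prefix check on every override key. The function name, result shape, and message text are hypothetical; only the exit code 78 and the prefix rule come from the description above.

```typescript
type EnvMergeResult =
  | { ok: true; env: Record<string, string | undefined> }
  | { ok: false; exitCode: 78; message: string };

// Merge a connector's env overrides on top of a base env, refusing any key
// outside the connector's own <SERVICE>_* namespace.
function mergeConnectorEnvSketch(
  service: string,
  base: Readonly<Record<string, string | undefined>>,
  overrides: Readonly<Record<string, string>> = {},
): EnvMergeResult {
  const prefix = `${service.toUpperCase()}_`;
  for (const key of Object.keys(overrides)) {
    if (!key.startsWith(prefix)) {
      return {
        ok: false,
        exitCode: 78,
        message: `connector '${service}' may only set ${prefix}* env vars (got '${key}')`,
      };
    }
  }
  // All keys are in-namespace: overrides win over the base env.
  return { ok: true, env: { ...base, ...overrides } };
}
```

Under this sketch, a Notion descriptor setting `NOTION_KEYRING` merges cleanly, while one that tries to set `PATH` gets the exit-78 envelope and nothing is merged.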

The relay dispatch flow

packages/relay/src/integrations.ts is the spine. The dispatcher in packages/relay/src/server.ts (docker) and packages/relay/src/host-actions.ts (cloud) calls into it for any method starting with integration.. Per the "fix across all providers" rule, both paths share the exact same handler.

For integration.<service>.<op>:

  1. parseIntegrationMethod splits on the first two dots; dotted ops (page.create) keep their dot. Unknown shape → exit 64.
  2. getConnector(service) — unknown service → exit 64.
  3. op allowlist — unknown op → exit 65, with the list of available ops.
  4. worktree resolve: params.path → which registered worktree (cwd for the host CLI spawn).
  5. refuseIntegrationCall(op, args) — runs the op's refuseCall pre-flight (e.g. notion.api's GET-only check). Refused → exit 65, before any host process is spawned.
  6. refuseIfIntegrationDisabled(service, cwd) — re-reads the layered config every call (so a flag flip takes effect without bouncing the relay; same approach as loadAutopauseConfig). Disabled → exit 65 with a config-hint. Runs before any host probe / prompt so a disabled integration is never user-visible as a permission prompt.
  7. assertIntegrationReady(connector) — cached for 60s per hostBin. Probes <hostBin> <versionArgs> to make sure the binary exists. Missing → exit 127. Failed version → propagate exit.
  8. Write gating. For op.write === true:
    • If params.hostInitiated is set, validate it against HostInitiatedTokens (scope + params-hash bound). A present-but-invalid token is a hard reject (attack signal — exit 10).
    • Otherwise (or for any unbound write) askPrompt(...) blocks until the host answers y / n. Denied → exit 10.
    • Read ops skip both gates entirely.
  9. runHostIntegration spawns the host binary in the worktree's hostMainRepo, with the connector's env merged on top of process.env (subject to the namespace guard). Returns the standard {exitCode, stdout, stderr} envelope.
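The steps above can be sketched as two small pure functions: one that splits the method name the way step 1 describes, and one that returns the exit code of the first gate that fails, in the listed order. Both are illustrative; the names, the `GateInputs` shape, and the use of 0 to mean "proceed to spawn" are assumptions, and the real handler performs I/O (config reads, probes, prompts) that is reduced to booleans here.

```typescript
interface ParsedMethod {
  service: string;
  op: string;
}

// Step 1: "integration.<service>.<op>", where <op> may itself contain dots.
function parseIntegrationMethodSketch(method: string): ParsedMethod | null {
  const prefix = "integration.";
  if (!method.startsWith(prefix)) return null;
  const rest = method.slice(prefix.length);
  const dot = rest.indexOf(".");
  // Need a non-empty service and a non-empty op.
  if (dot <= 0 || dot === rest.length - 1) return null;
  return { service: rest.slice(0, dot), op: rest.slice(dot + 1) };
}

// Steps 2-8 reduced to precomputed booleans (hypothetical shape).
interface GateInputs {
  connectorKnown: boolean;
  opKnown: boolean;
  refusedByPreflight: boolean;
  enabled: boolean;
  hostBinPresent: boolean;
  isWrite: boolean;
  approved: boolean; // valid host-initiated token or an approved prompt
}

// Returns the exit code of the first failing gate, or 0 to proceed to the spawn.
function firstFailingGateSketch(g: GateInputs): number {
  if (!g.connectorKnown) return 64;
  if (!g.opKnown) return 65;
  if (g.refusedByPreflight) return 65;
  if (!g.enabled) return 65;
  if (!g.hostBinPresent) return 127;
  if (g.isWrite && !g.approved) return 10;
  return 0;
}
```

Note the ordering consequence: a disabled integration returns 65 before the approval check is ever consulted, which matches the point that a disabled integration never surfaces as a permission prompt.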

Read vs write — the Notion op surface

packages/integrations/src/connectors/notion.ts carries the current allowlist. Intentionally minimal — start conservative, widen as real agent flows surface needs.

| Op | Class | Host argv | Notes |
| --- | --- | --- | --- |
| whoami | read | ntn whoami | dedicated op so the agent doesn't need to widen the api allowlist. |
| api | read | ntn api <args> | refuseUnsafeApiCall: GET to any endpoint; POST only to v1/search, v1/databases/{id}/query, v1/data_sources/{id}/query (body via -d <JSON> / inline path:=json). Method inferred from -d/inline body, -X/--method overrides; other methods/POSTs + --input/--file refused. |
| page.create | write | ntn pages create <args> | gated by askPrompt. (User-facing shim form: ntn pages create ….) |
| page.update | write | ntn pages update <args> | gated; covers archive + props. (User-facing shim form: ntn pages update ….) |

comment.add is intentionally absent — ntn exposes no top-level comment subcommand. The only path is ntn api v1/comments -X POST -d …, which the api op refuses (POST is allowed only for the read endpoints — search + database/data-source query — not v1/comments). Comment creation needs a Notion-API-aware payload assembler that maps CLI flags to the structured POST /v1/comments body; tracked as a follow-up in notion_backlog.md. The in-box shim rejects notion comment add … with a clear "deferred" message.
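To make the api op's rules concrete, here is a rough sketch of a pre-flight with the behaviour the table describes: GET anywhere, POST only to the three read endpoints, method inferred from a body unless -X/--method is given, and --input/--file refused. The argv parsing, the names, and the message text are assumptions for illustration; the real refuseUnsafeApiCall may parse ntn's arguments differently.

```typescript
interface ApiRefusal {
  exitCode: 65;
  message: string;
}

// POST is tolerated only for endpoints that are reads despite the verb.
const READ_ONLY_POST_ENDPOINTS: readonly RegExp[] = [
  /^\/?v1\/search$/,
  /^\/?v1\/databases\/[^/]+\/query$/,
  /^\/?v1\/data_sources\/[^/]+\/query$/,
];

function refuseApi(reason: string): ApiRefusal {
  return { exitCode: 65, message: `notion api: ${reason}` };
}

function refuseUnsafeApiCallSketch(args: readonly string[]): ApiRefusal | null {
  const queue = [...args];
  let explicitMethod: string | undefined;
  let hasBody = false;
  let endpoint: string | undefined;

  while (queue.length > 0) {
    const arg = queue.shift() as string;
    // File/stdin bodies are refused outright.
    if (/^--(input|file)(=|$)/.test(arg)) return refuseApi(`'${arg}' is not proxied`);
    if (arg === "-X" || arg === "--method") { explicitMethod = queue.shift(); continue; }
    if (arg.startsWith("--method=")) { explicitMethod = arg.slice("--method=".length); continue; }
    if (arg === "-d") { hasBody = true; queue.shift(); continue; } // skip the JSON value
    if (arg.includes(":=")) { hasBody = true; continue; }           // inline path:=json body
    if (!arg.startsWith("-") && endpoint === undefined) endpoint = arg;
  }

  // An explicit method wins; otherwise a body implies POST.
  const method = (explicitMethod ?? (hasBody ? "POST" : "GET")).toUpperCase();
  if (method === "GET") return null;
  const target = endpoint ?? "";
  if (method === "POST" && READ_ONLY_POST_ENDPOINTS.some((re) => re.test(target))) return null;
  return refuseApi(`detected method '${method}'`);
}
```

Under these assumptions, `["v1/search", "-d", "{}"]` passes as a read, while `["v1/comments", "-X", "POST", "-d", "{}"]` is refused with exit 65 before anything is spawned, which is why comment creation cannot ride on the api op.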

The enable flag

integrations.notion.enabled (typed config, default false) lives in packages/config/src/types.ts (UserConfig, EffectiveConfig, BUILT_IN_DEFAULTS, KEY_REGISTRY). The config parser/merger/writer were taught to walk three-level nested keys (branch.subbranch.leaf) for this, so the YAML reads naturally. Layered the usual way: CLI > workspace > project > global > built-in.

Toggle per project:

agentbox config set --project integrations.notion.enabled true

Default off so every box ships the shim but it's inert until the user opts in — no surprise box→host calls.
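The layering can be pictured as a first-match lookup over an ordered list of layers, falling back to the built-in default. The sketch below is a simplification under stated assumptions: each layer is a flat map keyed by the full dotted key (the real config is nested YAML), and the function and variable names are hypothetical.

```typescript
type ConfigLayer = Readonly<Record<string, unknown>>;

// Layers are passed highest-precedence first: CLI, workspace, project, global.
function resolveLayeredSketch(
  key: string,
  layers: readonly ConfigLayer[],
  builtIn: ConfigLayer,
): unknown {
  for (const layer of layers) {
    if (Object.prototype.hasOwnProperty.call(layer, key)) return layer[key];
  }
  return builtIn[key];
}

const builtInDefaults: ConfigLayer = { "integrations.notion.enabled": false };

// Example: only the project layer sets the flag.
const cli: ConfigLayer = {};
const workspace: ConfigLayer = {};
const project: ConfigLayer = { "integrations.notion.enabled": true };
const global: ConfigLayer = {};

const notionEnabled = resolveLayeredSketch(
  "integrations.notion.enabled",
  [cli, workspace, project, global],
  builtInDefaults,
);
```

Here `notionEnabled` resolves from the project layer; with every layer empty, the lookup falls through to the built-in `false`, which is the inert default.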

In-box surface

packages/sandbox-docker/scripts/ntn-shim is the gh-shim pattern: strict subcommand allowlist (whoami, api, page, …) → exec agentbox-ctl integration notion <op> -- "$@". Anything off the allowlist is rejected with a clear message. The same shim is symlinked at /usr/local/bin/notion so the agent can type either name.

Staging follows the canonical pattern (see the feedback-canonical-dockerfile-box-location memory):

  • Listed in contextFiles + execBitFiles in apps/cli/scripts/stage-runtime.mjs.
  • COPY'd into Dockerfile.box next to the gh-shim / git-shim block.
  • Mirrored into packages/sandbox-hetzner/scripts/install-box.sh, packages/sandbox-vercel/scripts/provision.sh, and packages/sandbox-e2b/scripts/build-template.sh, plus the matching src/runtime-assets.ts upload lists. Daytona stays shim-less.

Doctor

agentbox doctor reports each integration in a dedicated integrations: group, driven off ALL_CONNECTORS (no hardcoded 'notion' in the doctor — Linear/Trello will light up automatically when they ship). Per connector:

  • Disabled (default, layered config) → [info] notion disabled (enable with `agentbox config set --project integrations.notion.enabled true`). info is a new status that rolls up like ok so a disabled integration never pushes the overall doctor status to "warn".
  • Enabled + binary missing → [warn] notion ntn not installed (install ntn: https://developers.notion.com/reference/notion-cli). Hint string comes from connector.detect.installHint.
  • Enabled + binary present + unauthed → [warn] notion not logged in (ntn login). Hint from connector.detect.loginHint.
  • Enabled + binary present + authed → [ ok ] notion ntn version X.Y.Z · authed.
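The four transitions and the roll-up rule can be sketched as follows. This is an illustration only: the `ConnectorProbe` shape, the function names, and the detail strings are hypothetical, and the real doctor may carry further statuses beyond the three shown.

```typescript
type CheckStatus = "ok" | "info" | "warn";

interface ConnectorProbe {
  service: string;
  enabled: boolean;
  binaryPresent: boolean;
  authed: boolean;
}

// Map one connector's probe result to a doctor line status.
function integrationCheckSketch(p: ConnectorProbe): { status: CheckStatus; detail: string } {
  if (!p.enabled) return { status: "info", detail: `${p.service} disabled` };
  if (!p.binaryPresent) return { status: "warn", detail: `${p.service} host CLI not installed` };
  if (!p.authed) return { status: "warn", detail: `${p.service} not logged in` };
  return { status: "ok", detail: `${p.service} authed` };
}

// "info" rolls up like "ok": only a "warn" degrades the overall status.
function rollUpSketch(statuses: readonly CheckStatus[]): "ok" | "warn" {
  return statuses.some((s) => s === "warn") ? "warn" : "ok";
}
```

A host with Notion disabled and Linear fully set up would produce `["info", "ok"]`, which still rolls up to "ok".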

The doctor auth probe runs each connector's CLI with no forced env, exactly as the relay does. So a host's real authed state — the macOS keychain after ntn login — is what's reported, and doctor can't show "authed" for an auth path the relay wouldn't actually use. See the comment in apps/cli/src/lib/doctor-checks.ts:integrationsChecks.

The live ntn host probe is the orchestrator's post-merge check — it can't be verified inside an AgentBox box because the real ntn isn't installed there. The unit test (apps/cli/test/doctor-integrations.test.ts) stubs a fake ntn on PATH so the four status transitions are exercised in CI.

Carry-based file-auth for nested boxes

For the nested-box dev path (box → box, exercise the integration from inside a box), the host's ntn auth is carried into the box as a file. agentbox.yaml's carry: block ships ~/.config/notion/auth.json into the box; the host must have been logged in file-mode (NOTION_KEYRING=0 ntn login) for that file to exist, and the in-box ntn may need NOTION_KEYRING=0 exported to read it (the connector no longer forces the env — see docs/development.md). This is internal-dev only; normal boxes carry no Notion credential and reach ntn purely through the host relay. Even on the nested path the token lives only at the leaf hop, never in the agent's process env (printenv | grep -i notion shows nothing).

Carry is host→box and one-prompt-approved (see carry: in features.md). T4 wires the actual e2e verification.

Verification / live e2e results

T4 ran the integration against the live Notion API from inside a real box. Captured evidence:

  • notion whoami (read) — round-trips through the in-box shim → host relay → host ntn → Notion API; returns the host bot identity with no approval prompt.
  • notion api v1/users/me (read) — same path; returns the host bot's JSON identity record. No prompt.
  • notion api … -X POST and --method PATCH (refused) — the connector's refuseApiNonGet correctly classifies these as writes and blocks them before any host process is spawned: notion api: only GET is proxied (use page.create / page.update for writes); detected method 'POST', exit 65.
  • No agent-side credential: printenv | grep -i notion in the box returns nothing in the agent's environment. The token lives only on the host. The carried ~/.config/notion/auth.json file is for nested-box relay hosts and never reaches the agent's process env.
  • Connector argv bug (fixed in T4) — a live notion pages create through the host relay (rebuilt from T3 code) failed with error: unrecognized subcommand 'page'. tip: some similar subcommands exist: 'update', 'pages'. Real ntn's surface is api datasources files pages login logout whoami workers. The connector's buildArgv was building singular ['page', 'create', …]; T4 changed it to ['pages', 'create', …] and ['pages', 'update', …]. Live write round-trip against the fix lands after the host relay rebuilds with the merged T4 code.
  • agentbox config get nested-key bug (fixed in T4): config get integrations.notion.enabled was returning <unset> even though config set + loadEffectiveConfig worked correctly, because apps/cli/src/commands/config.ts split keys on the FIRST dot only. T4 replaced the helpers with a walkKey function that walks all segments (mirrors readLeaf in packages/config/src/load.ts). New regression test apps/cli/test/config-get-nested.test.ts.
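For the nested-key fix in the last item, a segment-by-segment walk might look like the sketch below. It is not the actual walkKey from apps/cli/src/commands/config.ts; the name, signature, and the choice to return `undefined` for a missing path are assumptions.

```typescript
// Walk every dot-separated segment of a key through a nested config object.
function walkKeySketch(root: unknown, dottedKey: string): unknown {
  let node: unknown = root;
  for (const segment of dottedKey.split(".")) {
    if (typeof node !== "object" || node === null) return undefined;
    const record = node as Record<string, unknown>;
    if (!Object.prototype.hasOwnProperty.call(record, segment)) return undefined;
    node = record[segment];
  }
  return node;
}

const exampleConfig = { integrations: { notion: { enabled: true } } };
const enabledLeaf = walkKeySketch(exampleConfig, "integrations.notion.enabled");
const missingLeaf = walkKeySketch(exampleConfig, "integrations.linear.enabled");
```

A first-dot-only split would look up the literal key "notion.enabled" under "integrations" and find nothing, which is consistent with the <unset> symptom; walking all three segments reaches the leaf.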

Nested-box e2e — deferred, not blocking

The nested-box scenario (a box-inside-a-box running a notion op through this box's relay) was time-boxed in T4 and deferred. Architecturally, the in-box agentbox-ctl daemon (port 8788) forwards /rpc to the HOST relay (host.docker.internal:8787), not to a relay running in this box — so a nested box's notion pages create would still terminate at the host relay's spawn, not in this box's daemon. That means nested-box e2e exercises the carry mechanics (already verified — ~/.config/notion/ present in this box) more than the connector's spawn path. A future follow-up that lifts the relay into the box's daemon would change this; tracked under "Open follow-ups" below.

Cross-provider parity

integration.<service>.<op> is dispatched identically on docker and cloud because the wire shape is method-agnostic. The cloud path long-polls /bridge/poll, runs executeCloudAction → runIntegrationRpc, which reuses the exact handler. The Hetzner / Vercel / E2B image flows all ship the ntn / notion shim; Daytona stays shim-less (see "In-box surface" above). No provider-specific code in the integrations spine.

Linear

The Linear path of the integrations foundation, shipped under LT1 (descriptor-only — no relay/ctl core change, validating the abstraction the Notion work built). The connector descriptor lives in packages/integrations/src/connectors/linear.ts; in-box shim at packages/sandbox-docker/scripts/linear-shim. Backed by @schpet/linear-cli (the linear binary, v2). Tracker: linear_backlog.md.

Op surface

packages/integrations/src/connectors/linear.ts carries the current allowlist. Same starter-conservative shape as Notion's: reads pass through, writes go through askPrompt.

| Op | Class | Host argv | Notes |
| --- | --- | --- | --- |
| whoami | read | linear auth whoami | identity only; never linear auth token (see below). |
| issue.list | read | linear issue list | |
| issue.mine | read | linear issue mine | v2-native "issues assigned to me" (the old list --me path was dropped upstream). |
| issue.view | read | linear issue view | |
| issue.query | read | linear issue query | structured filters. |
| team.list | read | linear team list | |
| api | read | linear api <query> | GraphQL query passthrough; refuseGraphqlNonQuery rejects mutation / subscription and any --variable key=@<path> host-file load. |
| issue.create | write (gated) | linear issue create | |
| issue.update | write (gated) | linear issue update | status/title/etc. |
| issue.comment | write (gated) | linear issue comment add | @schpet/linear-cli v2 uses add (not create). |

The auth-token exclusion (key security invariant)

linear auth token PRINTS the raw API token to stdout. It is never on the allowlist:

  • The shim hard-rejects linear auth token with 'auth token' leaks the raw API key — refused. Use 'linear whoami' for identity. (exit 2). Same hard-reject for auth login / auth logout / auth migrate / auth default — the host owns auth state.
  • The connector exposes no op that maps to auth token. The only auth-family op is whoami, which maps to linear auth whoami (identity only).
  • The relay's allowlist (the connector's ops map) denies any RPC whose op isn't on the list, so even if the shim were bypassed, the relay would refuse.

Three defenses, all in series. A box agent can't reach linear auth token through any of them.

issue delete / team delete / team create are similarly off-list (destructive; start conservative, widen deliberately).

The GraphQL mutation gate (refuseGraphqlNonQuery)

Linear's api subcommand is a raw GraphQL endpoint — one POST that serves both queries (read) and mutations (write). The api op is a read passthrough, so it carries refuseCall: refuseGraphqlNonQuery. The predicate:

  • Walks argv, consuming the value after value-bearing flags (--variable, --variables-json) so their payload isn't misread as a positional GraphQL source.
  • For every remaining positional, strips leading whitespace + # … line comments and refuses if the first keyword is mutation or subscription (exit 65, linear api: only GraphQL queries are proxied …).
  • query … and the anonymous { … } shorthand pass; empty/flag-only argv passes (the host CLI emits its own usage error).
  • --input / --input=… is refused — stdin/file bodies can't traverse the relay anyway.
  • --variable key=@<path> is refused. linear-cli's @<path> syntax reads from a host file and sends the contents as a GraphQL variable, which the box could echo back through the query response — an exfiltration channel. The guard rejects every split/glued/equals shape of the flag.

The match is case-insensitive (defensive — GraphQL is case-sensitive in spec, but the cost of guarding is zero). The parser is not a GraphQL validator; it's a write-shape detector. Writes go through the dedicated gated issue.* ops, never api.
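A write-shape detector along these lines could be sketched as below. It follows the bullets above but is only an approximation: it handles the split (`--variable k=@p`) and equals (`--variable=k=@p`) forms, not every shape the real guard covers, and the names and message text are hypothetical.

```typescript
interface GraphqlRefusal {
  exitCode: 65;
  message: string;
}

const VALUE_FLAGS: readonly string[] = ["--variable", "--variables-json"];

function refuseGraphql(reason: string): GraphqlRefusal {
  return { exitCode: 65, message: `linear api: only GraphQL queries are proxied; ${reason}` };
}

// key=@<path> asks the host CLI to read a host file into a variable.
function isHostFileVariable(value: string): boolean {
  const eq = value.indexOf("=");
  return eq >= 0 && value.slice(eq + 1).startsWith("@");
}

// Strip leading whitespace and "# ..." line comments, then read the first keyword.
function firstKeyword(source: string): string {
  let rest = source.replace(/^\s+/, "");
  while (rest.startsWith("#")) {
    const newline = rest.indexOf("\n");
    rest = (newline === -1 ? "" : rest.slice(newline + 1)).replace(/^\s+/, "");
  }
  const match = /^[A-Za-z_]+/.exec(rest);
  return match ? match[0].toLowerCase() : "";
}

function refuseGraphqlNonQuerySketch(args: readonly string[]): GraphqlRefusal | null {
  const queue = [...args];
  while (queue.length > 0) {
    const arg = queue.shift() as string;
    if (arg === "--input" || arg.startsWith("--input=")) return refuseGraphql("--input is refused");
    if (VALUE_FLAGS.indexOf(arg) !== -1) {
      // Consume the flag's value so it is never read as a GraphQL source.
      const value = queue.shift() ?? "";
      if (arg === "--variable" && isHostFileVariable(value)) return refuseGraphql("host-file variables are refused");
      continue;
    }
    if (arg.startsWith("--variable=")) {
      if (isHostFileVariable(arg.slice("--variable=".length))) return refuseGraphql("host-file variables are refused");
      continue;
    }
    if (arg.startsWith("-")) continue; // other flags are left to the host CLI
    const keyword = firstKeyword(arg);
    if (keyword === "mutation" || keyword === "subscription") {
      return refuseGraphql(`detected operation '${keyword}'`);
    }
  }
  return null;
}
```

Because the variable's value is consumed together with its flag, a harmless value such as `t=mutation` does not trip the keyword check, while a leading comment or odd casing in front of `mutation` still does.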

Enable flag

integrations.linear.enabled (typed config, default false) lives next to the Notion flag in packages/config/src/types.ts. Same layering, same disabled-default rationale.

agentbox config set --project integrations.linear.enabled true

env / credentials

Linear stores plaintext credentials at ~/.config/linear/credentials.toml (keyring is opt-in, not used), and linear reads that file on every host by default — so the connector declares no env block (neither connector does; mergeConnectorEnv would only allow LINEAR_* keys anyway). The agentbox.yaml carry: block ships the file into nested boxes that run their own relay.

Verification / live e2e results

LT2 ran the integration against the live waldosai workspace from inside a real AgentBox box. Captured evidence:

  • linear whoami (read) — round-trips through the in-box shim → host relay → host linear v2.0.0 → Linear API; returns Workspace: waldosai … User: Marco D'Alia … Role: admin with no approval prompt.
  • linear issue mine/list --team WAL and linear team list (reads) — same path; exit 0, no prompt. team list returns WAL Waldosai (the team key used for the writes).
  • linear api '{ viewer { id name email } }' (read) — the GraphQL passthrough returns the viewer JSON. refuseGraphqlNonQuery correctly classifies the { … } shorthand as a query (anonymous) and lets it through.
  • linear api 'mutation { … }' (refused) — exits 65 with linear api: only GraphQL queries are proxied (use issue.create / issue.update / issue.comment for writes); detected operation 'mutation'. Refused before any host process is spawned; reproduces through both the shim path and the direct agentbox-ctl integration linear api path (the gate lives in the connector, not the shim).
  • linear auth token (refused at the shim) — exits 2 with 'auth token' leaks the raw API key — refused. Use 'linear whoami' for identity.. The relay's op allowlist would also refuse it (no op maps to auth token); the shim is the first of three defenses.
  • Gated writes — three approve→succeed→ground-truth-read cycles.
    • linear issue create --team WAL --title "agentbox LT2 e2e …" -d "…" created WAL-5 (URL linear.app/waldosai/issue/WAL-5/…). linear issue view WAL-5 confirms title + description + Backlog state.
    • linear issue comment add WAL-5 -b "agentbox LT2 e2e comment via host relay (gated write)" added the comment. linear api '{ issue(id:"WAL-5") { … comments { nodes { body } } } }' confirms the comment body matches verbatim.
    • linear issue update WAL-5 -s "Canceled" moved the state. Post-update linear issue view WAL-5 shows **State:** Canceled.
  • No agent-side credential: printenv | grep -E '^LINEAR' inside the box returns nothing. The only token-shaped env var is AGENTBOX_RELAY_TOKEN. The carried ~/.config/linear/credentials.toml sits on disk for the nested-box case (see below) but is not consumed during the primary e2e; the host's own linear reads its own ~/.config/linear/ host-side.
  • No source changes needed. LT1's connector + shim + gate worked unchanged against the live host CLI, with no T4-style argv drift this round.

Nested-box e2e — deferred, not blocking

The nested-box scenario (a box-inside-a-box running a linear op through this box's relay) was time-boxed in LT2 and deferred for the same architectural reason as Notion. The in-box agentbox-ctl daemon forwards /rpc to the HOST relay (host.docker.internal:8787), not to a relay running in this box — so a nested box's linear issue create would still terminate at the host relay's spawn, not in this box's daemon. That means nested-box e2e exercises the carry mechanics (already verified — ~/.config/linear/credentials.toml present in this box) more than the connector's spawn path. Installing the real linear in this box would also break the primary e2e by shadowing: npm's global prefix here is /usr, so npm i -g @schpet/linear-cli lands the real binary at /usr/bin/linear, but the shim at /usr/local/bin/linear precedes it on $PATH and keeps winning resolution — the in-box agent would still hit the shim, and the daemon would need a separately-shaped PATH (or an absolute hostBin path) to reach the real binary, which is out of scope here. A future follow-up that lifts the relay into the box's daemon would change this; tracked under "Open follow-ups" below alongside the Notion entry.

Open follow-ups

  • Trello / ClickUp — see integrations_backlog.md. Each is a new descriptor + a small shim; no relay change. ClickUp will be the one custom REST connector (no good CLI on PyPI / npm).
  • comment.add — deferred; needs a Notion-API-aware payload translator that maps CLI flags to the structured POST /v1/comments body.
  • Least-privilege tokens — Notion capability toggles for the host token; Trello supports scope=read (when we add it); Linear personal keys inherit full user perms (OAuth-only for read-scope tokens). Document on each service's user-facing page.
  • Host-initiated tokens — the relay already accepts params.hostInitiated and validates it against HostInitiatedTokens (scope + params-hash bound). The host-CLI mint path that issues those tokens isn't wired yet for integrations; once it is, a host-typed agentbox-ctl integration notion page.create … can skip the prompt by minting a token first (same shape as the existing gh.pr.* and cp.* host-initiated paths).
  • Nested-box e2e — deferred for both Notion (T4) and Linear (LT2) for the same architectural reason (in-box agentbox-ctl forwards /rpc to the original host relay, so a nested-box's /rpc terminates at the host's spawn regardless). Lifting the relay into the box's daemon would change this — tracked here, not blocking either path. The carry-based file-auth mechanics are already verified (the carried files land at the expected per-service paths).