Multi-tenant onboarding

May 17, 2026

Echo's HTTP+JSON and MCP transports support per-bearer ACLs through the BearerRegistry config. Each tenant gets a dedicated bearer. A tenant-scoped bearer can read events only from the clusters in its ACL list; a wildcard bearer (one with no ACL list) can read everything.
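The scoping rule can be pictured as a single membership check. The sketch below models the behaviour described here and is not Echo's implementation; `can_read` and its arguments are hypothetical names.

```python
from typing import Sequence


def can_read(allowed_clusters: Sequence[str], cluster_id: str) -> bool:
    """Return True if a bearer with this ACL list may read from cluster_id.

    An empty list marks a wildcard bearer, which may read every cluster.
    """
    if not allowed_clusters:  # wildcard bearer: no ACL list
        return True
    return cluster_id in allowed_clusters  # tenant-scoped: exact match only


tenant_a = ["cluster-prod-us-east-1", "cluster-prod-us-west-2"]
print(can_read(tenant_a, "cluster-prod-us-east-1"))     # True
print(can_read(tenant_a, "cluster-prod-eu-central-1"))  # False
print(can_read([], "cluster-prod-eu-central-1"))        # True
```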

When to use multi-tenant mode

Single-tenant (default)             | Multi-tenant
------------------------------------|---------------------------------------------------
One bearer, full access             | N bearers, ACL-scoped
--auth-token-file <path>            | --bearer-acl-file <path>
Operator owns the bearer            | Operator mints + distributes bearers to tenants
All clusters visible to all callers | Per-tenant cluster_id scoping enforced at dispatch

ACL config schema

The file passed to --bearer-acl-file is a YAML or JSON document containing a list of entries, one per bearer. Schema:

- key_name: "platform-team"
  token: "<hex token>"
  allowed_clusters: []           # empty = wildcard (all clusters)
- key_name: "tenant-a"
  token: "<hex token>"
  allowed_clusters:
    - cluster-prod-us-east-1
    - cluster-prod-us-west-2
- key_name: "tenant-b"
  token: "<hex token>"
  allowed_clusters:
    - cluster-prod-eu-central-1

Fields:

  • key_name: operator-supplied label. Surfaces in audit logs as bearer_key_name. NEVER returned to clients via /api/v2/whoami (R3 ★3: key names convert opaque bearers into role labels that downstream compromise can pivot off).
  • token: the bearer value. Mint with openssl rand -hex 32 (256 bits of entropy).
  • allowed_clusters: explicit list of cluster_id values the bearer is permitted to query. Empty = wildcard. Tenant-scoped bearers attempting to query an out-of-list cluster receive HTTP 403 tenant_scoped_bearer_refused.
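Before handing a new ACL file to Echo, it can help to lint it offline. The sketch below assumes the JSON form of the file. Treating all three fields as required, requiring 64 lowercase hex characters per token, and rejecting duplicate tokens are assumptions of this sketch; Echo's own validation may be stricter or looser. `load_acl` is a hypothetical helper name.

```python
import json
import re

HEX_256 = re.compile(r"^[0-9a-f]{64}$")  # shape produced by `openssl rand -hex 32`


def load_acl(text: str) -> list:
    """Parse a JSON ACL document and reject entries that do not fit the schema."""
    entries = json.loads(text)
    if not isinstance(entries, list):
        raise ValueError("ACL document must be a list of entries")
    seen_tokens = set()
    for i, entry in enumerate(entries):
        if not isinstance(entry, dict):
            raise ValueError(f"entry {i}: must be a mapping")
        for field in ("key_name", "token", "allowed_clusters"):
            if field not in entry:
                raise ValueError(f"entry {i}: missing field {field!r}")
        # Error messages name the entry index only, never the token itself.
        if not isinstance(entry["token"], str) or not HEX_256.match(entry["token"]):
            raise ValueError(f"entry {i}: token is not 64 lowercase hex characters")
        if entry["token"] in seen_tokens:
            raise ValueError(f"entry {i}: duplicate token")
        if not isinstance(entry["allowed_clusters"], list):
            raise ValueError(f"entry {i}: allowed_clusters must be a list")
        seen_tokens.add(entry["token"])
    return entries
```

Running this in CI or a pre-rotation hook catches a malformed file before a SIGHUP makes it live.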

Minting a bearer

NEW_BEARER=$(openssl rand -hex 32)
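# Append a new list entry; quote the token so YAML never parses an all-digit value as a number.
echo "- key_name: \"tenant-c\""   >> /tmp/bearer-acl.yaml
echo "  token: \"$NEW_BEARER\""   >> /tmp/bearer-acl.yaml
echo "  allowed_clusters:"        >> /tmp/bearer-acl.yaml
echo "    - cluster-staging-1"    >> /tmp/bearer-acl.yaml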

Distribute $NEW_BEARER to the tenant via your secrets-management channel of choice (Vault, 1Password, encrypted email; not Slack DMs).
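If the ACL file is generated by a script rather than edited by hand, the same minting step can be done in Python. This is a minimal sketch assuming the JSON form of the file; `mint_entry` is a hypothetical helper, and `secrets.token_hex(32)` yields the same 256 bits of entropy as `openssl rand -hex 32`.

```python
import secrets


def mint_entry(key_name: str, allowed_clusters: list) -> dict:
    """Build one ACL entry with a fresh 256-bit bearer token."""
    return {
        "key_name": key_name,
        "token": secrets.token_hex(32),  # 32 random bytes -> 64 hex characters
        "allowed_clusters": list(allowed_clusters),
    }


entry = mint_entry("tenant-c", ["cluster-staging-1"])
print(len(entry["token"]))  # 64
```

Append the returned entry to the list loaded from the existing file and write the whole list back, instead of concatenating text fragments.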

Rotating an ACL config

Echo re-reads the ACL file on SIGHUP. The new ACL takes effect immediately; bearers that were valid pre-rotation but are missing from the new file enter the rotation-grace window before they hard-expire.

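# Swap in the new ACL file. mv is only atomic when both paths are on the same filesystem,
# so stage the new file next to the live one if /tmp is a separate mount.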
mv -f /tmp/bearer-acl.new.yaml /etc/ingero/bearer-acl.yaml
pkill -HUP -f ingero-echo
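The swap-then-signal step can also be scripted. The sketch below stages the new content in the same directory as the live file, so the final rename stays within one filesystem and is atomic. `replace_acl` and `reload_echo` are hypothetical helpers; how you find the Echo process ID is left to your deployment.

```python
import os
import signal
import tempfile


def replace_acl(path: str, new_text: str) -> None:
    """Write new_text to a temp file beside path, then rename it over path."""
    directory = os.path.dirname(os.path.abspath(path))
    # mkstemp creates the file with mode 0600, which suits a file full of tokens.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".bearer-acl.")
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(new_text)
            tmp.flush()
            os.fsync(tmp.fileno())  # make sure the bytes are on disk before the swap
        os.replace(tmp_path, path)  # atomic rename within one filesystem
    except BaseException:
        os.unlink(tmp_path)  # do not leave a half-written token file behind
        raise


def reload_echo(pid: int) -> None:
    """Ask the Echo process to re-read its ACL file."""
    os.kill(pid, signal.SIGHUP)
```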

For Helm-deployed Echo, mount the ACL file from a Secret. To rotate, update the Secret, then either restart the pod or signal the running process in place (kubectl exec, then pkill -HUP).

Bearer kind                  | Cadence
-----------------------------|----------
Tenant-scoped (per-customer) | 90 days
Platform wildcard            | 30 days
Compromise-triggered         | Immediate

The 30/90-day numbers are starting points. Adjust based on your secrets-management posture.

Revocation

Two paths:

  1. Soft revoke (recommended for routine cleanup): remove the bearer from the ACL file, then send SIGHUP. The bearer enters the rotation-grace window (default 5 min) before hard-expiring, so in-flight clients have time to switch.
  2. Hard revoke (compromise): remove the bearer from the ACL file and set the grace window to zero, either with rotation_grace: 0 in the rotation config or with --rotation-grace=0 at startup. The SIGHUP then drops the bearer immediately.
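The difference between the two paths comes down to the grace window length. The following is a small model of that behaviour, not Echo's code. The function names are hypothetical, bearers are represented by opaque hash strings, and the choice to reject exactly at the expiry instant is an assumption.

```python
def dropped_bearers(old_live: set, new_live: set, now: float, grace_seconds: float) -> dict:
    """Map each bearer hash removed by a reload to the time it hard-expires."""
    return {h: now + grace_seconds for h in old_live - new_live}


def is_accepted(bearer_hash: str, live: set, grace: dict, now: float) -> bool:
    """Accept live bearers, plus removed bearers still inside their grace window."""
    if bearer_hash in live:
        return True
    expires_at = grace.get(bearer_hash)
    return expires_at is not None and now < expires_at


old, new = {"hash-a", "hash-b"}, {"hash-a"}
t0 = 1_000.0

soft = dropped_bearers(old, new, t0, grace_seconds=300)  # default 5 min window
print(is_accepted("hash-b", new, soft, t0 + 299))  # True
print(is_accepted("hash-b", new, soft, t0 + 300))  # False

hard = dropped_bearers(old, new, t0, grace_seconds=0)    # compromise path
print(is_accepted("hash-b", new, hard, t0))        # False
```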

In both cases the audit log records the revocation event:

event=bearer_rotation_applied
live_hash=<sha256 of new live bearer>
grace_hashes=[<previous live, if any>]
grace_expires_at=...
evicted_bearers=<count of bearers dropped>

Bearer-hash recovery path

The audit log records bearer_hash (SHA-256 of the raw token) on every request, never the raw token. To trace a hash back to a bearer's key_name, the operator must hold the original ACL config (the hash is one-way).
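With the ACL config in hand, tracing a hash back to its key_name means re-hashing each token and comparing. The sketch below assumes Echo hashes the UTF-8 bytes of the token string and logs the digest as lowercase hex; check that against a real audit line before relying on it. Both helper names are hypothetical.

```python
import hashlib
from typing import Optional


def bearer_hash(token: str) -> str:
    """SHA-256 of the raw token, hex-encoded (the encoding is an assumption)."""
    return hashlib.sha256(token.encode("utf-8")).hexdigest()


def key_name_for(hash_hex: str, acl_entries: list) -> Optional[str]:
    """Return the key_name whose token hashes to hash_hex, or None if absent."""
    for entry in acl_entries:
        if bearer_hash(entry["token"]) == hash_hex.lower():
            return entry["key_name"]
    return None
```

A `None` result means the bearer is not in this copy of the config, which is the case the history file in the next paragraph is meant to cover.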

Recommended: on every ACL rotation, write the hash → key_name mapping to a separate audit-only log and retain it for forensic correlation:

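yq eval '.[] | [.key_name, .token] | @tsv' /etc/ingero/bearer-acl.yaml \
  | while IFS=$'\t' read -r name token; do
      printf '%s\t%s\n' "$name" "$(printf '%s' "$token" | sha256sum | cut -d' ' -f1)"
    done >> /var/log/ingero/bearer-acl-history.tsv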

Keep this file under the same access controls as the ACL itself; an attacker with read access can correlate bearer hashes in stolen audit logs to key_name identifiers.
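One way to keep the permissions tight from the first write is to create the history file with an explicit mode. This is a sketch under assumptions: `append_history` is a hypothetical helper, the hash is taken over the UTF-8 bytes of the token, and the mode applies only when the file is first created (an existing file keeps whatever permissions it already has).

```python
import hashlib
import os


def append_history(path: str, acl_entries: list) -> None:
    """Append one key_name<TAB>sha256 line per ACL entry, creating path as 0600."""
    lines = []
    for entry in acl_entries:
        digest = hashlib.sha256(entry["token"].encode("utf-8")).hexdigest()
        lines.append(f"{entry['key_name']}\t{digest}\n")
    # O_APPEND keeps earlier rotations; 0o600 limits a new file to its owner.
    fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o600)
    with os.fdopen(fd, "a") as history:
        history.writelines(lines)
```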

Tenant-side flows

Tenants first point their Grafana plugin or custom dashboard at Echo's /api/versions endpoint to confirm reachability, then send their bearer on every /api/v2/tools/<name> call. On cluster-scoped tools, the cluster_id argument MUST match an entry in the bearer's allowed_clusters list:

curl -fsS -X POST \
  -H "Authorization: Bearer $TENANT_BEARER" \
  -H "Content-Type: application/json" \
  -d '{"cluster_id":"cluster-prod-us-east-1","time_window":"1h"}' \
  https://echo.example.com/api/v2/tools/fleet.cluster.anomaly_list

Out-of-list cluster_id returns HTTP 403 tenant_scoped_bearer_refused.
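For a custom dashboard, the same call can be made from Python's standard library. The sketch below mirrors the curl example and turns a 403 into a `PermissionError`. The helper names are hypothetical, and treating the success body as JSON is an assumption about Echo's responses.

```python
import json
import urllib.error
import urllib.request


def build_request(base_url: str, tool: str, bearer: str, args: dict) -> urllib.request.Request:
    """Build the POST request for one /api/v2/tools/<name> call."""
    return urllib.request.Request(
        url=f"{base_url}/api/v2/tools/{tool}",
        data=json.dumps(args).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {bearer}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def call_tool(base_url: str, tool: str, bearer: str, args: dict):
    """Send the request; raise PermissionError when Echo answers 403."""
    request = build_request(base_url, tool, bearer, args)
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return json.load(response)
    except urllib.error.HTTPError as err:
        if err.code == 403:
            detail = err.read().decode("utf-8", "replace")
            raise PermissionError(f"bearer not permitted for this call: {detail}") from err
        raise
```

A call then looks like `call_tool("https://echo.example.com", "fleet.cluster.anomaly_list", bearer, {"cluster_id": "cluster-prod-us-east-1", "time_window": "1h"})`.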

The run_analysis tool (free-form SQL) is REFUSED for tenant-scoped bearers regardless of cluster_id. Tenants who need SQL access get a wildcard bearer (rare; only the platform team should hold these).
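Put together with cluster scoping, the tool-level rule gives a short decision order. This is a model of the documented behaviour only; `may_call` is a hypothetical name, and tools that take no cluster_id are out of scope for the sketch.

```python
SQL_TOOLS = frozenset({"run_analysis"})  # free-form SQL tools named in this guide


def may_call(allowed_clusters: list, tool: str, cluster_id: str) -> bool:
    """Decide whether a bearer may call a cluster-scoped tool."""
    if not allowed_clusters:  # wildcard bearer: everything is permitted
        return True
    if tool in SQL_TOOLS:     # refused even when cluster_id is in the list
        return False
    return cluster_id in allowed_clusters


tenant_b = ["cluster-prod-eu-central-1"]
print(may_call(tenant_b, "fleet.cluster.anomaly_list", "cluster-prod-eu-central-1"))  # True
print(may_call(tenant_b, "run_analysis", "cluster-prod-eu-central-1"))                # False
print(may_call([], "run_analysis", "cluster-prod-eu-central-1"))                      # True
```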