Supervised agentic workloads on Restate

June 15, 2026

A small demo of the supervisor pattern for an agentic tech-risk remediation workload: a fast executor agent drafts ticket + PR fixes for security findings, a heavy judge agent audits each plan before anything executes, and a policy gate suspends execution until an engineer signs off.

It gives concrete answers to common questions about running agentic workloads in an orchestrated runtime:

Question	Where in this demo
Workload identity	Automatic. `ctx.request().id` in `src/remediation.ts` — see below
Policy gates	The `approval` durable promise + `approve` handler in `src/remediation.ts`
Eval hooks	The judge + `export eval` steps in `src/remediation.ts`
Audit	The journal itself, the `status` handler, and the `export audit record` step
Guardrails	`requiresHumanSignoff()` and the judge's checks — it's just code
Rogue sub-agents / recalibration	The `ESCALATION_LADDER` loop in `src/remediation.ts` — see below

The external world (LLMs, GitHub, Jira, Slack, LangSmith, SIEM) is stubbed in src/stubs.ts; every stub is called through ctx.run(...), so swapping in real calls changes nothing about the orchestration.

Workload identity (automatic)

Every execution gets an invocation ID (e.g. inv_1bk06Ltr740Q5Torsxjik25zypll3mL12q). Through Restate's journal and introspection API, that one ID resolves to:

The journal: every step, decision, judge verdict, and approval — in order, with inputs/outputs (restate invocations describe <id>, or the UI on port 9070).
The immutable deployment ID: deployments in Restate are immutable — a new code revision is a new deployment, and each invocation is pinned to the deployment it runs on. So the invocation ID tells you the exact code the agent ran.
The prompt version: pinned per deployment as PROMPT_VERSION and journaled into the workload identity record.
Evals/traces/audit logs sent elsewhere: every export step includes the identity, so external systems (LangSmith, your SIEM) join back to the exact code + journal + prompts. Restate also emits OpenTelemetry traces keyed by invocation ID.
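A minimal sketch of how an identity record could ride along with every export. The invocation ID and prompt version correspond to `ctx.request().id` and `PROMPT_VERSION` in the demo; the record shape and the helper names (`WorkloadIdentity`, `buildIdentity`, `withIdentity`) are assumptions made for illustration, not the demo's actual code:

```typescript
// Hypothetical shape of the workload identity record; field names are assumptions.
interface WorkloadIdentity {
  invocationId: string;  // in the demo: the value of ctx.request().id
  promptVersion: string; // in the demo: PROMPT_VERSION, pinned per deployment
}

// Build the record once, at the start of the workflow.
function buildIdentity(invocationId: string, promptVersion: string): WorkloadIdentity {
  return { invocationId, promptVersion };
}

// Attach the identity to any payload shipped to an external system,
// so the receiving side can join back to the journal and prompts.
function withIdentity<T extends object>(
  payload: T,
  identity: WorkloadIdentity
): T & { identity: WorkloadIdentity } {
  return { ...payload, identity };
}

const identity = buildIdentity("inv_example", "v1");
const evalRecord = withIdentity({ verdict: "pass", findingId: "cve-005" }, identity);
console.log(JSON.stringify(evalRecord));
```

Keeping the identity in one small record means every export step can attach it the same way, whatever the payload looks like.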

Supervisor recalibration: rogue sub-agents are shut down, not retried

You have fine-grained control over error handling and retries. Two different failure modes get two different mechanisms:

Transient infra failure (LLM API flake, network blip): Restate retries the step with the same intent. That's plumbing, invisible to the agent logic.
Semantic failure (a sub-agent goes rogue: bad plan, runaway execution): this is a supervisor decision, expressed as plain control flow in the workflow. The supervisor deactivates that sub-agent and selects an alternate path; it never blindly re-runs the rogue one.
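The difference between the two failure modes can be illustrated without the SDK. The sketch below is a toy retry loop, not Restate's retry machinery: `runStep` is a hypothetical helper, and the local `TerminalError` class only stands in for the SDK's class of the same name.

```typescript
// Local stand-in for a terminal (non-retryable) error; not the SDK class.
class TerminalError extends Error {
  readonly terminal = true;
}

function isTerminal(err: unknown): boolean {
  return err instanceof Error && (err as Error & { terminal?: boolean }).terminal === true;
}

// Retry a step on ordinary errors, but give up immediately on a terminal one.
function runStep<T>(step: () => T, maxAttempts: number): T {
  let lastError: unknown = new Error("step never ran");
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return step();
    } catch (err) {
      if (isTerminal(err)) throw err; // semantic failure: no retry
      lastError = err;                // transient failure: retry with the same intent
    }
  }
  throw lastError;
}

// A flaky call that fails twice, then succeeds.
let calls = 0;
const flaky = () => {
  calls++;
  if (calls < 3) throw new Error("network blip");
  return "plan-draft";
};
const result = runStep(flaky, 5);

// A semantic failure surfaces on the first attempt.
let rogueCalls = 0;
let rogueMessage = "";
try {
  runStep(() => {
    rogueCalls++;
    throw new TerminalError("plan rejected");
  }, 5);
} catch (err) {
  rogueMessage = (err as Error).message;
}
console.log(result, calls, rogueCalls, rogueMessage);
```

The retry loop never sees the semantic failure as something to repeat; what happens next is left to the caller, which is where the supervisor's control flow takes over.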

How the demo implements it (ESCALATION_LADDER loop in src/remediation.ts):

Delegation = its own invocation. Each sub-agent call (ctx.serviceClient(subAgents).fastExecutor(...)) is a separate invocation with its own invocation ID and journal — the sub-agent is observable and governable independently of the supervisor.
Runaway behavior → shutdown. The call is bounded with .orTimeout(...); on timeout the supervisor calls ctx.cancel(subAgentInvocationId) — the sub-agent is shut down (its compensation logic can run), and the supervisor moves to the next rung.
Rogue output → recalibration. The judge audits every candidate plan. On rejection, the supervisor activates the next agent in the ladder (fast executor → conservative executor) instead of retrying the rogue one.
Crash-safe decisions. Every decision (which agent was activated, shut down, rejected, and why) is journaled and mirrored to state (attempts). If the supervisor itself crashes or is redeployed mid-recalibration, replay restores its decisions — a sub-agent that was shut down stays shut down; resume never re-executes it.
Exhaustion → escalation. If no agent produces an acceptable plan, the workflow terminates with the full attempt trail, for manual handling.
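The rejection path of the ladder can be sketched as a plain loop. This is a simplified, synchronous illustration under stated assumptions: the types, the stub agents, and the `supervise` function are hypothetical, and timeouts, `ctx.cancel`, and journaling are left out entirely.

```typescript
// Hypothetical types; the demo's real shapes live in src/remediation.ts.
interface Plan { agent: string; steps: string[] }
interface Attempt { agent: string; outcome: string }
interface Agent { name: string; draft: (repo: string) => Plan }
type Judge = (plan: Plan) => { ok: boolean; reason: string };

// Walk the ladder once: a rejected agent is recorded and never re-run.
function supervise(
  ladder: Agent[],
  judge: Judge,
  repo: string
): { plan: Plan | null; attempts: Attempt[] } {
  const attempts: Attempt[] = [];
  for (const agent of ladder) {
    const plan = agent.draft(repo);
    const verdict = judge(plan);
    if (verdict.ok) {
      attempts.push({ agent: agent.name, outcome: "accepted" });
      return { plan, attempts };
    }
    attempts.push({ agent: agent.name, outcome: "rejected: " + verdict.reason });
  }
  return { plan: null, attempts }; // exhaustion: hand the trail to a human
}

// Stub agents: the fast one proposes a destructive step for database repos.
const ESCALATION_LADDER: Agent[] = [
  {
    name: "fastExecutor",
    draft: (repo) => ({
      agent: "fastExecutor",
      steps: repo.indexOf("db") !== -1 ? ["DROP TABLE sessions"] : ["patch dependency"],
    }),
  },
  {
    name: "conservativeExecutor",
    draft: () => ({ agent: "conservativeExecutor", steps: ["parameterize query"] }),
  },
];

// Stub judge: rejects any plan containing a DROP statement.
const judge: Judge = (plan) =>
  plan.steps.some((s) => s.indexOf("DROP") === 0)
    ? { ok: false, reason: "Plan contains destructive operations." }
    : { ok: true, reason: "" };

const outcome = supervise(ESCALATION_LADDER, judge, "prod-db");
console.log(JSON.stringify(outcome.attempts));
```

Because the loop only ever moves forward through the ladder, "never retry the rogue agent" is a property of the control flow rather than a rule the agents have to obey.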

The recalibration trail is part of the audit surface:

curl localhost:8080/RemediationWorkflow/cve-003/status
# { "attempts": [
#     { "agent": "fastExecutor", "subAgentInvocationId": "inv_...", "outcome": "rejected: Plan contains destructive operations." },
#     { "agent": "conservativeExecutor", "subAgentInvocationId": "inv_...", "outcome": "accepted" } ], ... }

Try it: submit a finding whose repo touches a database (e.g. "repo": "prod-db") — the fast executor goes rogue with a destructive plan, the judge catches it, and the supervisor recalibrates to the conservative agent.

Policy gates

A durable step that either auto-clears or hands off to a human:

requiresHumanSignoff(finding, verdict) is plain, reviewable code.
If sign-off is needed, the workflow awaits ctx.promise("approval") and suspends — no process held hostage, survives restarts and redeploys, can wait days.
The approve shared handler resolves the gate; the payload (approver, comment) lands in the durable journal → you can always answer who signed off on which generated fix, and why.
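As an illustration of what "plain, reviewable code" can mean here, the sketch below shows one possible policy function. Only the function name and the `severity` field come from the demo; the rule itself, the `riskFlags` field, and the `medium` and `critical` severity values are assumptions.

```typescript
// Hypothetical shapes; only `severity` is taken from the demo's request payload.
type Severity = "low" | "medium" | "high" | "critical";
interface Finding { id: string; severity: Severity; repo: string }
interface Verdict { ok: boolean; riskFlags: string[] }

// Assumed policy: high or critical findings, or any plan the judge
// flagged as risky, need a human sign-off before execution.
function requiresHumanSignoff(finding: Finding, verdict: Verdict): boolean {
  const severe = finding.severity === "high" || finding.severity === "critical";
  return severe || verdict.riskFlags.length > 0;
}

const cleanVerdict: Verdict = { ok: true, riskFlags: [] };
const highFinding: Finding = { id: "cve-005", severity: "high", repo: "ci-infra" };
const lowFinding: Finding = { id: "cve-006", severity: "low", repo: "prod-db" };

console.log(requiresHumanSignoff(highFinding, cleanVerdict));
console.log(requiresHumanSignoff(lowFinding, cleanVerdict));
console.log(requiresHumanSignoff(lowFinding, { ok: true, riskFlags: ["touches-database"] }));
```

A pure function like this can be unit-tested and code-reviewed on its own, separately from the workflow that awaits the approval promise.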

Eval hooks

The judge agent runs as a durable step. Its verdict is persisted three ways:

Journal — the step result itself (free, automatic).
Workflow state — ctx.set("verdict", ...), queryable via the status handler or the introspection API while the workflow runs.
External eval system — the export eval step ships it to your analytics stack, coupled to the workload identity (code + journal + prompts).

A failing verdict throws TerminalError: the workload halts before any side effect executes.
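The ordering described above (persist the verdict, then halt before side effects) can be sketched with in-memory stand-ins. Everything here is hypothetical: `Sinks` and `recordAndGate` are illustrative names, the three arrays and objects only imitate the journal, workflow state, and external eval system, and a plain `Error` stands in for the SDK's `TerminalError`.

```typescript
interface Verdict { ok: boolean; reason: string }

// In-memory stand-ins for the three places the verdict lands.
interface Sinks {
  journal: Array<{ step: string; result: Verdict }>;           // the step journal
  state: Record<string, Verdict>;                              // workflow state, as with ctx.set
  exported: Array<{ verdict: Verdict; invocationId: string }>; // external eval system
}

// Persist the verdict everywhere first, then gate: a failing verdict throws
// before the caller reaches any side effect.
function recordAndGate(sinks: Sinks, verdict: Verdict, invocationId: string): void {
  sinks.journal.push({ step: "judge", result: verdict });
  sinks.state["verdict"] = verdict;
  sinks.exported.push({ verdict, invocationId });
  if (!verdict.ok) throw new Error("terminal: " + verdict.reason);
}

const sinks: Sinks = { journal: [], state: {}, exported: [] };
const sideEffects: string[] = [];
let halted = false;
try {
  recordAndGate(sinks, { ok: false, reason: "destructive plan" }, "inv_example");
  sideEffects.push("open PR"); // never reached for a failing verdict
} catch (err) {
  halted = true;
}
console.log(halted, sideEffects.length, sinks.exported.length);
```

Recording before gating means a rejected plan still leaves a full evaluation trail, even though nothing downstream ran.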

Run it

npm install
npx @restatedev/restate-server          # terminal 1: Restate server
npm run app                              # terminal 2: the service (port 9080)
restate deployments register http://localhost:9080   # terminal 3

Human sign-off example

Use the terminal or the Claude-generated UI to submit a request:

Via terminal:

Submit a high-severity finding (pauses at the policy gate):

curl localhost:8080/RemediationWorkflow/cve-005/run --json '{
  "id": "cve-005",
  "source": "guardduty",
  "severity": "high",
  "description": "exposed credentials in CI logs",
  "repo": "ci-infra"
}'

Audit the suspended workload (or open the UI at http://localhost:9070):

curl localhost:8080/RemediationWorkflow/cve-005/status
restate invocations list

Sign off. The approver metadata becomes part of the durable journal:

curl localhost:8080/RemediationWorkflow/cve-005/approve --json '{
  "approved": true, "approver": "you@corp.com", "comment": "Verified, LGTM"
}'

The workflow resumes, opens the ticket + PR, and exports the audit record.

Or use the UI:

If you'd rather click than curl, there's a small web UI to submit findings, watch the supervisor recalibrate, and sign off at the policy gate. It's a simple Claude-generated demo UI — nothing Restate-specific, just a thin front-end over the same ingress/admin endpoints you'd hit by hand.

With the Restate server and the service running and the deployment registered as above, start the UI in another terminal (there is no install or build step; it's a dependency-free Node server):

node ui/server.mjs        # serves http://localhost:4321

Then open http://localhost:4321. It proxies to Restate's ingress (:8080) and admin (:9070); override with RESTATE_INGRESS / RESTATE_ADMIN env vars if yours run elsewhere.

Rogue agent recalibration

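Submit this finding. Its repo is prod-db, so the fast executor proposes a destructive plan, the judge rejects it, and the supervisor recalibrates to the conservative executor: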

curl localhost:8080/RemediationWorkflow/cve-006/run --json '{
  "id": "cve-006",
  "source": "snyk",
  "severity": "low",
  "description": "SQL injection in session store",
  "repo": "prod-db"
}'

Tests

npm test    # requires Docker; runs against a real Restate server

Tests run with alwaysReplay: true, forcing journal replay of every step on every invocation — non-deterministic handler code fails the test instead of failing in production.
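To see why forced replay catches non-determinism, consider a toy model of a journal. This is not how Restate implements replay; `execute`, `replays`, and the handler shapes are hypothetical, and the "journal" is just a list of step names.

```typescript
type StepFn = (name: string) => void;
type Handler = (step: StepFn) => void;

// Run a handler. With no journal, record the step names it produces.
// With a journal (replay), every step must match the recorded one.
function execute(handler: Handler, journal: string[] | null): string[] {
  const seen: string[] = [];
  handler((name) => {
    if (journal !== null && journal[seen.length] !== name) {
      throw new Error("journal mismatch at step " + seen.length);
    }
    seen.push(name);
  });
  return seen;
}

// Deterministic: the same steps on every execution.
const stable: Handler = (step) => {
  step("draft");
  step("judge");
};

// Non-deterministic: the branch depends on mutable outside state.
let flip = false;
const unstable: Handler = (step) => {
  flip = !flip;
  step(flip ? "draft" : "skip");
};

// Execute once to build a journal, then replay against it.
function replays(handler: Handler): boolean {
  const journal = execute(handler, null);
  try {
    execute(handler, journal);
    return true;
  } catch (err) {
    return false;
  }
}

const stableOk = replays(stable);
const unstableOk = replays(unstable);
console.log(stableOk, unstableOk);
```

The unstable handler takes a different branch on its second execution, so its replay no longer lines up with what was recorded. Forcing a replay in tests surfaces that kind of divergence early.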

Further materials

Restate AI examples — durable agents, multi-agent patterns, human-in-the-loop, with Vercel AI SDK / OpenAI Agents SDK / Pydantic AI integrations
Managing invocations — cancel, kill, pause, resume; the primitives behind sub-agent governance
Error handling guide — retries vs terminal errors vs sagas/compensations
Sagas guide — undoing the work of a cancelled/rogue sub-agent
Invocations & introspection — the journal/identity model used throughout this demo