SOP: OpenAI Codex Integration

May 9, 2026

Use this playbook when Veritas Kanban delegates work to OpenAI Codex. v4.3 includes local Codex CLI execution through codex exec, SDK-backed local Codex sessions, GitHub-native Codex Cloud delegation, workflow-engine Codex steps, Codex review actions, richer Settings health checks, provider adapters, Codex event mapping, and mocked runner coverage.


Roles

| Role | Responsibilities |
| --- | --- |
| Human / PM | Defines task scope, confirms Codex mode, reviews outputs, approves final merge. |
| Veritas Orchestrator | Creates worktree, selects provider, starts attempt, tracks status/logs/telemetry. |
| Codex Worker | Implements, tests, reports final summary, and leaves useful run evidence. |
| Reviewer Agent | Performs cross-model review when Codex authored code or reviewed another agent. |

Codex Modes

| Mode | Use When | Provider Shape |
| --- | --- | --- |
| Codex CLI | Local task execution and deterministic automation | codex exec --json in task worktree |
| Codex SDK | Long-lived local threads and follow-up sessions | @openai/codex-sdk server adapter |
| Codex Cloud | Background PR-oriented work through GitHub | GitHub issue/PR comment delegation |
| Codex Review | Review task branches, PR diffs, or failed changes | CLI/SDK review action |
| Workflow Codex | Pipeline steps in Veritas workflow definitions | Provider-backed workflow step |

The default mode for v4.3 is Codex CLI. Use Codex SDK when a task needs a durable local thread ID for follow-up prompts or richer session continuity.


Lifecycle Overview

| Stage | Action | Required? |
| --- | --- | --- |
| 0. Configure | Add Codex agent profile and verify codex install/auth. | Yes |
| 1. Prepare | Create or verify task worktree; render task prompt. | Yes |
| 2. Start | Veritas starts provider attempt and marks task in-progress. | Yes |
| 3. Run | Codex executes with scoped prompt and emits progress/log events. | Yes |
| 4. Observe | Veritas maps JSONL/SDK events into attempt logs, activity, telemetry. | Yes |
| 5. Complete | Veritas records final summary, deliverables, usage, and task outcome. | Yes |
| 6. Review | Opposite-model review runs for code or high-risk changes. | For code |
| 7. Close | Human or automation approves, merges, archives, or creates follow-ups. | Yes |

Local Codex CLI Flow

Recommended provider command shape:

codex exec \
  --cd "<task-worktree>" \
  --sandbox workspace-write \
  --json \
  --output-last-message ".veritas-kanban/codex/<attempt-id>/final.md" \
  "<rendered task prompt>"

Recommended environment:

export VK_API_URL="http://localhost:3001"
export VK_API_KEY="<agent-role-key-if-auth-required>"
export CODEX_API_KEY="<optional-api-key-for-automation>"

Veritas Behavior

  1. Resolve the selected agent to a provider: codex-cli.
  2. Create an attempt with provider metadata:
    {
      "agent": "codex",
      "provider": "codex-cli",
      "model": "gpt-5.5",
      "sandbox": "workspace-write"
    }
    
  3. Run Codex in the task worktree.
  4. Parse JSONL events:
    • thread.started
    • turn.started
    • item.started
    • item.completed
    • turn.completed
    • turn.failed
    • error
  5. Append human-readable attempt logs.
  6. Preserve final response as the completion summary.
  7. Emit telemetry and token usage when available.
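
Steps 4 and 5 can be sketched as a small parser. This is a minimal illustration, not the Veritas implementation: it assumes each JSONL line is a JSON object with a string `type` field matching one of the event names listed above. The names `parseCodexJsonl`, `CodexEvent`, and `ParseResult`, and the `[codex] <type>` log format, are hypothetical.

```typescript
// Event names come from the list above; everything else is illustrative.
const KNOWN_EVENTS = [
  'thread.started',
  'turn.started',
  'item.started',
  'item.completed',
  'turn.completed',
  'turn.failed',
  'error',
] as const;

type CodexEventType = (typeof KNOWN_EVENTS)[number];

interface CodexEvent {
  type: CodexEventType;
  [key: string]: unknown;
}

interface ParseResult {
  events: CodexEvent[];
  logLines: string[]; // human-readable attempt log entries
  failed: boolean; // true if a turn.failed or error event was seen
  skipped: number; // lines that were not valid JSON or had an unknown type
}

// Assumption: an event is any JSON object whose `type` is a known event name.
function isKnownEvent(value: unknown): value is CodexEvent {
  if (typeof value !== 'object' || value === null) return false;
  const type = (value as { type?: unknown }).type;
  return typeof type === 'string' && (KNOWN_EVENTS as readonly string[]).indexOf(type) !== -1;
}

function parseCodexJsonl(jsonl: string): ParseResult {
  const result: ParseResult = { events: [], logLines: [], failed: false, skipped: 0 };
  for (const line of jsonl.split('\n')) {
    const trimmed = line.trim();
    if (trimmed === '') continue; // ignore blank lines
    let parsed: unknown;
    try {
      parsed = JSON.parse(trimmed);
    } catch (_err) {
      result.skipped += 1; // keep going; one bad line should not lose the run
      continue;
    }
    if (!isKnownEvent(parsed)) {
      result.skipped += 1;
      continue;
    }
    result.events.push(parsed);
    result.logLines.push(`[codex] ${parsed.type}`);
    if (parsed.type === 'turn.failed' || parsed.type === 'error') result.failed = true;
  }
  return result;
}
```

Counting skipped lines instead of throwing keeps a partially corrupted stream usable as run evidence, which matches the escalation guidance to preserve stderr/JSONL on failure.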

Codex SDK Flow

Use SDK mode when the user needs a durable local Codex thread across multiple prompts:

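import { Codex } from '@openai/codex-sdk';

// Pass Veritas connection settings to the Codex process through the client env.
const codex = new Codex({ env: { VK_API_URL: 'http://localhost:3001' } });

// Start a fresh thread scoped to the task worktree.
const thread = codex.startThread({
  workingDirectory: '<task-worktree>',
  sandboxMode: 'workspace-write',
  approvalPolicy: 'never', // unattended run: no interactive approval prompts
  networkAccessEnabled: true,
});

// Run one turn; the result feeds the attempt's completion summary.
const result = await thread.run('Implement the Veritas task in the current worktree.');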

Veritas persists the Codex thread ID in attempt metadata:

{
  "agent": "codex-sdk",
  "provider": "codex-sdk",
  "model": "gpt-5.5",
  "threadId": "thread_..."
}

SDK Session Rules

  • Use fresh threads for independent task attempts.
  • Reuse a thread only when the task explicitly needs follow-up work.
  • Store thread IDs in attempt metadata, not task prose.
  • Surface SDK availability errors clearly in Settings and attempt logs.
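
The first two rules amount to a small decision about whether to resume a thread. The sketch below is illustrative only: `AttemptMetadata` mirrors the metadata shape shown earlier, while `pickThreadId` and the `followUp` flag are hypothetical names, not part of Veritas or the SDK.

```typescript
interface AttemptMetadata {
  provider: string;
  threadId?: string;
}

// Returns the thread ID to resume, or undefined to start a fresh thread.
function pickThreadId(previous: AttemptMetadata[], followUp: boolean): string | undefined {
  if (!followUp) return undefined; // independent attempts always get a fresh thread
  // Walk backwards to find the most recent SDK attempt that recorded a thread ID.
  for (let i = previous.length - 1; i >= 0; i--) {
    const attempt = previous[i];
    if (attempt && attempt.provider === 'codex-sdk' && attempt.threadId) {
      return attempt.threadId;
    }
  }
  return undefined; // nothing to resume; fall back to a fresh thread
}
```

A caller would start a new thread when this returns `undefined` and otherwise hand the ID to the SDK's thread-resume call (check the installed SDK version for the exact method name).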

Codex Cloud Delegation

Use cloud delegation when the desired output is a GitHub issue/PR workflow rather than direct local worktree execution.

Veritas endpoint:

curl -X POST http://localhost:3001/api/github/codex/delegate \
  -H "Content-Type: application/json" \
  -d '{"taskId":"task_123","target":"issue"}'

Recommended prompt pattern:

@codex Please work on this Veritas Kanban task.

Task: <id> - <title>
Repository: <owner/repo>
Branch/base: <base>
Acceptance criteria:
- <criterion>
- <criterion>

Veritas context:
- Task URL: <local or GitHub-linked URL>
- Related files:
- Required checks:

Please open a PR and include a concise implementation summary, tests run, and any follow-up risks.
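
The prompt pattern above could be filled in from task data along these lines. This is a sketch under assumptions: the `CloudTask` fields and the `renderCloudPrompt` function are hypothetical and do not describe the actual Veritas task schema.

```typescript
// Hypothetical task shape; field names are assumptions for this example.
interface CloudTask {
  id: string;
  title: string;
  repository: string; // "<owner/repo>"
  baseBranch: string;
  acceptanceCriteria: string[];
  taskUrl: string;
  relatedFiles: string[];
  requiredChecks: string[];
}

function renderCloudPrompt(task: CloudTask): string {
  const bullets = (items: string[]): string[] => items.map((item) => `- ${item}`);
  return [
    '@codex Please work on this Veritas Kanban task.',
    '',
    `Task: ${task.id} - ${task.title}`,
    `Repository: ${task.repository}`,
    `Branch/base: ${task.baseBranch}`,
    'Acceptance criteria:',
    ...bullets(task.acceptanceCriteria),
    '',
    'Veritas context:',
    `- Task URL: ${task.taskUrl}`,
    `- Related files: ${task.relatedFiles.join(', ')}`,
    `- Required checks: ${task.requiredChecks.join(', ')}`,
    '',
    'Please open a PR and include a concise implementation summary, tests run, and any follow-up risks.',
  ].join('\n');
}
```

Keeping the renderer a pure function makes the exact comment text easy to snapshot in mocked runner tests before anything is posted to GitHub.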

Veritas links the GitHub artifact back to the task and tracks cloud delegation as a provider attempt, even though execution happens outside the local runtime.

Attempt metadata:

{
  "agent": "codex-cloud",
  "provider": "codex-cloud",
  "status": "pending",
  "cloudTarget": "issue",
  "cloudUrl": "https://github.com/owner/repo/issues/123"
}

MCP Setup For Codex

Register the Veritas MCP server with Codex so it can call Veritas tools instead of making ad hoc HTTP calls. Local API mode:

codex mcp add veritas-kanban \
  --env VK_API_URL=http://localhost:3001 \
  -- node /absolute/path/to/veritas-kanban/mcp/dist/index.js

Production or remote API mode:

codex mcp add veritas-kanban \
  --env VK_API_URL=https://kanban.example.com \
  --env VK_API_KEY="<agent-role-key>" \
  -- node /absolute/path/to/veritas-kanban/mcp/dist/index.js

Recommended companion (OpenAI developer documentation MCP server):

codex mcp add openaiDeveloperDocs --url https://developers.openai.com/mcp

AGENTS.md Codex Snippet

Add this to a repository where Codex will work with Veritas:

## Veritas Kanban Protocol

When working on Veritas Kanban tasks:

1. Treat Veritas Kanban as the source of truth for task state.
2. Before implementation, inspect the task, acceptance criteria, worktree, and related docs.
3. Move the task to `in-progress` and ensure an attempt is tracked.
4. Keep notes in task comments or progress files when findings affect future work.
5. Run relevant tests/checks before completion.
6. Report final summary, files changed, tests run, risks, and follow-ups.
7. For code changes, request cross-model review before final completion.
8. Use the Veritas MCP server when available instead of ad hoc HTTP calls.

For OpenAI product/API questions, use the OpenAI developer documentation MCP server first.

Telemetry Mapping

| Codex Signal | Veritas Destination |
| --- | --- |
| Thread started | Attempt metadata |
| Turn started/completed | Attempt status + run duration |
| Agent message | Attempt log |
| Command execution | Attempt log + activity event |
| File change | Attempt log + possible deliverable |
| MCP tool call | Attempt log + trace |
| Final response | Completion summary |
| Usage tokens | run.tokens telemetry |
| Error/failed turn/process | Failed attempt + failure alert |

If autoTelemetry is enabled, avoid double-emitting lifecycle events. Token usage should still be reported when Codex provides usage data.
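
The double-emit rule can be sketched as a filter applied before telemetry is sent. This is an illustration under assumptions: the `lifecycle`/`tokens` split, the `TelemetryEvent` shape, and the `run.started` event name are hypothetical; only `run.tokens` and `autoTelemetry` come from this document.

```typescript
// Which events count as "lifecycle" is an assumption made for this sketch.
type TelemetryKind = 'lifecycle' | 'tokens';

interface TelemetryEvent {
  kind: TelemetryKind;
  name: string; // e.g. "run.tokens"
}

function filterForEmit(events: TelemetryEvent[], autoTelemetry: boolean): TelemetryEvent[] {
  return events.filter((event) => {
    if (event.kind === 'tokens') return true; // usage is reported whenever Codex provides it
    return !autoTelemetry; // lifecycle events are skipped when autoTelemetry already emits them
  });
}
```

For example, with `autoTelemetry` enabled a batch containing one lifecycle event and one `run.tokens` event is reduced to the `run.tokens` event alone; with it disabled both pass through.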


Review Rules

| Author | Reviewer Recommendation |
| --- | --- |
| Codex | Claude, Gemini, or another non-Codex reviewer |
| Claude | Codex review or GPT-family reviewer |
| Human | Codex review for complex code or high-risk changes |
| Codex review | Human adjudicates blocking findings |

Follow SOP-cross-model-code-review.md for scoring, findings, and final gate handling.


Workflow Engine Rules

Codex workflow steps should:

  • run through the provider abstraction
  • receive rendered workflow context and progress notes
  • write real step outputs
  • respect configured concurrency limits
  • fail visibly with retryable error metadata
  • keep placeholder execution only for test/mock mode

Example step:

steps:
  - id: implement
    type: agent
    agent: codex
    input: |
      Implement {{ task.title }} in the task worktree.
      Acceptance criteria:
      {{ task.acceptanceCriteria }}
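
To show what "rendered workflow context" could mean for the `input` field above, here is a minimal placeholder renderer. It is a sketch only: the template engine Veritas actually uses is not specified in this document, and `renderInput`, `lookup`, and the bullet formatting for arrays are hypothetical choices.

```typescript
type Context = { [key: string]: unknown };

// Resolve a dotted path such as "task.title" against the context object.
function lookup(context: Context, path: string): unknown {
  let current: unknown = context;
  for (const key of path.split('.')) {
    if (typeof current !== 'object' || current === null) return undefined;
    current = (current as Context)[key];
  }
  return current;
}

// Replace {{ dotted.path }} placeholders; unresolved ones are left as-is
// so a missing value stays visible in the rendered prompt.
function renderInput(template: string, context: Context): string {
  return template.replace(/\{\{\s*([\w.]+)\s*\}\}/g, (match: string, path: string) => {
    const value = lookup(context, path);
    if (value === undefined || value === null) return match;
    if (Array.isArray(value)) return value.map((v) => `- ${String(v)}`).join('\n');
    return String(value);
  });
}
```

Leaving unresolved placeholders in the output, rather than substituting an empty string, is one way to make a step "fail visibly" when workflow context is incomplete.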

Escalation

| Scenario | Action |
| --- | --- |
| Codex auth unavailable | Mark attempt failed with setup guidance; do not retry blindly |
| Codex command exits non-zero | Preserve stderr/JSONL and create failure alert |
| Codex changes files outside worktree | Stop attempt and flag for human review |
| Codex reports ambiguous completion | Leave task in in-progress and request clarification |
| Review finds blocking issue | Create fix subtasks and keep original task blocked |
| Cloud delegation produces stale PR | Sync GitHub status and create local follow-up task |

Release QA

Before v4.3 ships:

  • Run one mocked CLI provider success case in CI.
  • Run one mocked CLI provider failure case in CI.
  • Run one real local Codex code task manually.
  • Run one Codex review manually.
  • Run one workflow-engine Codex step manually.
  • Verify Settings detects install/auth state.
  • Verify attempt logs, telemetry, and final summaries render correctly.