Real-World Agent Workflow Examples
March 26, 2026
Steal these end-to-end flows when building your own automations. Each example shows the goal, prompts, API/CLI calls, and outputs we expect.
1. Feature Development Sprint (BrainMeld PRD excerpt)
Goal: Build "Lessons Learned" field.
- Create task
  `vk create "Feature: Lessons Learned field" --project veritas-kanban --type feature --priority medium`
- Prompt (worker)
  Implement markdown lessonsLearned field on tasks (UI + API). Include migration + docs. Cross-model review required.
- Workflow
  - `vk begin <id>`
  - Implement server -> shared -> web changes
  - Update docs + tests
  - `vk done <id> "Added lessons learned field"`
- Outputs
  - Task summary with PR link
  - Lessons Learned comment describing future usage
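The workflow steps above could be wrapped in a small script. This is a minimal sketch, assuming `vk` is on your PATH and accepts `begin` and `done` as shown above; `run`, `run_sprint_task`, and `DRY_RUN` are hypothetical names introduced here, not part of the CLI.

```shell
# Hypothetical wrapper around the vk begin/done steps shown above.
# DRY_RUN defaults to 1, so commands are printed rather than executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    # Print the command instead of running it.
    printf '%s\n' "+ $*"
  else
    "$@"
  fi
}

run_sprint_task() {
  task_id="$1"
  summary="$2"
  run vk begin "$task_id"
  # ... implement server -> shared -> web changes, update docs + tests ...
  run vk done "$task_id" "$summary"
}
```

Calling `run_sprint_task US-42 "Added lessons learned field"` in dry-run mode prints the two `vk` commands it would issue; set `DRY_RUN=0` to execute them. The task ID here is a placeholder.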
2. Bug Fix (Archive bulk action)
Goal: Fix the sprint archive button, which currently fails.
- Create bug task referencing GitHub Issue #86.
- Subtasks:
  - Reproduce in dev
  - Inspect network requests
  - Patch bulk archive handler
  - Add regression test (Playwright)
- CLI flow: `vk begin`, fix, `vk done "Bulk archive now calls API"`
- Cross-model review ensures UI + API parity.
3. Documentation Update
Goal: Add sanity checks to Getting Started.
- Task description includes sections to cover (API, UI, agent pickup).
- Agent edits `docs/GETTING-STARTED.md` + `docs/TROUBLESHOOTING.md` references.
- Completion summary links to diff + screenshot placeholders.
4. Security Audit (RF-002 style)
Goal: Run cross-model audit on repo.
- Task -> `type=security`, `project=veritas-kanban`.
- Subtasks: scope, run Codex audit, run Claude review, compile findings, create issues.
- Agents spawn using research prompt template, save results to `refactoring/rf-002/*`.
- Deliverables: Markdown report, HTML deck, GitHub issues.
5. Content Production (Podcast clip → LinkedIn post)
- Task `type=content` with acceptance criteria (summary, caption, schedule time).
- Agent fetches transcript, writes summary, drafts LinkedIn copy, saves assets to `projects/start-small-think-big/...`.
- Completion summary includes copy + asset path; lessons learned capture platform insights.
6. Research & Report (Champions)
- Task `type=research`, project `social`, sprint `CHAMP-02`.
- Prompt includes dossier template, required sources, HTML deck requirement.
- Agent workflow: gather sources, write Markdown, generate HTML via script, `brain-write.sh` to mirror.
- Final comment: TL;DR + links to both artifacts.
Pattern to Copy
For any workflow:
- Task with crystal-clear done definition.
- Prompt stored in registry.
- API/CLI calls scripted (vk begin/done, time tracking, status updates).
- Artifacts saved to predictable paths and mirrored to Brain/engram if needed.
- Cross-model review if code/critical.
- Lessons learned field updated for systemic knowledge.
Use these recipes as seeds for your own automation playbooks.
7. Workflow Engine Pipeline
Goal: Automate plan → implement → test → review with retry policies.
- Create `.veritas-kanban/workflows/feature-dev.yml` with planner, developer, and tester agents.
- Start via API: `POST /api/workflows/feature-dev/runs`
- Monitor live in the Workflows tab: each step shows status, duration, and output preview.
- Gate steps block until quality checks pass or a human approves.
See WORKFLOW-GUIDE.md for full YAML examples.
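Starting a run from a script might look like the sketch below. It assumes the API listens on `http://localhost:3001` (the host used by the curl examples elsewhere in this document) and sends no request body, since none is documented here; `VK_API`, `workflow_runs_url`, and `start_workflow_run` are hypothetical names.

```shell
# Assumption: same host/port as the other curl examples in this document.
VK_API="${VK_API:-http://localhost:3001}"

# Build the runs endpoint for a workflow ID such as "feature-dev".
workflow_runs_url() {
  printf '%s/api/workflows/%s/runs\n' "$VK_API" "$1"
}

# Start a run. No body is sent because the expected payload is not
# described in this document; check WORKFLOW-GUIDE.md before relying on it.
start_workflow_run() {
  curl -s -X POST "$(workflow_runs_url "$1")"
}
```

`start_workflow_run feature-dev` would then issue the POST shown in the list above.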
8. Using Task Dependencies
Goal: Ensure backend API is complete before frontend work starts.
- Create `US-100 "Build REST API"` and `US-101 "Build React UI"`.
- Set dependency: `US-101` `depends_on` `US-100`.
- The dependency badge on `US-101` shows it's blocked until `US-100` is done.
- Query the full graph: `GET /api/tasks/US-101/dependencies`
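To see what a dependency graph implies for scheduling, you can order tasks locally with the POSIX `tsort` utility. This is only an illustration of dependency ordering, not how Veritas Kanban resolves the graph; `work_order` is a hypothetical helper.

```shell
# Each argument is a "prerequisite dependent" pair:
# "US-100 US-101" means US-100 must finish before US-101 starts.
# tsort prints the task IDs in an order that respects every pair.
work_order() {
  printf '%s\n' "$@" | tsort
}
```

`work_order "US-100 US-101"` lists `US-100` before `US-101`, matching the blocked badge described above.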
9. Crash-Recovery Checkpointing
Goal: Resume long-running agent work after a crash.
# Save checkpoint mid-work
curl -X POST http://localhost:3001/api/tasks/US-42/checkpoint \
-H "Content-Type: application/json" \
-d '{"state":{"step":3,"completed":["auth","db"],"notes":"Working on API layer"}}'
# After restart, resume from checkpoint
CHECKPOINT=$(curl -s http://localhost:3001/api/tasks/US-42/checkpoint)
# Feed $CHECKPOINT into agent prompt for continuity
# Clean up after completion
curl -X DELETE http://localhost:3001/api/tasks/US-42/checkpoint
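One way to feed the checkpoint back into an agent is to wrap it in a resume prompt. A minimal sketch follows; `build_resume_prompt` and the prompt wording are hypothetical, not a format the API requires.

```shell
# Hypothetical helper: wraps a saved checkpoint in a resume prompt.
build_resume_prompt() {
  task_id="$1"
  checkpoint_json="$2"
  printf 'Resume task %s from this checkpoint. Skip work already listed as completed.\nCheckpoint: %s\n' \
    "$task_id" "$checkpoint_json"
}

# Example wiring (needs a running server, so it is left commented out):
# CHECKPOINT=$(curl -s http://localhost:3001/api/tasks/US-42/checkpoint)
# build_resume_prompt US-42 "$CHECKPOINT"
```

Keeping the checkpoint JSON verbatim in the prompt avoids lossy summarizing between the crash and the restart.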
10. Observational Memory for Cross-Agent Learning
Goal: Capture architectural decisions so future agents don't repeat exploration.
# Log a decision
curl -X POST http://localhost:3001/api/observations \
-H "Content-Type: application/json" \
-d '{"taskId":"US-42","type":"decision","content":"Chose WebSocket over SSE for real-time updates — lower latency, bidirectional","importance":9}'
# Future agent searches before making the same decision
curl "http://localhost:3001/api/observations/search?query=websocket+vs+sse"
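If agents build the search URL from free text, a small helper keeps the query format consistent. This sketch assumes the same base URL as above and only encodes spaces; `VK_API` and `observation_search_url` are hypothetical names.

```shell
VK_API="${VK_API:-http://localhost:3001}"

# Turn a plain-text question into the search URL used above.
# Only spaces are encoded (as "+"); other special characters would need
# full URL encoding, which this sketch does not attempt.
observation_search_url() {
  query=$(printf '%s' "$1" | tr ' ' '+')
  printf '%s/api/observations/search?query=%s\n' "$VK_API" "$query"
}
```

An agent could then run `curl "$(observation_search_url "websocket vs sse")"` before starting its own exploration.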
11. Agent Policy Evaluation (v4.0)
Goal: Restrict an agent from deleting production tasks.
# Create a deny-first policy
curl -X POST http://localhost:3001/api/policies \
-H "Content-Type: application/json" \
-d '{
"name": "no-delete-production",
"description": "Prevent agents from deleting tasks in production projects",
"scope": {"project": "production"},
"rules": [{"tool": "task.delete", "action": "deny", "reason": "Production tasks require human approval for deletion"}],
"precedence": "deny-first"
}'
# Test before deploying
curl -X POST http://localhost:3001/api/policies/POLICY_ID/evaluate \
-H "Content-Type: application/json" \
-d '{"agent": "codex-1", "tool": "task.delete", "context": {"project": "production"}}'
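The `deny-first` precedence can be pictured with a tiny local evaluator. This is a sketch under stated assumptions, not the server's implementation: it assumes any matching deny rule overrides matching allow rules, and that a tool with no matching rule is allowed. The `tool:action` rule format and `evaluate_deny_first` are hypothetical.

```shell
# Illustrative deny-first resolution over "tool:action" rule strings.
evaluate_deny_first() {
  tool="$1"
  shift
  verdict="allow"   # assumed default when no rule matches
  for rule in "$@"; do
    rule_tool="${rule%%:*}"
    rule_action="${rule##*:}"
    # A single matching deny wins, regardless of rule order.
    if [ "$rule_tool" = "$tool" ] && [ "$rule_action" = "deny" ]; then
      verdict="deny"
    fi
  done
  printf '%s\n' "$verdict"
}
```

With rules `task.delete:allow` and `task.delete:deny`, evaluating `task.delete` yields `deny`; use the `/evaluate` endpoint above to confirm what the real policy engine decides.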
12. Behavioral Drift Monitoring (v4.0)
Goal: Detect when an agent's task completion rate drops.
# Configure a drift monitor
curl -X POST http://localhost:3001/api/drift \
-H "Content-Type: application/json" \
-d '{
"agent": "TARS",
"metric": "completion_rate",
"baseline": 0.85,
"warningThreshold": 0.70,
"alertThreshold": 0.50
}'
# Check drift status across all agents
curl -s http://localhost:3001/api/drift | jq '.data[] | {agent, metric, status}'
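The thresholds in the monitor config suggest a three-level classification. The sketch below assumes a rate below `alertThreshold` is an alert, a rate below `warningThreshold` is a warning, and anything else is healthy; the status names and `drift_status` are hypothetical and may differ from what the API returns.

```shell
# Classify a measured rate against the configured thresholds.
# Usage: drift_status <rate> <warningThreshold> <alertThreshold>
drift_status() {
  awk -v rate="$1" -v warn="$2" -v alert="$3" 'BEGIN {
    # Adding 0 forces numeric comparison.
    if (rate + 0 < alert + 0)      print "alert"
    else if (rate + 0 < warn + 0)  print "warning"
    else                           print "ok"
  }'
}
```

With the thresholds above, `drift_status 0.65 0.70 0.50` reports `warning`, since 0.65 sits between the two thresholds.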
13. Decision Audit Trail (v4.0)
Goal: Log a significant architectural decision with assumptions for future reference.
# Record a decision
curl -X POST http://localhost:3001/api/decisions \
-H "Content-Type: application/json" \
-d '{
"taskId": "US-200",
"agent": "VERITAS",
"decision": "Use file-based storage instead of SQLite for v4.0",
"confidence": 0.8,
"evidence": ["Current scale is <1000 tasks", "File ops are simpler to debug", "No migration path needed"],
"assumptions": ["Scale stays under 10k tasks", "Single-instance deployment"]
}'
# Later: record what happened
curl -X POST http://localhost:3001/api/decisions/DECISION_ID/outcome \
-H "Content-Type: application/json" \
-d '{"outcome": "File storage held up well through v4.0 launch. Assumption about scale still valid at ~350 tasks."}'