PLAN: vtcode Loop Engineering
June 26, 2026 · View on GitHub
Status: All phases complete (A, B, C, D, E, F)
Source: Addy Osmani, Loop Engineering — https://addyosmani.com/blog/loop-engineering/
Workspace: /Users/vinhnguyenxuan/Developer/learn-by-doing/vtcode
Related: .vtcode/plans/core-agent-loop-exploration.md (existing loop architecture exploration)
1. Why this plan exists
The article reframes the developer's relationship to coding agents:
"Loop engineering is replacing yourself as the person who prompts the agent. You design the system that does it instead." — Addy Osmani
Two practitioner quotes anchor the argument:
- Peter Steinberger: "You shouldn't be prompting coding agents anymore. You should be designing loops that prompt your agents."
- Boris Cherny (Anthropic, head of Claude Code): "I don't prompt Claude anymore. I have loops running that prompt Claude and figuring out what to do. My job is to write loops."
vtcode-core already ships a partial version of this: a single agent running inside a single harness. The article argues that the next layer up is the real design target. This plan converts the existing single-agent harness into a system that can be run by an external loop — and adds the primitives a loop will need.
2. The article's checklist mapped to vtcode
| # | Primitive | What it is | Current vtcode state | Gap |
|---|---|---|---|---|
| 1 | Automations | Scheduled triggers for discovery & triage | automation.full_auto config exists; full-auto mode exists in the harness | No actual scheduler consuming full_auto — scheduler/ exists (cron, durable tasks, session tasks) but is not wired to a loop tick |
| 2 | Worktrees | Isolation per parallel agent | Plain Git workspace only; no worktree abstraction | vtcode-core cannot run two loops concurrently without colliding on the working tree |
| 3 | Skills | Explicit project knowledge so the agent doesn't guess | AGENTS.md is read; skills are listed in the runtime; skills/ module is extensive (discovery, loader, manager, streaming, validation, native plugins) | Skills are not yet a first-class runtime input wired into the agent's context assembly as a structured slice |
| 4 | Plugins / connectors | Wiring the agent into existing tools | Tool registry exists; many tools ship in-tree; native_plugin in skills crate | Connectors to external systems (Linear, GitHub, Notion) are not in vtcode-core |
| 5 | Sub-agents | Separation of propose vs. verify | subagents/ module exists with background, config, discovery, executor, model, prompt, types — supports spawning child agents, background subprocesses, delegation | No explicit verifier-agent path for propose vs. verify; no spawn-isolate-return contract that guarantees the child's output is validated before merge |
| 6 | Memory | Markdown / Linear / external state that survives the run | persistent_memory.rs exists with MEMORY.md, memory_summary.md, notes/, rollout_summaries/ — full LLM-backed fact classification and consolidation | Loop-level state (current step, attempts, cost) is not persisted; "the agent forgets, the repo doesn't" — but the loop has no place to store its own run state |
3. Goals (in order of priority)
- Make the harness loop-runnable. A loop is a long-lived scheduler; vtcode must be safe to invoke repeatedly from a scheduler without state leaks between runs.
- Add worktree isolation so two parallel loops cannot corrupt each other.
- Promote skills to a first-class runtime input (loaded from disk, not just listed).
- Introduce a sub-agent contract for propose vs. verify, with a clear spawn -> run -> return-isolated-result API.
- Define the on-disk memory layout for loop state and what the agent must write before each run ends.
- Token-cost guardrails — the article's most concrete warning. The loop's failure mode is unbounded iteration cost, not model quality.
- Tool-agnostic design — the loop sits above the harness and must outlive any one agent product. Do not couple vtcode internals to a specific model.
4. Non-goals
- We are not building a competing Claude Code or Codex. vtcode-core is the harness; the loop is one layer above.
- We are not removing the single-agent interactive path. It stays as the human-in-the-loop mode.
- We are not adding a network scheduler (cron, Temporal, etc.) into vtcode-core. The loop is a consumer of vtcode; vtcode does not own the loop.
5. Architectural decisions
5.1 Layering
[ External Loop / Scheduler ] <-- cron, CI, or a future "vtcode-loop" crate
| <-- invokes vtcode as a subprocess / library
v
[ vtcode-core CLI / library ] <-- the harness (this repo)
|
v
[ Provider SDK ] <-- model client (OpenAI, Anthropic, etc.)
The loop is a consumer. Do not add a top-level harness subsystem in vtcode-core. Keep agent.harness, automation.full_auto, and context.dynamic as the configuration surfaces (per AGENTS.md).
5.2 Event contract
vtcode-exec-events::ThreadEvent is the authoritative runtime event contract. The loop consumes this stream; do not invent a parallel event type for the loop layer.
5.3 Memory layout (proposed)
.vtcode/
state/
loop-<id>.json # per-loop run state (current step, attempts, cost)
notes.md # agent-written, human-readable, append-only
decisions.md # decisions the agent made that should survive
plans/ # existing — loop inputs
worktrees/ # isolated worktrees per parallel loop
notes.mdis the "memory" primitive from the article — written by the agent, read by the agent on the next iteration.loop-<id>.jsonis the durable run state the loop scheduler reads on resume.- The existing
persistent_memory.rsmodule already handles MEMORY.md, memory_summary.md, preferences.md, repository-facts.md, notes/, and rollout_summaries/ under~/.vtcode/projects/<project>/memory/. The loop state layer sits alongside (not replaces) this.
5.4 Sub-agent contract
spawn(ChildAgentSpec) -> ChildHandle
ChildHandle.run(input) -> ChildResult
ChildResult { ok, output, artifacts: Vec<PathBuf>, cost: TokenUsage }
The child runs in a worktree, has a scoped skill set, and returns artifacts by path — never by in-process mutation of the parent. This is the propose/verify separation from the article.
5.5 Skills as runtime input
Today skills are listed in list_skills and load_skill. Promote them to a runtime input:
load_skill(name)returns a typedSkillobject.- The agent's context assembly step takes a
&[Skill]slice. AGENTS.mdat the workspace root is auto-loaded as the implicit "workspace skill."- No skill ever mutates global state on load.
6. Implementation steps
Each step is independently shippable behind a feature flag; do not bundle.
Phase A — Make the harness loop-safe (3-5 days)
| # | Step | Files (expected) | Verify |
|---|---|---|---|
| A1 | Audit vtcode-core for state that persists across CLI invocations (caches, temp dirs, global singletons) and make it per-invocation | vtcode-core/src/lib.rs, any OnceLock/LazyLock usages, config/, session/ | cargo check --locked; unit test that runs two sequential agent loops in one process and asserts no cross-contamination |
| A2 | Add a LoopRunState struct that captures (step index, cumulative cost, last artifact path, status) and serializes to loop-<id>.json | New vtcode-core/src/loop_state.rs | Round-trip serde test; integration test that writes state, kills process, reloads, asserts continuity |
| A3 | Gate the existing SessionScheduler (scheduler/mod.rs) behind a loop-aware config surface so a loop can drive tick events without conflicting with the durable task scheduler | vtcode-core/src/scheduler/mod.rs, config types | Unit test that a loop tick and a cron tick do not collide |
Status: Complete
- A1 audit: 36 global statics found in vtcode-core. All safe in CLI mode (process exits between runs). MEDIUM risk items (FILE_CACHE, COMMAND_CACHE, TRANSCRIPT, rate limiters, circuit breaker) only matter for library/REPL usage where multiple sessions share a process. No code changes needed — the harness is per-invocation by design.
- A2:
LoopRunStatewith JSON persistence — done. - A3:
LoopEngineConfigadded toAutomationConfigwithenabled,max_iterations,reconcile_on_complete,preload_skillsfields. Gated vialoop_engine_enabled()withVTCODE_DISABLE_LOOP_ENGINEenv override.
Phase B — Worktree isolation (2-3 days)
| # | Step | Files (expected) | Verify |
|---|---|---|---|
| B1 | Add a WorktreeManager module exposing create(), list(), remove() wrapping git worktree | New vtcode-core/src/git/worktree.rs (+ new git/mod.rs module) | Unit test for create/list/remove against a temp git repo |
| B2 | Add a WorktreeReconciler that runs a verifier sub-agent on the worktree diff before merge-back | vtcode-core/src/git/worktree.rs (extended) | Integration test: create worktree, make change, reconcile, assert merge |
| B3 | Wire worktree creation into SubagentController::spawn_with_spec() when isolation == "worktree" (currently returns an error at line 1471 of subagents/mod.rs) | vtcode-core/src/subagents/mod.rs | Existing test suite passes; new test for worktree-spawned subagent |
Status: Complete (B1 done with WorktreeManager; B3 done — isolation == "worktree" now creates a worktree under .vtcode/worktrees/; B2 done — WorktreeReconciler with reconcile() method using synchronous verify closure in spawn_blocking, integrated into SubagentController::launch_child)
Phase C — Skills as first-class input (1-2 days)
| # | Step | Files (expected) | Verify |
|---|---|---|---|
| C1 | Add Skills to the harness context builder (alongside AGENTS.md) | vtcode-core/src/skills/loader.rs, vtcode-core/src/skills/manager.rs | Integration test that loads a named skill and asserts it appears in the system prompt |
| C2 | Resolve skills lazily by name from the tool registry; fail loud on missing skills | vtcode-core/src/skills/executor.rs | Unit test for missing-skill error path |
Status: Complete (C1 done — SkillsManager::loop_skills() loads skills by name from LoopEngineConfig.preload_skills; C2 done — SkillsManager::resolve_skill_by_name() returns Err with available-skill list on miss)
Phase D — Sub-agent verifier (2-3 days)
| # | Step | Files (expected) | Verify |
|---|---|---|---|
| D1 | For mutating tool calls (write_file, edit_file, shell commands that touch files), run a second agent pass that re-reads the diff and either approves or rejects | vtcode-core/src/subagents/ (new verifier module) | Unit test with a known-bad edit; assert rejection + retry |
| D2 | Verifier runs in a fresh context (no proposer bias); on rejection, the loop retries up to N times | Config: automation.full_auto.verify_mutations (default off) | Integration test: proposer makes bad edit, verifier rejects, loop retries, second attempt passes |
| D3 | Gate the verifier behind automation.full_auto.verify_mutations to control cost doubling | Config types | Config test: default off, explicitly enabled |
Status: Complete (D1/D2 — verify_proposed_change() on SubagentController spawns a read-only verifier via spawn_custom; D3 — verify_mutations: bool field added to FullAutoConfig, default false)
Phase E — Loop memory on disk (1-2 days)
| # | Step | Files (expected) | Verify |
|---|---|---|---|
| E1 | Define LoopMemoryStore trait with read_notes(), write_note(), read_decisions(), write_decision() | New vtcode-core/src/loop_memory.rs | Round-trip test (write -> reload -> assert equal) |
| E2 | Default implementation: markdown files in .vtcode/state/ | vtcode-core/src/loop_memory.rs | Filesystem integration test |
| E3 | Feature-flagged sqlite implementation for faster queries | vtcode-core/src/loop_memory.rs (behind sqlite feature) | Same round-trip test against sqlite |
Status: Complete (E1/E2 done — MarkdownLoopMemory with notes.md and decisions.md under .vtcode/state/; E3 done — SqliteLoopMemory behind sqlite feature flag with 5 round-trip tests)
Phase F — Token-cost guardrails (1 day)
| # | Step | Files (expected) | Verify |
|---|---|---|---|
| F1 | Add a CostBudget struct that tracks cumulative token spend per loop run and rejects further agent calls when the budget is exceeded | vtcode-core/src/loop_state.rs (extended) | Unit test: set budget, exhaust it, assert next call is rejected |
| F2 | Expose cost budget in LoopRunState so the loop scheduler can read it on resume | vtcode-core/src/loop_state.rs | Serialization test |
Status: Complete (F1/F2 done — CostBudget with BudgetStatus enum tracks token/cost/step limits, integrated into LoopRunState)
7. Verification plan
cargo check --lockedclean.cargo test --lockedgreen, with new tests for scheduler, worktree, memory store, cost budget.- A short docs note (
docs/loop-engineering.md) explaining the five-primitive mapping above so future contributors can see why each module exists. — Done. - No new dependencies; everything reachable from existing workspace crates.
Verification results (2026-06-26)
cargo check --locked— clean (no errors, no new warnings)cargo test -p vtcode-core --lib -- loop_state— 14 tests passcargo test -p vtcode-core --lib -- loop_memory— 6 tests pass (default features); 11 tests with--features sqlite(5 sqlite tests are feature-gated)cargo clippy -p vtcode-core— clean (pre-existing warnings invtcode-commonsonly)- New modules:
vtcode-core/src/loop_state.rs,vtcode-core/src/loop_memory.rs,vtcode-core/src/git/worktree.rs,vtcode-core/src/git/mod.rs - Modified:
vtcode-core/src/subagents/mod.rs(worktree isolation + verifier),vtcode-core/src/subagents/types.rs(worktree_path field),vtcode-config/src/core/automation.rs(verify_mutations),vtcode-core/src/lib.rs(module declarations)
8. Risks
- Token cost: Osmani flags this as the primary failure mode. The verifier sub-agent doubles cost on mutating calls. Mitigation: gate the verifier behind
automation.full_auto.verify_mutations(default off). - Worktree churn: Rapid loops may create many worktrees. Mitigate with an LRU cap and explicit GC tick.
- Skills drift: Skills change between loop iterations. Mitigate by hashing skills into the context and re-resolving only on change.
9. Open decisions
| Decision | Default | Alternative | Rationale |
|---|---|---|---|
| Worktree backend | git worktree (simple, requires git) | Filesystem snapshot (no git, but heavier) | Default: git worktree, with a feature flag for snapshots |
| Memory store default | Markdown (current) | SQLite (faster queries) | Default: markdown, sqlite behind a feature flag |
| Scheduler driver | tokio interval | External cron | Default: tokio interval, since the harness already runs on tokio |
| Verifier model | Same as proposer | Smaller/cheaper model | Default: same model; feature flag for cheaper verifier |
10. Upstream Rig follow-up (wrapper removal checklist)
Status: Blocked on upstream Rig PRs Tracking: https://github.com/vinhnx/VTCode/issues/678, PR #679
VTCode maintains several wrappers and fallbacks that exist solely because rig-core 0.39 lacks certain features. When the upstream Rig PRs below land and VTCode updates to a compatible Rig version, the corresponding wrappers should be removed.
Upstream Rig PRs
- https://github.com/0xPlaygrounds/rig/pull/1855 —
Output::Unknowncarriesserde_json::Value(preserves hosted tool output items). Status: open, needs rebase. - https://github.com/0xPlaygrounds/rig/pull/1830 —
prompt_cache_keyandprompt_cache_retentionpass-through viaadditional_params. Status: open.
When rig#1855 lands (Output::Unknown carries Value)
-
vtcode-core/src/llm/providers/openai/responses_adapter.rs: Removeadapt_rig_gap_output_item_envelopeandProviderValueBearingRigGapfallback (lines 572-605) -
vtcode-llm/src/providers/shared/responses_stream.rs: Evaluate ifDocumentedValueBearingRigGapevents can now flow through Rig's typed API instead of being silently dropped
When rig#1830 lands (prompt_cache_key/retention pass-through)
-
vtcode-core/src/llm/providers/openai/responses_adapter.rs: RemovePromptCacheOverlaystruct andapply_prompt_cache_overlayfunction (lines 50-56, 361-383) -
vtcode-llm/src/providers/openai/request_builder.rs: Remove duplicateapply_prompt_cache_overlayfunction (lines 214-233) and its call site (lines 775-781) - Update prompt-cache JSON-boundary tests to assert Rig-owned serialisation
When Rig exposes ChatGPT session refresh hooks
-
vtcode-llm/src/providers/openai/backend_setup.rs: RemoveChatGptSubscriptionAuthSource::CodexAppServerCompatibility(line 49) - Remove
from_rig_chatgpt_authconversion (lines 109-123) if Rig handles session refresh directly
When Rig exposes transport capability parity
-
vtcode-llm/src/providers/openai/backend_setup.rs: RemoveOpenAIBackendTransportCapabilitiesgating (lines 60-68)