VT Code Async Performance Audit
June 14, 2026
Note (2026-06): The cache, middleware, patterns, executor, and optimizer modules referenced below have been merged into vtcode-core::tools as part of the crate consolidation. File paths in this document reflect their pre-consolidation locations.
Date: 2026-03-04
Scope: Runtime-critical paths first (vtcode-core, vtcode-bash-runner)
Audit Rubric
Each module was reviewed for:
- Blocking calls on async/runtime threads (std::thread::sleep, blocking I/O in async paths)
- Awaiting external work while holding locks
- Cancellation and timeout propagation behavior
- Scheduling fairness hazards (select!/join! use and long critical sections)
- Sync primitives in async-facing hot paths
Findings (Prioritized)
Critical
- Awaiting observer hooks while cache write locks are held
  - File: vtcode-core/src/tools/registry/cache.rs
  - Impact: lock contention amplification, potential stall chains under load
  - Status: fixed in this batch
High
- Async-facing notification manager used std sync locks in hot path
  - File: vtcode-core/src/notifications/mod.rs
  - Impact: unnecessary poisoning/recovery branches and slower lock path
  - Status: improved in this batch (parking_lot::{RwLock, Mutex})
Medium (Deferred)
- Graceful process-group termination uses polling sleeps in synchronous loop
  - File: vtcode-bash-runner/src/process_group.rs
  - Note: currently called from synchronous PTY cleanup paths and spawn_blocking paths, so runtime risk is lower than the above critical/high items
  - Deferred action: evaluate async-aware termination path only where call sites are async-sensitive
- Deprecated synchronous retry middleware uses blocking sleep
  - File: vtcode-core/src/tools/middleware.rs
  - Note: type is marked deprecated in favor of AsyncRetryMiddleware
  - Deferred action: keep behavior stable; avoid churn unless deprecated path is removed or reactivated in runtime-critical paths
Implemented Batch (Runtime-Critical)
1) Cache lock/await safety remediation
Updated vtcode-core/src/tools/registry/cache.rs:
- insert_arc:
  - removed await while entries/access_order locks are held
  - moved observer eviction callback and stats update after lock release
- remove:
  - release locks before awaiting observer.on_evict
- prune_expired:
  - collect/remove expired keys under lock
  - release locks
  - then run observer callbacks
Outcome: no external async callback is awaited while cache write locks are held.
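The collect-under-lock, await-after-release shape described above can be sketched as follows. This is a minimal std-only illustration, not the code in cache.rs: the Cache and RecordingObserver types, their fields, and the tiny block_on helper are all assumptions made so the example is self-contained.

```rust
use std::collections::{HashMap, VecDeque};
use std::future::Future;
use std::pin::pin;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical observer; the real hook's signature may differ.
struct RecordingObserver {
    evicted: Mutex<Vec<String>>,
}

impl RecordingObserver {
    /// Stand-in for an async callback that may take arbitrarily long.
    async fn on_evict(&self, key: &str) {
        self.evicted.lock().unwrap().push(key.to_string());
    }
}

/// Hypothetical cache layout: a map plus an LRU queue, each behind its own lock.
struct Cache {
    capacity: usize,
    entries: Mutex<HashMap<String, Arc<String>>>,
    access_order: Mutex<VecDeque<String>>,
    observer: RecordingObserver,
}

impl Cache {
    fn new(capacity: usize) -> Self {
        Self {
            capacity,
            entries: Mutex::new(HashMap::new()),
            access_order: Mutex::new(VecDeque::new()),
            observer: RecordingObserver { evicted: Mutex::new(Vec::new()) },
        }
    }

    async fn insert_arc(&self, key: String, value: Arc<String>) {
        // Phase 1: mutate under the locks and only collect what was evicted.
        let evicted: Vec<String> = {
            let mut entries = self.entries.lock().unwrap();
            let mut order = self.access_order.lock().unwrap();
            order.retain(|k| k != &key);
            order.push_back(key.clone());
            entries.insert(key, value);
            let mut evicted = Vec::new();
            while entries.len() > self.capacity {
                match order.pop_front() {
                    Some(oldest) => {
                        entries.remove(&oldest);
                        evicted.push(oldest);
                    }
                    None => break,
                }
            }
            evicted
        }; // both guards are dropped at the end of this block

        // Phase 2: await the observer with no lock held.
        for key in &evicted {
            self.observer.on_evict(key).await;
        }
    }
}

/// Tiny busy-polling executor, only so the sketch runs without a runtime.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
        std::thread::yield_now();
    }
}
```

Scoping the guards to an inner block, rather than calling drop explicitly, makes it hard to reintroduce an await inside the critical section by accident.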
2) Notification lock path optimization
Updated vtcode-core/src/notifications/mod.rs:
- switched lock types from std::sync::{RwLock, Mutex} to parking_lot::{RwLock, Mutex}
- removed poisoning recovery branches (not applicable to parking_lot)
- preserved public behavior and API surface
Outcome: lower-overhead lock path and simpler critical sections in notification flow.
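For context, the kind of poisoning-recovery branch that std locks require looks roughly like the sketch below. NotificationConfig, read_config, and poison are hypothetical names used only for illustration. With parking_lot, lock() returns the guard directly, so the Err arm has no counterpart.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical config type, for illustration only.
#[derive(Clone, Debug, PartialEq)]
struct NotificationConfig {
    enabled: bool,
}

/// std::sync::Mutex::lock returns a Result because a panic while the guard
/// is held marks the lock as poisoned; callers must decide how to recover.
fn read_config(config: &Mutex<NotificationConfig>) -> NotificationConfig {
    match config.lock() {
        Ok(guard) => guard.clone(),
        // Recovery branch: take the guard anyway and read the last value.
        Err(poisoned) => poisoned.into_inner().clone(),
    }
}

/// Poisons the mutex by panicking in another thread while holding the guard.
fn poison(config: &Arc<Mutex<NotificationConfig>>) {
    let shared = Arc::clone(config);
    let _ = thread::spawn(move || {
        let _guard = shared.lock().unwrap();
        panic!("simulated panic while holding the lock");
    })
    .join();
}
```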
3) KISS/DRY follow-up pass
Updated vtcode-core/src/tools/registry/cache.rs:
- simplified insert_arc lock scope using a single inner block (removed explicit drop(...))
- emit manual-eviction observer events only when an entry was actually removed
- added early-return in prune_expired for empty expired set
Updated vtcode-core/src/notifications/mod.rs:
- async config methods now delegate to sync methods (update_config -> update_config_sync, get_config -> get_config_sync) to remove duplicated lock logic
4) Async-safe process termination helper + cache lock scope tightening
Updated vtcode-bash-runner/src/process_group.rs and vtcode-bash-runner/src/lib.rs:
- added graceful_kill_process_group_default_async(pid) that runs graceful kill in spawn_blocking
- exported the async helper from crate root
- added async unit test for nonexistent PID behavior
Updated src/agent/runloop/unified/tool_pipeline/execution_runtime.rs:
- moved JSON serialization before write-lock acquisition in success-cache path
- keeps lock hold time minimal and avoids extra work inside critical section
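The serialize-then-lock ordering can be illustrated with a small sketch. ToolResult, serialize, and cache_success are hypothetical, and format! stands in for the real JSON serialization so the example needs only the standard library.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

/// Hypothetical result type, for illustration only.
struct ToolResult {
    tool: String,
    exit_code: i32,
}

/// Stand-in for JSON serialization (the real code would use a JSON library).
fn serialize(result: &ToolResult) -> String {
    format!("{{\"tool\":\"{}\",\"exit_code\":{}}}", result.tool, result.exit_code)
}

fn cache_success(cache: &RwLock<HashMap<String, String>>, key: &str, result: &ToolResult) {
    // Do the comparatively expensive work first, with no lock held.
    let payload = serialize(result);
    // The write lock now covers only the map insertion.
    cache.write().unwrap().insert(key.to_string(), payload);
}
```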
5) Async runtime safety for PTY session close path
Updated vtcode-core/src/tools/registry/executors.rs:
- execute_close_pty_session now executes PtyManager::close_session in tokio::task::spawn_blocking
- prevents synchronous PTY shutdown/wait logic from running on async runtime worker threads
- keeps error propagation and response shape unchanged
6) Final KISS/DRY + hot-path cleanup pass
Updated vtcode-core/src/tools/registry/cache.rs:
- insert_arc now removes prior key occurrence from LRU order before re-inserting
  - avoids duplicate key entries in access queue and keeps eviction order tight
- prune_expired now:
  - uses entries.retain(...) to collect and remove expired entries in one pass
  - prunes access-order with a single retain(...) using a set of expired keys
  - reduces repeated retain scans and simplifies flow
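A minimal sketch of the single-pass pruning and de-duplicated LRU order described above. LruState, Entry, and their fields are assumptions, not the actual cache.rs types, and locking is left out to keep the focus on the two retain calls.

```rust
use std::collections::{HashMap, HashSet, VecDeque};
use std::time::{Duration, Instant};

/// Hypothetical cache entry with an absolute expiry time.
struct Entry {
    value: String,
    expires_at: Instant,
}

/// Hypothetical cache state: the map plus an LRU access queue.
struct LruState {
    entries: HashMap<String, Entry>,
    access_order: VecDeque<String>,
}

impl LruState {
    fn insert(&mut self, key: &str, value: &str, ttl: Duration, now: Instant) {
        // Drop any earlier occurrence so each key appears once in the queue.
        self.access_order.retain(|k| k.as_str() != key);
        self.access_order.push_back(key.to_string());
        self.entries.insert(
            key.to_string(),
            Entry { value: value.to_string(), expires_at: now + ttl },
        );
    }

    /// Removes expired entries and returns their keys (sorted for stable output).
    fn prune_expired(&mut self, now: Instant) -> Vec<String> {
        let mut expired = HashSet::new();
        // One pass over the map: remove expired entries and record their keys.
        self.entries.retain(|key, entry| {
            let keep = entry.expires_at > now;
            if !keep {
                expired.insert(key.clone());
            }
            keep
        });
        // Early return: nothing expired, so the queue needs no scan.
        if expired.is_empty() {
            return Vec::new();
        }
        // One pass over the queue, using the set for O(1) membership checks.
        self.access_order.retain(|k| !expired.contains(k));
        let mut keys: Vec<String> = expired.into_iter().collect();
        keys.sort();
        keys
    }
}
```

In an async cache the returned keys would be handed to the observer only after the locks are released.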
Updated vtcode-bash-runner/src/process_group.rs:
- removed duplicate cfg-specific graceful_kill_process_group_default wrappers
- kept one unified default wrapper calling cfg-specific graceful_kill_process_group
7) Async-safe PTY bulk termination in runloop timeout/guard paths
Updated vtcode-core/src/tools/registry/pty.rs and vtcode-core/src/tools/registry/pty_facade.rs:
- added PtySessionManager::terminate_all_async() using tokio::task::spawn_blocking
- added ToolRegistry::terminate_all_pty_sessions_async() facade method
- preserved existing synchronous methods for compatibility
Updated async call sites:
- src/agent/runloop/unified/turn/session_loop_runner.rs
- src/agent/runloop/unified/turn/tool_outcomes/handlers.rs
Changes:
- replaced direct terminate_all_pty_sessions() calls in async paths with awaited async-safe variant
- added warning logs when blocking-pool join/termination fails
- ensured UI status cleanup still executes in the same flow
8) Cancellation-safety cleanup for UI redraw auto-flush task
Updated src/agent/runloop/unified/turn/utils.rs:
- added Drop for UIRedrawBatcher that aborts auto_flush_task when batcher is dropped
- prevents leaked background auto-flush task from outliving session/UI lifetime
- keeps implementation minimal (no behavior changes to redraw batching while active)
9) Runloop task lifecycle tightening (signal + progress updaters)
Updated src/agent/runloop/unified/session_setup/signal.rs:
- replaced raw JoinHandle<()> return with RAII SignalHandlerGuard
- SignalHandlerGuard aborts the background signal-listener task on drop
- keeps existing cancel-token behavior and call-site usage unchanged
Updated src/agent/runloop/unified/progress.rs:
- elapsed-time updater now exits once ProgressState::is_complete() is true
- avoids unnecessary periodic wakeups after completion even before guard drop/abort
10) UI redraw state correctness fix (KISS)
Updated src/agent/runloop/unified/turn/utils.rs:
- fixed UIRedrawBatcher::force_redraw() to actually reset batching state:
  - set pending_redraws to 0 (best-effort try_lock)
  - update last_redraw_time to Instant::now() (best-effort try_lock)
- avoids stale pending state after forced redraws
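The best-effort reset might look like the sketch below. RedrawBatcher is a hypothetical stand-in for UIRedrawBatcher, and std::sync::Mutex is assumed here only to keep the example self-contained; the real type may use a different lock.

```rust
use std::sync::Mutex;
use std::time::Instant;

/// Hypothetical stand-in for the redraw batcher's shared state.
struct RedrawBatcher {
    pending_redraws: Mutex<u32>,
    last_redraw_time: Mutex<Instant>,
}

impl RedrawBatcher {
    fn request_redraw(&self) {
        *self.pending_redraws.lock().unwrap() += 1;
    }

    fn force_redraw(&self) {
        // ... the actual draw call would go here ...

        // Best-effort reset: if another holder has the lock, skip rather than
        // block; the next regular flush will reconcile the state.
        if let Ok(mut pending) = self.pending_redraws.try_lock() {
            *pending = 0;
        }
        if let Ok(mut last) = self.last_redraw_time.try_lock() {
            *last = Instant::now();
        }
    }
}
```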
11) Background task ownership for file palette indexing
Updated:
- src/agent/runloop/unified/session_setup/types.rs
- src/agent/runloop/unified/session_setup/ui.rs
- src/agent/runloop/unified/turn/session_loop_runner.rs
Changes:
- added BackgroundTaskGuard (abort-on-drop) for session-scoped background tasks
- wrapped file-palette indexing tokio::spawn in BackgroundTaskGuard
- stored guard in SessionUISetup and retained it through session loop lifetime
- prevents indexing task from outliving session teardown
12) Duplicate MCP initialization spawn guard
Updated src/agent/runloop/unified/async_mcp_manager.rs:
- start_initialization() now returns early when an existing init task is still running
- avoids spawning duplicate background init tasks and detaching older handles
- added focused unit test: test_start_initialization_skips_when_task_already_running
13) Detached approval-pattern writes: explicit bounded policy + DRY
Updated src/agent/runloop/unified/tool_routing.rs:
- introduced spawn_approval_record_task(...) helper for approval-pattern writes
- centralized timeout bound (APPROVAL_RECORD_TIMEOUT = 500ms)
- kept these tasks intentionally detached because they are non-critical side effects
- added debug logs for timeout/write errors so detached failures are observable
14) Cancellation hardening for PTY stream runtime drop-path
Updated src/agent/runloop/unified/tool_pipeline/pty_stream.rs:
- added Drop for PtyStreamRuntime that:
  - marks stream inactive
  - drops sender
  - aborts background render task if still present
- ensures no background PTY stream task leaks if the execution future is cancelled before explicit shutdown().await
- added focused unit test: pty_stream_runtime_drop_aborts_background_task
15) Cancellation-safe progress callback restoration in tool execution runtime
Updated src/agent/runloop/unified/tool_pipeline/execution_runtime.rs:
- added RAII ProgressCallbackGuard for temporary PTY progress callback overrides
- guarantees callback restoration on normal return and on future cancellation/drop
- removed manual post-await restoration path in favor of drop-based restoration
- added focused unit test progress_callback_guard_restores_previous_on_drop
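Drop-based restoration can be sketched as below. The ProgressCallback and CallbackSlot aliases, the install constructor, and the report helper are hypothetical; the real guard's fields and callback signature may differ.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical callback and slot types, for illustration only.
type ProgressCallback = Arc<dyn Fn(&str) -> String + Send + Sync>;
type CallbackSlot = Arc<Mutex<Option<ProgressCallback>>>;

/// Installs a temporary callback and restores the previous one on drop.
struct ProgressCallbackGuard {
    slot: CallbackSlot,
    previous: Option<ProgressCallback>,
}

impl ProgressCallbackGuard {
    fn install(slot: &CallbackSlot, temporary: ProgressCallback) -> Self {
        // Swap in the override and remember what was there before.
        let previous = slot.lock().unwrap().replace(temporary);
        Self { slot: Arc::clone(slot), previous }
    }
}

impl Drop for ProgressCallbackGuard {
    fn drop(&mut self) {
        // Runs on normal return and when the owning future is cancelled,
        // so no manual post-await restoration is needed.
        if let Ok(mut current) = self.slot.lock() {
            *current = self.previous.take();
        }
    }
}

/// Invokes whichever callback is currently installed, if any.
fn report(slot: &CallbackSlot, message: &str) -> Option<String> {
    let callback = slot.lock().unwrap().clone();
    callback.map(|cb| cb(message))
}
```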
16) Async state-machine bloat patterns (Tweede Golf, May 2026)
Reference: https://tweedegolf.nl/en/blog/237/async-rust-never-left-the-mvp-state
Upstream Project Goal: https://rust-lang.github.io/rust-project-goals/2026/async-statemachine-optimisation.html
The article identifies four sources of bloat in the futures that rustc generates today, all rooted in the MIR coroutine_resume lowering:
- The Returned state always panics on re-poll (overhead even when callers are well-behaved).
- Async blocks with no .await still receive a 3-state machine and discriminant switch.
- Pure-delegation futures (async fn bar() { foo(blah).await }) are not inlined; bar gets its own state machine that wraps foo's.
- match arms that each end in .await produce one duplicated suspend state per arm even when the saved type is identical.
The article's measured wins (2-5% binary size on embedded; ~3% perf on x86 with smol) are from compiler-side hacks. Source-level workarounds are explicitly characterized as ugly noise the compiler should obviate. We therefore adopt the following policy rather than mass rewrites:
Policy
- Track the upstream Project Goal in this audit; revisit on each rustup toolchain bump.
- For NEW code in vtcode-core runtime hot paths:
  - Do not write async fn for a body that contains no .await unless required by a trait signature; use a plain fn instead.
  - For pure single-step delegations (async fn x(a) { y(a).await }) on free or inherent functions, prefer fn x(a) -> impl Future<Output = T> + use<'_, ...> so the wrapper state machine is elided. Do not apply this to #[async_trait] impls, ACP/Codex protocol handlers, or any caller that spawns the future on a multi-thread runtime where Send inference would regress.
  - When a match chooses between calls of the same async fn that differ only in arguments (article's "Collapsing states" example), hoist the differing argument into a let and .await once after the match.
- Do not rewrite existing code purely for these patterns. The compiler fix is the right intervention; opportunistic rewrites are acceptable when a file is already being edited for another reason.
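The three source-level patterns from the policy can be sketched as follows. All function names here (fetch, parse_id, fetch_default, fetch_for) are hypothetical, and the block_on helper exists only so the sketch runs without a runtime. The delegation example takes no borrowed arguments, so it needs no use<...> bound.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical leaf operation that genuinely awaits something.
async fn fetch(id: u32) -> u32 {
    std::future::ready(id * 2).await
}

/// No `.await` in the body, so this is a plain fn rather than an async fn.
fn parse_id(raw: &str) -> Option<u32> {
    raw.trim().parse().ok()
}

/// Pure delegation: return the inner future directly instead of wrapping it
/// as `async fn fetch_default() -> u32 { fetch(21).await }`.
fn fetch_default() -> impl Future<Output = u32> {
    fetch(21)
}

enum Mode {
    Small,
    Large,
}

/// Collapsing states: the match only picks the argument, and the single
/// `.await` after it yields one suspend point instead of one per arm.
async fn fetch_for(mode: Mode) -> u32 {
    let id = match mode {
        Mode::Small => 1,
        Mode::Large => 100,
    };
    fetch(id).await
}

/// Tiny busy-polling executor, only so the sketch runs without a runtime.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
        std::thread::yield_now();
    }
}
```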
Scan summary (snapshot)
A scoped scan (async fn whose body contains no .await, excluding #[test]/#[tokio::test]/async_trait macros) found 214 candidates. The vast majority are trait method implementations whose async keyword is mandated by the trait signature and cannot be removed. Genuine pure-delegation candidates (single inner .await, free or inherent fn, not a trait impl) cluster in:
- vtcode-core/src/tools/registry/builder.rs: five ToolRegistry::new* constructors all delegate to Self::build_with_policy(...).await.
- vtcode-core/src/tools/registry/pty_facade.rs: terminate_all_pty_sessions_async, terminate_all_exec_sessions_async.
- vtcode-core/src/tools/registry/harness_facade.rs: harness_exec_session_completed, terminate_harness_exec_session.
- vtcode-core/src/tools/registry/file_helpers.rs: read_file, write_file, create_file, delete_file thin wrappers around execute_tool.
- vtcode-core/src/tools/registry/execution_facade.rs: execute_public_tool_request, execute_tool.
- vtcode-core/src/tools/cache.rs: put_file, put_directory arc wrappers.
- vtcode-core/src/llm/providers/openai/provider/provider_impl.rs: stream, stream_normalized, generate delegations.
- vtcode-core/src/project_doc.rs: get_user_instructions, build_instruction_appendix.
- vtcode-core/src/tools/file_ops/path_policy.rs: normalize_user_path.
- vtcode-core/src/tools/file_ops/write.rs: write_file wrapper around write_file_internal.
Pattern 4 ("Collapsing states") yielded one match in vtcode-core/src/mcp/cli.rs:164, but each arm calls a different function (run_list, run_get, run_add, …) so the saved suspend types differ and there is nothing to collapse. No source change is warranted.
Status
- Documented and applied opportunistically in touched runtime files (per policy above), focused on pure-delegation wrappers in vtcode-core.
- Re-scan after the next rust-toolchain.toml bump to see how many candidates the upstream coroutine_resume work has obsoleted.
- If the upstream Project Goal lands an unwind = abort/panic = abort switch that drops the Panicked state, evaluate whether release-profile-strict should opt in for binary-size sensitive embeds (tracked here, not actioned).
Validation
Executed:
- cargo test -p vtcode-core notifications::tests:: -- --nocapture
- cargo check -p vtcode-core
- RUSTC_WRAPPER= cargo test -p vtcode-bash-runner graceful_kill -- --nocapture
- RUSTC_WRAPPER= cargo test -p vtcode-bash-runner graceful_kill -- --nocapture (re-run after process-group cleanup)
- RUSTC_WRAPPER= cargo check -p vtcode-bash-runner
- RUSTC_WRAPPER= cargo check -p vtcode
- RUSTC_WRAPPER= cargo check -p vtcode-core (after async PTY bulk termination migration)
- RUSTC_WRAPPER= cargo check -p vtcode (after async PTY bulk termination migration)
- RUSTC_WRAPPER= cargo check -p vtcode (after redraw batcher cancellation fix)
- RUSTC_WRAPPER= cargo check -p vtcode (after signal/progress lifecycle tightening)
- RUSTC_WRAPPER= cargo test -p vtcode-core pty_test -- --nocapture
- RUSTC_WRAPPER= cargo test -p vtcode-core pty_tests -- --nocapture
- RUSTC_WRAPPER= cargo test -p vtcode --bin vtcode turn::utils -- --nocapture
- RUSTC_WRAPPER= cargo test -p vtcode --bin vtcode progress::tests -- --nocapture
- RUSTC_WRAPPER= cargo test -p vtcode --bin vtcode turn::utils -- --nocapture (re-run after force_redraw fix)
- RUSTC_WRAPPER= cargo test -p vtcode --bin vtcode session_setup -- --nocapture
- RUSTC_WRAPPER= cargo test -p vtcode --bin vtcode async_mcp_manager::tests -- --nocapture
- RUSTC_WRAPPER= cargo test -p vtcode --bin vtcode async_mcp_manager::tests -- --nocapture (re-run after duplicate-init guard test)
- RUSTC_WRAPPER= cargo test -p vtcode --bin vtcode tool_routing -- --nocapture
- RUSTC_WRAPPER= cargo test -p vtcode --bin vtcode tool_pipeline -- --nocapture
- RUSTC_WRAPPER= cargo test -p vtcode --bin vtcode tool_pipeline::execution_runtime -- --nocapture
- RUSTC_WRAPPER= cargo test -p vtcode --bin vtcode tool_pipeline::pty_stream -- --nocapture
- rustfmt --check vtcode-core/src/tools/registry/pty.rs vtcode-core/src/tools/registry/pty_facade.rs src/agent/runloop/unified/turn/session_loop_runner.rs src/agent/runloop/unified/turn/tool_outcomes/handlers.rs
- rustfmt --check src/agent/runloop/unified/turn/utils.rs
- rustfmt --check src/agent/runloop/unified/session_setup/signal.rs src/agent/runloop/unified/progress.rs
- rustfmt --check src/agent/runloop/unified/session_setup/types.rs src/agent/runloop/unified/session_setup/ui.rs src/agent/runloop/unified/async_mcp_manager.rs
- rustfmt --check src/agent/runloop/unified/tool_routing.rs
- rustfmt --check src/agent/runloop/unified/tool_pipeline/pty_stream.rs
- rustfmt --check src/agent/runloop/unified/tool_pipeline/execution_runtime.rs
- ./scripts/perf/baseline.sh baseline
- ./scripts/perf/baseline.sh latest
- ./scripts/perf/compare.sh
Result: all commands completed successfully for the touched areas.
Note on strict clippy:
- RUSTC_WRAPPER= cargo clippy --workspace --all-targets -- -D warnings currently fails due to pre-existing, unrelated lint debt in other crates (vtcode-ui, vtcode-config, vtcode-core, vtcode tests)
- touched packages were validated with focused checks/tests and format checks
Performance sample output was written to .vtcode/perf/diff.md (single local sample; interpret as directional only).
Next Batch (Recommended)
- Cancellation/fairness pass
  - prioritize tool pipeline and runloop select! sites for cancellation-safety review
  - verify long-running work always yields or is delegated to spawn_blocking
- Optional benchmark pass
  - run ./scripts/perf/baseline.sh before/after targeted lock-path changes in cache-heavy flows