Development
June 12, 2026
This guide covers local setup, helper builds, validation, and release notes for contributors.
Requirements
- macOS for native helper development and computer-use QA.
- Node.js >=20.6.0.
- Xcode command line tools for native helper builds.
- Pi for extension testing.
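To check the Node.js requirement before running anything else, a small version comparison is enough. This is a minimal sketch, not part of the repository: `MIN_NODE_VERSION`, `parseVersion`, and `meetsMinimum` are hypothetical names, and only the `20.6.0` floor comes from the list above.

```typescript
// Minimum Node.js version, taken from the Requirements list.
const MIN_NODE_VERSION = "20.6.0";

// Parse "20.6.0" or "v20.6.0" into [major, minor, patch].
// Missing or non-numeric parts count as 0.
function parseVersion(version: string): [number, number, number] {
  const parts = version
    .replace(/^v/, "")
    .split(".")
    .map((part) => Number.parseInt(part, 10) || 0);
  return [parts[0] ?? 0, parts[1] ?? 0, parts[2] ?? 0];
}

// Hypothetical helper: true when `actual` is at least `minimum`.
function meetsMinimum(actual: string, minimum: string = MIN_NODE_VERSION): boolean {
  const [aMajor, aMinor, aPatch] = parseVersion(actual);
  const [mMajor, mMinor, mPatch] = parseVersion(minimum);
  if (aMajor !== mMajor) return aMajor > mMajor;
  if (aMinor !== mMinor) return aMinor > mMinor;
  return aPatch >= mPatch;
}
```

In a Node.js script, the running version is available as `process.versions.node`, so `meetsMinimum(process.versions.node)` would report whether the current runtime satisfies the requirement.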
Local Setup
Install dependencies:
npm install
Run checks:
npm test
Run this checkout in Pi without loading another installed copy:
pi --no-extensions -e .
Helper Install Path
The runtime helper lives at:
~/.pi/agent/helpers/pi-computer-use/bridge
The helper needs these macOS permissions:
- Accessibility
- Screen Recording
If permissions are missing, start Pi interactively and let the extension guide setup.
Helper Builds
Build for the current architecture into the repo prebuilt path:
npm run build:native
Build directly to the installed helper path. Use modern for macOS 14+ ScreenCaptureKit support, or legacy for the macOS 12+ CGWindow/screencapture helper:
node scripts/build-native.mjs --variant modern --output ~/.pi/agent/helpers/pi-computer-use/bridge
node scripts/build-native.mjs --variant legacy --output ~/.pi/agent/helpers/pi-computer-use/bridge
Build release prebuilts for both architectures and both helper variants:
node scripts/build-native.mjs --arch all --variant all
Release prebuilt helpers live at:
prebuilt/macos/arm64/modern/bridge
prebuilt/macos/arm64/legacy/bridge
prebuilt/macos/x64/modern/bridge
prebuilt/macos/x64/legacy/bridge
setup-helper.mjs selects modern on macOS 14+ and legacy on macOS 12/13, then copies the selected binary to ~/.pi/agent/helpers/pi-computer-use/bridge.
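The selection rule can be pictured as a mapping from the macOS major version to a variant, and from there to a prebuilt path. The sketch below is an illustration under assumptions, not the code in setup-helper.mjs: `selectVariant` and `prebuiltPath` are hypothetical names, and returning `null` below macOS 12 is an assumption, since the text only describes macOS 12 and later.

```typescript
type Variant = "modern" | "legacy";
type Arch = "arm64" | "x64";

// Hypothetical helper mirroring the documented rule:
// modern on macOS 14+, legacy on macOS 12/13.
// Assumption: versions older than macOS 12 have no matching helper.
function selectVariant(macosMajor: number): Variant | null {
  if (macosMajor >= 14) return "modern";
  if (macosMajor >= 12) return "legacy";
  return null;
}

// Repo-relative path of the release prebuilt for an architecture and variant,
// following the layout listed above.
function prebuiltPath(arch: Arch, variant: Variant): string {
  return `prebuilt/macos/${arch}/${variant}/bridge`;
}
```

For example, `selectVariant(13)` yields `"legacy"`, and `prebuiltPath("arm64", "legacy")` yields `prebuilt/macos/arm64/legacy/bridge`.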
Local helper builds are ad-hoc codesigned by default. For release builds, use a Developer ID Application certificate:
node scripts/build-native.mjs --arch all --variant all \
--sign-identity "Developer ID Application: Your Team (TEAMID)" \
--hardened-runtime \
--timestamp
The default signing identifier is:
com.injaneity.pi-computer-use.bridge
Keep that identifier stable for release builds so macOS permissions remain tied to the same helper identity across updates.
Validation
For TypeScript and schema checks:
npm test
For documentation-only changes, proofreading markdown and checking touched links is usually enough.
Benchmarks
Use benchmark output when changing semantic target ranking, fallback policy, AX execution, browser handling, native helper behavior, permission/setup behavior, or payload efficiency.
The QA benchmark is a local Pi-extension harness, not a clone of CUAbench.ai/OSWorld/WebArena. It borrows their principles—task diversity, reset/cleanup, action efficiency, and regression checks—while measuring package-specific behavior: compact semantic AX results, selective image fallback, AX execution, latency, payload size, and optional CDP behavior.
Default benchmark, non-intrusive aside from inspecting already-running visible apps:
npm run benchmark:qa
Wider coverage that may open apps. TextEdit/Finder artifacts created by the harness are cleaned up by default:
npm run benchmark:qa:full
Browser tab/address-bar navigation is skipped by default. Run it only when you are okay with the active browser tab/window changing:
npx -y tsx benchmarks/qa.ts --allow-foreground-qa --allow-browser-navigation
Keep temporary benchmark windows/documents for debugging:
npx -y tsx benchmarks/qa.ts --allow-foreground-qa --allow-screen-takeover --leave-artifacts
Save and compare local results:
npx -y tsx benchmarks/qa.ts --allow-foreground-qa --output benchmarks/results/baseline.local.json
npx -y tsx benchmarks/qa.ts --allow-foreground-qa --baseline benchmarks/results/baseline.local.json --output benchmarks/results/current.local.json
For the CDP backend only (self-contained; launches a headless Chrome, needs no macOS permissions, and is also included in benchmark:qa runs under a separate cdp category):
npm run benchmark:cdp
Important metrics include AX-only ratio, vision fallback ratio, semantic coverage, AX execution ratio, latency, executed app/category/tool counts, and payload proxies (avgTextChars, avgImageBytes, avgContentJsonBytes, avgDetailsJsonBytes, avgPayloadBytes). In benchmarkSchemaVersion: 2, payload bytes are serialized content JSON plus serialized details JSON, not just text length.
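To make the payload definition concrete, the sketch below adds the serialized size of a result's content to the serialized size of its details. It is an approximation under assumptions: the `ToolResult` shape and the function names are hypothetical, and measuring UTF-8 bytes of `JSON.stringify` output is a guess at how the harness counts, not a description of its code.

```typescript
// Hypothetical shape of a tool result; the real type in the harness may differ.
interface ToolResult {
  content: unknown;
  details: unknown;
}

// Size of a value's JSON serialization in UTF-8 bytes.
// JSON.stringify returns undefined for an undefined input, so that case counts as 0 bytes.
function jsonBytes(value: unknown): number {
  const serialized = JSON.stringify(value) ?? "";
  return new TextEncoder().encode(serialized).length;
}

// Payload bytes as described above: content JSON plus details JSON.
function payloadBytes(result: ToolResult): number {
  return jsonBytes(result.content) + jsonBytes(result.details);
}
```

Counting serialized JSON rather than text length means structural overhead (keys, brackets, quoting) and any image data embedded in the content both count toward the payload.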
Current goals and regression tolerances live in benchmarks/config.json.
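As a rough picture of what a regression tolerance means when comparing a current run against a baseline, consider the sketch below. Everything in it is hypothetical: the schema of benchmarks/config.json is not shown here, so `isRegression`, the fractional `tolerance`, and the `direction` parameter are illustrative assumptions rather than the harness's actual logic.

```typescript
// Which way is better for a metric: "higher" (e.g. a coverage ratio)
// or "lower" (e.g. latency or payload size).
type Direction = "higher" | "lower";

// Hypothetical check: has `current` worsened relative to `baseline`
// by more than `tolerance`, expressed as a fraction of the baseline (0.1 = 10%)?
function isRegression(
  baseline: number,
  current: number,
  tolerance: number,
  direction: Direction,
): boolean {
  const allowed = Math.abs(baseline) * tolerance;
  return direction === "higher"
    ? current < baseline - allowed
    : current > baseline + allowed;
}
```

With a 10% tolerance on a lower-is-better metric, a move from 1000 to 1050 stays within bounds, while a move from 1000 to 1200 counts as a regression.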
Pull Requests
Before opening a PR:
- Open an issue.
- Get approval or alignment in the issue.
- Keep the change scoped.
- Include validation results.
- Attach the AI transcript if AI tools helped produce the PR.
See CONTRIBUTING.md for the project contribution policy.