Coop Testing And Validation

Docs review date: May 17, 2026

This document maps the release-facing validation commands to the actual suite graph in scripts/validate.ts. The canonical stage-based checklist lives in Production Release Checklist. The current release boundary lives in Current Release Status. The demo flow and deployment steps live in Demo & Deploy Runbook. Operator-only second-gate guidance lives in Live Rails Operator Runbook.

The latest recorded staged-launch validation snapshot is from April 19, 2026:

bun run test:coverage green at 86.56 / 78.02 / 87.19 / 86.56 (statements / branches / functions / lines)
bun run build green
bun run validate:store-readiness green
bun run validate:production-readiness green

Rerun the relevant gate on the current tree before using that snapshot as release signoff.

Manual confirmation in real Chrome of the popup Capture Tab and Screenshot success paths is still required before a public Chrome Web Store release candidate can be called fully signed off.

Core Commands

List all named suites:

bun run validate list

Fast confidence:

bun run validate smoke

Main extension workflow:

bun run validate core-loop

Chrome Web Store release gate:

bun run validate:store-readiness

Full pre-release extension gate:

bun run validate:production-readiness

Opt-in live-rails gate:

bun run validate:production-live-readiness

Browser, Chrome, And Computer Use Routing

Use the lightest tool that proves the real user surface:

Tool	Use for	Do not use as proof for
Codex Browser / Browser Use	Local app routes, receiver PWA preview routes, public pages without sign-in, screenshot and responsive proof.	Authenticated Chrome state, browser extensions, installed PWA shell, OS prompts.
Playwright mirrors	Repeatable CI/local checks for Browser-covered flows and layout invariants.	GUI-only permissions, toolbar popup grants, extension-card errors.
Codex Chrome extension	Signed-in Chrome/profile state and browser tasks that need the real Chrome profile.	Local unauthenticated dev routes where Browser is enough.
Computer Use	Unpacked extension popup/sidepanel, extension-card errors, installed PWA behavior, OS/browser permission prompts, and cross-app GUI flows.	Deterministic localhost route checks already covered by Browser or Playwright.

OpenAI's current Codex docs explicitly recommend the in-app Browser first for local development servers and Computer Use only when command output or structured/browser tools are not enough. Keep this split aligned with the official In-app browser, Codex Chrome extension, and Computer Use docs.

Canonical Suite Graph

Release Gates

store-readiness runs build, unit:store-readiness, unit:extension-dist, and audit:store-readiness.
production-readiness runs lint, build, popup-slice, unit:sidepanel-actions, unit:archive-hardening, sync-hardening, onchain-ui, unit:agent-loop, unit:onchain-config, unit:session-key, store-readiness, e2e:extension, e2e:receiver-sync, e2e:receiver-pwa-eval, e2e:agent-loop, and e2e:app:mobile.
production-live-readiness runs production-readiness, arbitrum-safe-live, session-key-live, greengoods-live, archive-live, and fvm-registry-live.

Supporting Composites

popup-slice runs unit:popup-actions and e2e:popup.
sync-hardening runs unit:sync-hardening and e2e:sync.
onchain-ui runs unit:onchain-ui.
core-loop runs unit, build, and e2e:extension.
receiver-hardening runs lint, unit, build, e2e:receiver-sync, and e2e:receiver-pwa-eval.
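
The composite relationships listed above can be pictured as a small graph that expands into leaf steps. The sketch below is an illustrative model only, not the contents of scripts/validate.ts: the `suites` map, the `expandSuite` helper, and the choice to run a repeated step such as build only once are all assumptions, and only a subset of the suites is included.

```typescript
// Hypothetical model of part of the suite graph; the real definitions live in scripts/validate.ts.
type SuiteGraph = Record<string, string[]>;

const suites: SuiteGraph = {
  "store-readiness": ["build", "unit:store-readiness", "unit:extension-dist", "audit:store-readiness"],
  "popup-slice": ["unit:popup-actions", "e2e:popup"],
  "sync-hardening": ["unit:sync-hardening", "e2e:sync"],
  "onchain-ui": ["unit:onchain-ui"],
  "production-readiness": [
    "lint", "build", "popup-slice", "unit:sidepanel-actions", "unit:archive-hardening",
    "sync-hardening", "onchain-ui", "unit:agent-loop", "unit:onchain-config", "unit:session-key",
    "store-readiness", "e2e:extension", "e2e:receiver-sync", "e2e:receiver-pwa-eval",
    "e2e:agent-loop", "e2e:app:mobile",
  ],
};

// Expand a suite into leaf steps, depth-first, keeping only the first occurrence of each step.
// Assumes the graph is acyclic; a cycle between composites would recurse forever.
function expandSuite(name: string, graph: SuiteGraph = suites, seen: Set<string> = new Set()): string[] {
  const children = graph[name];
  if (children === undefined) {
    // Not a composite, so treat the name as a leaf step.
    if (seen.has(name)) return [];
    seen.add(name);
    return [name];
  }
  const steps: string[] = [];
  for (const child of children) {
    steps.push(...expandSuite(child, graph, seen));
  }
  return steps;
}
```

Calling `expandSuite("production-readiness")` on this model yields a flat step list in which the store-readiness steps appear inline and build appears once. Whether the real runner deduplicates shared steps is not stated in this document.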

What The Browser Suites Actually Cover

bun run test:e2e:popup covers real popup roundup into drafts, the exact screenshot permission error when automation does not get a genuine popup activeTab grant, screenshot manual-gate copy before review opens, file review cancel/save, microphone denial and retry, and post-failure popup recovery.
bun run test:e2e:extension covers create/join coop flow, publish plus board/archive handoff, the focused trusted-helper run loop, and mock-path sidepanel member-account plus garden-pass actions.
bun run test:e2e:receiver-sync covers receiver pair, private intake sync, sidepanel-closed receiver runtime, and multi-coop publish from extension review.
bun run test:e2e:receiver-pwa-eval covers Browser-first receiver PWA checks for website-first routing, install education, Hatch mobile fit, mock-media audio save, Mate pairing defaults, and Roost failed-sync actions.
bun run test:e2e:sync covers degraded and recovered sync runtime health persisting across popup reopen. It does not do full network fault injection.
bun run test:e2e:agent-loop runs the @agent-loop slice from e2e/extension.spec.cjs as a focused trusted-helper browser rehearsal.

Visual E2E

Playwright visual regression tests remain separate from the release suites:

bun run test:visual

That command runs:

e2e/visual-popup.spec.cjs
e2e/visual-sidepanel.spec.cjs

test:visual is not included in store-readiness or production-readiness.

Targeted Test Entry Points

bun run test:unit:popup-actions
bun run test:unit:sidepanel-actions
bun run test:unit:archive-hardening
bun run test:unit:sync-hardening
bun run test:unit:onchain-ui
bun run test:unit:onchain-config
bun run test:unit:agent-loop
bun run test:unit:session-key
bun run test:e2e:popup
bun run test:e2e:sync
bun run test:e2e:extension
bun run test:e2e:receiver-sync
bun run test:e2e:receiver-pwa-eval
bun run test:e2e:agent-loop
bun run test:e2e:app:mobile

Coverage Accounting

The repo-wide Vitest coverage report currently enforces thresholds on:

packages/shared/src
packages/app/src
packages/extension/src/runtime
packages/extension/src/views

Two important blind spots to remember when reading the aggregate percentage:

packages/api is discovered by the root Vitest run and its tests execute, but it is not part of the current coverage totals.
packages/extension/src/background also has tests, but it is outside the current coverage include list, so background-worker execution does not move the reported coverage percentage.

Treat the aggregate number as a useful gate for the measured surfaces, not as a full-repo coverage statement.
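
To make the measured-versus-unmeasured split concrete, here is a minimal sketch that checks whether a repo-relative path falls under one of the four thresholded directories named above. It is a plain prefix check, not the project's Vitest configuration; `measuredRoots` and `isMeasured` are hypothetical names, and the example file paths in the usage note are made up for illustration.

```typescript
// Directories this document lists as thresholded coverage surfaces.
const measuredRoots: string[] = [
  "packages/shared/src",
  "packages/app/src",
  "packages/extension/src/runtime",
  "packages/extension/src/views",
];

// True when a repo-relative path is one of the measured roots or sits beneath one.
function isMeasured(filePath: string): boolean {
  return measuredRoots.some((root) => filePath === root || filePath.startsWith(root + "/"));
}
```

Under this model, a path such as packages/extension/src/views/popup.tsx counts toward the aggregate, while anything under packages/api or packages/extension/src/background does not, which matches the two blind spots described above.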

Local Safety Defaults

Keep these defaults for normal local development:

VITE_COOP_CHAIN=sepolia
VITE_COOP_ONCHAIN_MODE=mock
VITE_COOP_ARCHIVE_MODE=mock
VITE_COOP_SESSION_MODE=off

Local validation and demo env guidance should come from the repo-root .env.local, not package-local env files.
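
A quick way to notice drift from these defaults is to compare an env object against them. The following is a minimal sketch under stated assumptions: `safeDefaults` mirrors the four values listed above, `findDeviations` is a hypothetical helper that is not part of the repo, and an unset variable is treated as a deviation even though the application may apply its own fallback.

```typescript
// The local safety defaults listed in this document.
const safeDefaults: Record<string, string> = {
  VITE_COOP_CHAIN: "sepolia",
  VITE_COOP_ONCHAIN_MODE: "mock",
  VITE_COOP_ARCHIVE_MODE: "mock",
  VITE_COOP_SESSION_MODE: "off",
};

// Return one message per variable whose value differs from the safe default.
// Assumption: an unset variable is reported too, since this sketch cannot know the app's fallback.
function findDeviations(env: Record<string, string | undefined>): string[] {
  return Object.entries(safeDefaults)
    .filter(([key, expected]) => env[key] !== expected)
    .map(([key, expected]) => `${key}: expected "${expected}", got "${env[key] ?? "(unset)"}"`);
}
```

An empty result means the env matches the defaults; any returned message names the variable to review before running local validation.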

Live Validation

Safe Probe

Use this when validating Safe deployment without live archive or session-key execution:

bun run validate:arbitrum-safe-live

Required env:

VITE_PIMLICO_API_KEY
COOP_ONCHAIN_PROBE_PRIVATE_KEY

Optional:

COOP_ONCHAIN_PROBE_CHAIN=arbitrum

If the required env is missing, this probe prints a skip message and exits cleanly. A skipped run does not prove live Safe readiness.
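
Because a skipped probe still exits cleanly, it helps to tell "ran" apart from "skipped" explicitly. The sketch below models that decision for the Safe probe. It is not the probe's actual code: `decideProbe`, `ProbeDecision`, and `safeProbeRequired` are hypothetical names, and treating an empty string as missing is an assumption.

```typescript
// Outcome of checking the required env before a live probe.
type ProbeDecision =
  | { action: "run" }
  | { action: "skip"; missing: string[] };

// Required env for the Safe probe, as listed in this document.
const safeProbeRequired: string[] = ["VITE_PIMLICO_API_KEY", "COOP_ONCHAIN_PROBE_PRIVATE_KEY"];

// Decide whether the probe can run. Unset and empty values both count as missing (assumption).
function decideProbe(required: string[], env: Record<string, string | undefined>): ProbeDecision {
  const missing = required.filter((name) => !env[name]);
  return missing.length === 0 ? { action: "run" } : { action: "skip", missing };
}
```

A release reviewer can then treat only a "run" decision followed by a passing probe as evidence, and read a "skip" as no signal at all.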

Session-Key Probe

Use this when validating bounded Smart Session execution onchain:

bun run validate:session-key-live

Required env:

VITE_PIMLICO_API_KEY
COOP_SESSION_PROBE_PRIVATE_KEY

Optional:

COOP_SESSION_PROBE_CHAIN=arbitrum
COOP_SESSION_PROBE_SAFE_ADDRESS=0x...

If the required env is missing, this probe prints a skip message and exits cleanly. A skipped run does not prove live session-key readiness.

This probe:

deploys or reuses a probe Safe
validates the local garden-pass rule set before any live send
executes one allowed green-goods-create-garden action when the Safe supports the required session modules
confirms a disallowed action is rejected before send
revokes the session and confirms subsequent rejection
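
The allow, reject, and revoke sequence in the list above can be illustrated with a tiny local model. This is only a sketch of the expected verdicts, not the probe or the garden-pass rule set: `SessionGate` is a hypothetical class, the real checks run against a Safe with session modules, and any action name other than green-goods-create-garden is invented for the example.

```typescript
// Hypothetical local gate that mirrors the verdicts the probe is described as confirming.
type Verdict = "allowed" | "rejected";

class SessionGate {
  private readonly allowedActions: ReadonlySet<string>;
  private revoked = false;

  constructor(allowedActions: string[]) {
    this.allowedActions = new Set(allowedActions);
  }

  // Decide locally, before anything would be sent onchain.
  check(action: string): Verdict {
    if (this.revoked) return "rejected";
    return this.allowedActions.has(action) ? "allowed" : "rejected";
  }

  // After revocation every action is rejected, including previously allowed ones.
  revoke(): void {
    this.revoked = true;
  }
}
```

In this model, a gate built with only green-goods-create-garden allows that action, rejects any other, and rejects everything once `revoke()` has been called.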

Archive Probe

Use this when validating trusted-node archive delegation material:

bun run validate:archive-live

Required env for a real delegation:

VITE_COOP_TRUSTED_NODE_ARCHIVE_SPACE_DID
VITE_COOP_TRUSTED_NODE_ARCHIVE_DELEGATION_ISSUER
VITE_COOP_TRUSTED_NODE_ARCHIVE_SPACE_DELEGATION

Commonly needed:

VITE_COOP_TRUSTED_NODE_ARCHIVE_AGENT_PRIVATE_KEY
VITE_COOP_TRUSTED_NODE_ARCHIVE_PROOFS
VITE_COOP_TRUSTED_NODE_ARCHIVE_ALLOWS_FILECOIN_INFO=true

Important:

bun run validate:archive-live requires the trusted-node archive env before it can prove live archive readiness.
An in-process static delegation fallback is available only when COOP_ALLOW_ARCHIVE_PROBE_FALLBACK=true; treat that explicit fallback as a wiring check, not a live-archive proof.
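
The distinction between a live proof and a wiring check can be sketched as a classification over the env. The code below is an assumption-laden illustration, not the probe's logic: `classifyArchiveProbe` and the mode names are hypothetical, the "no-proof" outcome stands in for whatever the real probe does when neither path is available, and it is assumed that complete live env takes precedence over the fallback flag.

```typescript
// What a given env can prove, in this sketch's terms.
type ArchiveProbeMode = "live" | "fallback-wiring-check" | "no-proof";

// Required env for a real delegation, as listed in this document.
const archiveRequired: string[] = [
  "VITE_COOP_TRUSTED_NODE_ARCHIVE_SPACE_DID",
  "VITE_COOP_TRUSTED_NODE_ARCHIVE_DELEGATION_ISSUER",
  "VITE_COOP_TRUSTED_NODE_ARCHIVE_SPACE_DELEGATION",
];

// Classify the env. Assumption: full live env wins even if the fallback flag is also set.
function classifyArchiveProbe(env: Record<string, string | undefined>): ArchiveProbeMode {
  const hasLiveEnv = archiveRequired.every((name) => Boolean(env[name]));
  if (hasLiveEnv) return "live";
  return env["COOP_ALLOW_ARCHIVE_PROBE_FALLBACK"] === "true" ? "fallback-wiring-check" : "no-proof";
}
```

Only the "live" mode corresponds to evidence of live archive readiness; the fallback mode confirms wiring and nothing more.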

Full Live-Rails Gate

Use this only when the release candidate enables live Safe, session-key, Green Goods, archive, or Filecoin registry behavior:

bun run validate:production-live-readiness

That gate layers these suites on top of production-readiness:

bun run validate:arbitrum-safe-live
bun run validate:session-key-live
bun run validate:greengoods-live
bun run validate:archive-live
bun run validate:fvm-registry-live

Manual Release Checks That Still Matter

Automation already proves real popup roundup, popup manual-gate errors, file review/save, audio retry, and post-failure recovery.
Manually confirm successful popup Capture Tab and Screenshot saves in Chrome with a real user click because Playwright cannot reliably reproduce the popup activeTab grant.

Sidepanel And Receiver

A second profile can join and see published state.
Receiver pairing works on the intended origin and private intake sync still lands in the correct coop.
Publish reaches the Coops feed and the board route.
Archive receipts remain legible and export still works.

Green Goods And Session Rails

production-readiness now covers mock-path member-account provisioning and garden-pass issuance in the real sidepanel.
Live Safe and Smart Session execution still require the opt-in live probes.
Full live garden-pass execution may still skip if the probe Safe lacks ERC-7579 support.