E2E Testing Guide
May 30, 2026
Overview
Desktop E2E tests use WebDriverIO (WDIO) to drive the Tauri app through Appium:
| Platform | Driver | Port | App format | Selectors |
|---|---|---|---|---|
| Linux / Appium Chromium | Appium Chromium | 4723 | Debug binary | CSS / DOM |
| macOS / Appium Chromium | Appium Chromium | 4723 | .app bundle | CSS / DOM |
OpenHuman's desktop app currently uses the CEF runtime (tauri-runtime-cef). CI drives the Linux debug binary with Appium's Chromium driver; manual macOS and Windows E2E use the same Chromium-driver backend.
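As a rough illustration of what a Chromium-driver session could look like from the WDIO side, the sketch below assembles connection settings and capabilities as plain data. It is not taken from the repo's `wdio.conf.ts`: the function name `buildChromiumSession`, the `platformName` values, and the choice to pass the app binary through `goog:chromeOptions.binary` are assumptions, and the real harness may need further capabilities.

```typescript
// Hypothetical sketch; the authoritative values live in the repo's WDIO config.
type DesktopPlatform = "linux" | "mac" | "windows"; // assumed naming

interface ChromiumSessionInput {
  platformName: DesktopPlatform;
  appBinary: string; // path to the debug binary or the executable inside the .app bundle
  appiumPort?: number; // falls back to the documented default, 4723
}

function buildChromiumSession(input: ChromiumSessionInput) {
  return {
    hostname: "127.0.0.1",
    port: input.appiumPort ?? 4723,
    capabilities: {
      platformName: input.platformName,
      // Selects the Appium Chromium driver.
      "appium:automationName": "Chromium",
      // Standard ChromeDriver option for launching a custom Chromium-based binary
      // (assumed here to be how the CEF app is started).
      "goog:chromeOptions": { binary: input.appBinary },
    },
  };
}
```

A caller would spread the result into its WDIO config; the binary path passed in is whatever the E2E build step produced.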
Quick start
Linux / Appium Chromium
# Install Appium and the Chromium driver (one-time)
npm install -g appium@3
appium driver install --source=npm appium-chromium-driver
# Build the E2E app
pnpm --filter openhuman-app test:e2e:build
# Run all flows
pnpm --filter openhuman-app test:e2e:all:flows
# Run a single spec
bash app/scripts/e2e-run-spec.sh test/e2e/specs/smoke.spec.ts smoke
On headless Linux, the harness runs under Xvfb for a virtual display.
macOS / Appium Chromium
# Install Appium + Chromium driver (one-time, needs Node 24+)
npm install -g appium@3
appium driver install --source=npm appium-chromium-driver
# Build the .app bundle
pnpm --filter openhuman-app test:e2e:build
# Run all flows
pnpm --filter openhuman-app test:e2e:all:flows
Docker on macOS (Linux harness locally)
Run the same Linux-based harness from macOS using Docker.
# Build + run all E2E flows
docker compose -f e2e/docker-compose.yml run --rm e2e
# Build the app first (if needed)
docker compose -f e2e/docker-compose.yml run --rm e2e \
pnpm --filter openhuman-app test:e2e:build
# Run a single spec
docker compose -f e2e/docker-compose.yml run --rm e2e \
bash app/scripts/e2e-run-spec.sh test/e2e/specs/smoke.spec.ts smoke
Requires Docker Desktop or Colima. The repo is bind-mounted so builds persist between runs.
Architecture
Platform detection
app/test/e2e/helpers/platform.ts exports:
- isTauriDriver(): legacy shim that now always returns true for the DOM-capable Chromium session
- isMac2(): legacy shim that now always returns false
- supportsExecuteScript(): returns true because the Chromium driver supports browser.execute() on every platform
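Based only on the behavior described above, the three shims could be as small as the following sketch. The bodies are an assumption; the actual `platform.ts` may contain more than this.

```typescript
// Sketch of the legacy shims as described in this guide (not copied from platform.ts).

// Kept for older specs: the Chromium session is always DOM-capable.
function isTauriDriver(): boolean {
  return true;
}

// Kept for older specs: the Mac2 (XCUITest-style) backend is no longer used.
function isMac2(): boolean {
  return false;
}

// The Chromium driver supports browser.execute() on every platform.
function supportsExecuteScript(): boolean {
  return true;
}
```

Because the return values are constant, new specs should not need to branch on them; they mainly keep older call sites compiling.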
Element helpers
app/test/e2e/helpers/element-helpers.ts provides a unified API:
| Helper | Appium Chromium |
|---|---|
| waitForText(text) | XPath over DOM text content |
| waitForButton(text) | button / [role="button"] XPath |
| clickText(text) | Standard el.click() |
| clickNativeButton(text) | Standard el.click() on button |
| clickToggle() | [role="switch"] / input[type="checkbox"] |
| waitForWindowVisible() | Window handle check |
| waitForWebView() | document.readyState check |
| hasAppChrome() | Window handle check |
| dumpAccessibilityTree() | HTML page source |
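To make the "XPath over DOM text content" rows concrete, here is a minimal sketch of how such selectors might be built. The function names `xpathLiteral`, `textXPath`, and `buttonXPath` are hypothetical, and the exact expressions used in `element-helpers.ts` may differ; the quoting logic reflects the fact that XPath 1.0 string literals have no escape syntax.

```typescript
// Hypothetical selector builders; not the actual element-helpers.ts implementation.

// XPath 1.0 has no escape character, so pick a quote style that fits the text,
// falling back to concat() when the text contains both quote types.
function xpathLiteral(value: string): string {
  if (!value.includes('"')) return `"${value}"`;
  if (!value.includes("'")) return `'${value}'`;
  const parts = value.split('"').map((part) => `"${part}"`);
  return `concat(${parts.join(`, '"', `)})`;
}

// Matches any element whose normalized text content contains the given text.
function textXPath(text: string): string {
  return `//*[contains(normalize-space(.), ${xpathLiteral(text)})]`;
}

// Matches <button> elements or elements with role="button" containing the text.
function buttonXPath(text: string): string {
  const literal = xpathLiteral(text);
  return (
    `//button[contains(normalize-space(.), ${literal})]` +
    ` | //*[@role="button"][contains(normalize-space(.), ${literal})]`
  );
}
```

In a spec, a string like `textXPath("Continue")` would be passed to the WDIO `$()` selector function, which accepts XPath expressions.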
Stable test IDs
Prefer stable data-testid hooks for UI affordances that E2E specs click or poll. Use the taxonomy <surface>-<element>-<id?>, for example:
- cron-jobs-panel, cron-refresh
- cron-job-row-<jobId>, cron-job-toggle-<jobId>, cron-job-run-<jobId>, cron-job-view-runs-<jobId>, cron-job-remove-<jobId>
- settings-nav-<routeId>
- skill-row-<skillId>, skill-install-<skillId>, skill-uninstall-<skillId>
- thread-row-<threadId>, new-thread-button, send-message-button
- onboarding-next-button
Use waitForTestId(testId) and clickTestId(testId) from element-helpers.ts when a spec targets one of these hooks. Keep text selectors for user-visible copy assertions, not row/action discovery.
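The taxonomy can be captured in a small builder so that components and specs derive IDs the same way. This is a sketch under assumptions: `buildTestId` and `testIdSelector` are hypothetical names, and the attribute selector is simply the standard CSS form for `data-testid`, which `waitForTestId` presumably resolves to.

```typescript
// Hypothetical helpers illustrating the <surface>-<element>-<id?> taxonomy.

// Joins the parts with hyphens; the trailing id is optional.
function buildTestId(surface: string, element: string, id?: string): string {
  const parts = [surface, element];
  if (id !== undefined && id !== "") parts.push(id);
  return parts.join("-");
}

// CSS attribute selector for a data-testid hook.
function testIdSelector(testId: string): string {
  return `[data-testid="${testId}"]`;
}
```

For example, `buildTestId("cron", "job-toggle", jobId)` yields the `cron-job-toggle-<jobId>` form listed above, which a spec would then pass to `clickTestId`.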
Deep link helpers
app/test/e2e/helpers/deep-link-helpers.ts handles auth deep links:
- Appium Chromium: browser.execute(window.__simulateDeepLink(url)) on every platform
- macOS fallback: macos: deepLink extension command, then open -a ...
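The Chromium path above amounts to calling a page-level hook from the test process. The sketch below shows one way the in-page function might look. The names `buildAuthDeepLink` and `simulateDeepLinkInPage` are hypothetical, and the hook is assumed to be a synchronous function taking the URL; the real helper may handle return values or retries differently.

```typescript
// Hypothetical sketch; not the actual deep-link-helpers.ts implementation.

// Builds an auth deep link of the form used elsewhere in this guide.
function buildAuthDeepLink(token: string, key: string): string {
  return `openhuman://auth?token=${encodeURIComponent(token)}&key=${encodeURIComponent(key)}`;
}

// Intended to run inside the page, where globalThis is the window object.
// Returns false when the app has not installed the hook yet.
function simulateDeepLinkInPage(url: string): boolean {
  const hook = (globalThis as any).__simulateDeepLink;
  if (typeof hook !== "function") return false;
  hook(url);
  return true;
}
```

In a WDIO spec this would be invoked as `await browser.execute(simulateDeepLinkInPage, url)`, since `browser.execute` serializes the function and runs it in the page with the extra arguments. A `false` result is a hint to wait for the app to finish loading before retrying.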
For release candidates, also run one manual secondary-instance smoke test on Linux or macOS when touching CEF preflight, single-instance, or deep-link startup code:
- Launch OpenHuman normally and leave it running.
- Trigger openhuman://auth?token=e2e-token&key=auth through the OS opener.
- Confirm the already-running window receives the callback and does not start a second full CEF instance.
- Confirm the secondary process exits cleanly without a CEF cache-lock error.
This catches the class of regressions where a secondary process exits during CEF cache preflight before Tauri's deep-link forwarding path is installed.
Writing cross-platform specs
- Use helpers from element-helpers.ts; never use raw XCUIElementType* selectors in specs
- Use clickNativeButton(text) instead of inline button-clicking code
- Use hasAppChrome() instead of checking for XCUIElementTypeMenuBar
- Use waitForWebView() instead of checking for XCUIElementTypeWebView
- For macOS-only tests, use process.platform guards or separate spec files
- Use navigateViaHash(route) for hash routes; it waits for the hash, document.readyState, and a mounted React root before returning. After onboarding, walkOnboarding() also waits for #/home plus a Home-page marker before specs navigate elsewhere.
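The three readiness conditions that navigateViaHash(route) waits for can be expressed as a single predicate over a snapshot of page state. This is a sketch under assumptions: `PageSnapshot` and `isRouteReady` are hypothetical names, and "mounted React root" is approximated as the root container having at least one child, which may not match the real check.

```typescript
// Hypothetical readiness predicate; the real navigateViaHash may check differently.

interface PageSnapshot {
  hash: string; // window.location.hash, e.g. "#/home"
  readyState: string; // document.readyState
  rootChildCount: number; // children of the React root container (assumed signal)
}

function isRouteReady(snapshot: PageSnapshot, route: string): boolean {
  // Accept routes written with or without the leading "#".
  const expectedHash = route.startsWith("#") ? route : `#${route}`;
  return (
    snapshot.hash === expectedHash &&
    snapshot.readyState === "complete" &&
    snapshot.rootChildCount > 0
  );
}
```

A polling helper would collect the snapshot with `browser.execute()` and re-evaluate the predicate until it holds or a timeout expires. Keeping the predicate pure makes it easy to unit test without a running app.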
Environment variables
| Variable | Default | Description |
|---|---|---|
| APPIUM_PORT | 4723 | Appium server port |
| E2E_MOCK_PORT | 18473 | Mock backend server port |
| OPENHUMAN_WORKSPACE | (temp dir) | App workspace directory |
| OPENHUMAN_SERVICE_MOCK | 0 | Enable service mock mode |
| OPENHUMAN_E2E_MODE | unset | Enables destructive test-support RPCs; the E2E runner sets this to 1 |
| OPENHUMAN_E2E_AUTH_BYPASS | unset | Enable JWT bypass auth |
| DEBUG_E2E_DEEPLINK | (verbose) | Set to 0 to silence deep link logs |
| E2E_FORCE_CARGO_CLEAN | unset | Force cargo clean before E2E build |
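The defaults in the table can be applied in one place when a script reads its environment. The sketch below is illustrative only: `resolveE2EConfig` and `parsePort` are hypothetical names, and treating `"1"` as the enabling value for the boolean flags is an assumption beyond what the table states. It covers only the variables whose semantics the table spells out.

```typescript
// Hypothetical config resolver; the harness scripts may read these variables differently.

type Env = Record<string, string | undefined>;

interface E2EConfig {
  appiumPort: number;
  mockPort: number;
  serviceMock: boolean;
  e2eMode: boolean;
  deepLinkLogs: boolean;
}

// Parses a TCP port, falling back to the documented default when unset.
function parsePort(raw: string | undefined, fallback: number): number {
  if (raw === undefined || raw.trim() === "") return fallback;
  const port = Number(raw);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid port: ${raw}`);
  }
  return port;
}

function resolveE2EConfig(env: Env): E2EConfig {
  return {
    appiumPort: parsePort(env.APPIUM_PORT, 4723),
    mockPort: parsePort(env.E2E_MOCK_PORT, 18473),
    serviceMock: env.OPENHUMAN_SERVICE_MOCK === "1", // assumed: "1" enables
    e2eMode: env.OPENHUMAN_E2E_MODE === "1", // the runner sets this to 1
    deepLinkLogs: env.DEBUG_E2E_DEEPLINK !== "0", // verbose unless set to 0
  };
}
```

In Node, the function would be called as `resolveE2EConfig(process.env)`. Taking the environment as a parameter keeps the resolver testable without mutating the process.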
CI workflows
Push / PR checks
The default pull-request gate is .github/workflows/pr-ci.yml. It builds one Linux E2E-compatible desktop artifact, then runs the Linux Appium/Chromium mega-flow lane and the Playwright web lane in parallel with Rust and coverage jobs.
macOS and Windows desktop E2E do not run on every PR. Use the manually dispatched E2E workflow, or release pretest workflows, when cross-platform desktop signal is needed.
macOS / Appium Chromium
macOS/Appium Chromium is available for local runs and through the manually dispatched E2E workflow:
- Installs Appium + Chromium driver
- Builds the .app bundle
- Runs all E2E flows
Troubleshooting
Linux: "WebView not ready" timeout
For the default CEF runtime, this usually means a stale local runner is trying to drive a CEF-backed WebView through WebKitWebDriver. Current CI uses the Appium Chromium driver on Linux; use app/scripts/e2e-run-session.sh or the PR CI workflow for the supported Linux path.
Ensure DISPLAY is set and Xvfb is running:
export DISPLAY=:99
Xvfb :99 -screen 0 1280x1024x24 &
Also ensure dbus is started (required by webkit2gtk):
eval $(dbus-launch --sh-syntax)
Linux: Appium Chromium driver not found
npm install -g appium@3
appium driver install --source=npm appium-chromium-driver
macOS: Deep links not working in tauri dev
Deep links require a .app bundle. Use pnpm tauri build --debug --bundles app instead.
Docker: Build is slow on first run
The first Docker build compiles Rust and installs the E2E harness dependencies. Subsequent runs use cached layers. Cargo registry and git sources are cached via Docker volumes.
Spec: Notifications
File: app/test/e2e/specs/notifications.spec.ts
Tests notification RPC methods via the live core sidecar and the Notifications UI page:
- notification_ingest: creates a new notification via core RPC
- notification_list: verifies the ingested notification is returned
- notification_mark_read: marks a notification as read
- notification_stats: checks aggregate statistics shape
- UI: Notifications page renders the integration notifications section ([data-testid="integration-notifications-section"])
- UI: Notifications page shows the System Events section ([data-testid="system-events-section"])
Run:
bash app/scripts/e2e-run-spec.sh test/e2e/specs/notifications.spec.ts notifications
Platform note: RPC tests (notification_ingest, notification_list, notification_mark_read, notification_stats) run through the unified Appium Chromium backend. UI assertions require browser.execute() support, which the current backend provides on every platform.
Agent-observable artifact flow
For a canonical, inspectable run that drops screenshots, page-source dumps, and mock request logs on disk:
bash app/scripts/e2e-agent-review.sh
Artifacts land in app/test/e2e/artifacts/<timestamp>-agent-review/. Full details + helper API: AGENT-OBSERVABILITY.md. Any failing test triggers wdio.conf.ts's afterTest hook, which writes failure-*.png + failure-*.source.xml into the same run dir.
Rust inference provider E2E
These tests (tests/inference_provider_e2e.rs) use wiremock to mock HTTP upstreams and require no live LLM API calls. They cover OpenAI-compat chat, Anthropic auth style, per-model temperature suppression, Ollama local provider, and the /v1 HTTP endpoint auth layer.
# Local:
bash scripts/test-rust-inference-e2e.sh
# Via Docker (Linux, same image as CI):
docker compose -f e2e/docker-compose.yml run --rm inference-e2e