KWWKComputerUseCore
May 8, 2026
Swift macOS computer-use runtime for driving native apps through Accessibility snapshots and background input delivery.
This package contains only the core runtime: action functions, snapshot/session management, background mouse/keyboard dispatch, screenshot capture, and app/window discovery. It intentionally does not depend on kwwk, agent frameworks, or AI SDKs.
Requirements
- macOS 14 or newer.
- Swift 6.1 or newer.
- Accessibility permission for the calling process.
Installation
Add the package to Package.swift:
.package(url: "https://github.com/EYHN/kwwk-computer-use-core.git", branch: "main")
Then add KWWKComputerUseCore to your target dependencies:
.product(name: "KWWKComputerUseCore", package: "kwwk-computer-use-core")
After version tags are published, prefer a semver requirement:
.package(url: "https://github.com/EYHN/kwwk-computer-use-core.git", from: "0.1.0")
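For orientation, the two snippets above could sit together in a manifest roughly like the following. This is a minimal sketch: the package and target name `MyTool` is a hypothetical placeholder, and the tools version and platform simply mirror the Requirements section.

```swift
// swift-tools-version: 6.1
import PackageDescription

let package = Package(
    name: "MyTool", // hypothetical package name, not part of this project
    platforms: [.macOS(.v14)], // matches the macOS 14 requirement
    dependencies: [
        // Tracks the main branch until version tags are published.
        .package(url: "https://github.com/EYHN/kwwk-computer-use-core.git", branch: "main"),
    ],
    targets: [
        .executableTarget(
            name: "MyTool", // hypothetical target name
            dependencies: [
                .product(name: "KWWKComputerUseCore", package: "kwwk-computer-use-core"),
            ]
        ),
    ]
)
```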
Usage
Structured product integration:
import KWWKComputerUseCore
let cu = ComputerUseClient()
defer { cu.finish() }
let state = try cu.state(app: "Google Chrome")
let button = state.nodes.first {
    $0.role == "AXButton" && $0.title == "Reload"
}
if let button {
    try await cu.click(snapshotID: state.metadata.id, elementIndex: button.index)
}
Agent-facing formatted output:
import KWWKComputerUseCore
let cu = ComputerUseClient()
defer { cu.finish() }
let state = try cu.getAppState(app: "Google Chrome")
let snapshotID = state.metadata!.id
try await cu.click(snapshotID: snapshotID, elementIndex: 171)
try await cu.pressKey(snapshotID: snapshotID, key: "Escape")
Actions
- listApps()
- apps()
- runningApps()
- openApp(_:)
- listWindows(app:)
- windows(app:)
- getAppState(app:windowTitle:includeScreenshot:)
- state(app:windowTitle:includeScreenshot:)
- click(snapshotID:elementIndex:)
- click(snapshotID:x:y:)
- typeText(snapshotID:text:elementIndex:)
- setValue(snapshotID:elementIndex:value:)
- pressKey(snapshotID:key:)
- scroll(snapshotID:elementIndex:direction:pages:)
- performSecondaryAction(snapshotID:elementIndex:action:)
- drag(snapshotID:fromX:fromY:toX:toY:)
The calling process needs macOS Accessibility permission for most actions.
Use apps(), runningApps(), windows(app:), and state(app:) when
integrating from product code that needs structured values instead of
agent-facing formatted text.
Coordinate click and drag calls require a snapshot captured with
includeScreenshot: true; element-index actions only need the snapshot metadata.
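A coordinate click might therefore look like the sketch below. It uses only the labels listed under Actions, but several details are assumptions: that `windowTitle` has a default value and can be omitted, that the call is `async throws` like the element-index click, and that the placeholder coordinates are in whatever space the library expects (the text above does not say whether that is window or screenshot space).

```swift
import KWWKComputerUseCore

let cu = ComputerUseClient()
defer { cu.finish() }

// Capture the snapshot with a screenshot so coordinate actions are accepted.
// Assumption: `windowTitle` is optional at the call site and may be left out.
let state = try cu.state(app: "Google Chrome", includeScreenshot: true)

// Placeholder coordinates for illustration only; the coordinate space is an assumption.
try await cu.click(snapshotID: state.metadata.id, x: 400, y: 300)
```

Element-index calls such as `click(snapshotID:elementIndex:)` would work with the same snapshot, since they only need its metadata.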
Testing
Run the default checks (package description, test suite, and release build):
swift package describe
swift test --explicit-target-dependency-import-check error
swift build -c release --explicit-target-dependency-import-check error
The default tests avoid real GUI side effects. End-to-end GUI probe tests are available for local development:
KWWK_COMPUTER_USE_CORE_RUN_GUI_PROBE_TESTS=1 swift test --filter InProcessComputerUseBehaviorTests
Those tests require Accessibility permission and prebuilt probe apps under
/private/tmp/kwwk-activation-probe.
Architecture
ComputerUseClient is the product-facing facade. It owns a
ComputerUseSession, so a sequence of actions can share background activation
and focus suppression state.
Lower-level callers can use ComputerUseAction directly when they need to
manage sessions themselves. The library intentionally returns
ComputerUseCommandOutput with both formatted text and structured metadata so
agent adapters can choose their own schema without coupling the core package to
any one framework.
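To illustrate the adapter idea, here is a self-contained sketch of mapping an output value into an adapter-owned schema. It does not use the real library types: `CommandOutputStandIn`, its `text` and `snapshotID` properties, and the `ToolResult` schema are all hypothetical stand-ins, and the actual ComputerUseCommandOutput may expose different names and types.

```swift
import Foundation

// Hypothetical stand-in for the library's output type. The real
// ComputerUseCommandOutput may use different property names and types.
struct CommandOutputStandIn {
    let text: String          // formatted, agent-facing text
    let snapshotID: String?   // structured metadata, absent for some commands
}

// Schema owned by the agent adapter, not by the core package.
struct ToolResult: Codable, Equatable {
    let content: String
    let snapshot: String?
}

// Map the core output into the adapter's schema.
func makeToolResult(from output: CommandOutputStandIn) -> ToolResult {
    ToolResult(content: output.text, snapshot: output.snapshotID)
}

// Encode the adapter schema as JSON with stable key ordering.
func encodeToolResult(_ result: ToolResult) throws -> String {
    let encoder = JSONEncoder()
    encoder.outputFormatting = [.sortedKeys]
    let data = try encoder.encode(result)
    return String(decoding: data, as: UTF8.self)
}

let output = CommandOutputStandIn(text: "Window: Example", snapshotID: "snap-1")
let json = try encodeToolResult(makeToolResult(from: output))
print(json)
```

Keeping the mapping in the adapter means a different agent framework can define its own result type without any change to the core package.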
License
MIT