OpenHarness

May 6, 2026


        ___
       /   \
      (     )        ___  ___  ___ _  _ _  _   _ ___ _  _ ___ ___ ___
       `~w~`        / _ \| _ \| __| \| | || | /_\ | _ \ \| | __/ __/ __|
       (( ))       | (_) |  _/| _|| .` | __ |/ _ \|   / .` | _|\__ \__ \
        ))((        \___/|_|  |___|_|\_|_||_/_/ \_\_|_\_|\_|___|___/___/
       ((  ))
        `--`

AI coding agent in your terminal. Works with any LLM -- free local models or cloud APIs.






Quick Start

npm install -g @zhijiewang/openharness
oh

That's it. OpenHarness auto-detects Ollama and starts chatting. No API key needed.

Python SDK: there's also an official Python SDK for driving oh from Python programs (notebooks, batch scripts, ML pipelines). Install with pip install openharness-sdk after the npm install (the PyPI distribution is openharness-sdk because the unqualified name is taken), then from openharness import query. See python/README.md.

TypeScript SDK: drive oh from Node.js (VS Code extensions, Electron apps, build scripts) with @zhijiewang/openharness-sdk. Install with npm install @zhijiewang/openharness-sdk, then import { query, OpenHarnessClient, tool } from "@zhijiewang/openharness-sdk". Mirrors the Python SDK surface (streaming events, stateful sessions, custom tools, permission callback, session resume). See packages/sdk/README.md.

oh init                               # interactive setup wizard (provider + cybergotchi)
oh                                    # auto-detect local model
oh --model ollama/qwen2.5:7b         # specific model
oh --model gpt-4o                     # cloud model (needs OPENAI_API_KEY)
oh --trust                            # auto-approve all tool calls
oh --auto                             # auto-approve, block dangerous bash
oh -p "fix the tests" --trust         # headless mode (single prompt, exit)
oh run "review code" --json           # CI/CD with JSON output

In-session commands:

/rewind                               # undo last AI file change (checkpoint restore)
/roles                                # list agent specializations
/vim                                  # toggle vim mode
Ctrl+O                                # flush transcript to scrollback for review

Why OpenHarness?

Most AI coding agents are locked to one provider or cost $20+/month. OpenHarness works with any LLM -- run it free with Ollama on your own machine, or connect to any cloud API. Every AI edit is git-committed and reversible with /undo.

Terminal UI

OpenHarness features a sequential terminal renderer inspired by Ink/Claude Code's default mode. Completed messages flush to native scrollback (scrollable), while the live area (streaming, spinner, input) rewrites in-place using relative cursor movement.

Keybindings

Key                Action
Enter              Submit prompt
Alt+Enter          Insert newline (multi-line input)
↑ / ↓              Navigate input history
Ctrl+C             Cancel current request / exit
Ctrl+A / Ctrl+E    Jump to start / end of input
Ctrl+O             Toggle thinking block expansion
Ctrl+K             Toggle code block expansion in messages
Tab                Autocomplete slash commands / file paths / cycle tool outputs
/vim               Toggle Vim mode (normal/insert)

Scrolling is handled by the terminal's native scrollbar. Completed messages flow into the terminal scrollback buffer. Use your terminal's search (e.g., Ctrl+Shift+F in VS Code) to search conversation history.

Features

  • Markdown rendering — headings, code blocks, bold, italic, lists, tables, blockquotes, links
  • Syntax highlighting — keywords, strings, comments, numbers, types (JS/TS/Python/Rust/Go and 20+ languages)
  • Collapsible code blocks — blocks over 8 lines auto-collapse; Ctrl+K to expand all
  • Collapsible thinking — thinking blocks collapse to a one-line summary after completion; Ctrl+O to expand
  • Shimmer spinner — animated indicator with stage label (Thinking, Running <Tool>, Calling <server>:<tool>, Running N tools) and color transitions (magenta → yellow at 30s → red at 60s)
  • Tool call display — args preview, live streaming output, result summaries (line counts, elapsed time), expand/collapse with Tab. Tool name color-coded by category (read tools cyan, mutating tools yellow, exec tools magenta, MCP tools green)
  • Rich tool output — JSON files render as a colored static tree (depth-3 collapse, line truncation); markdown files render with full styling (headings, code blocks, tables) instead of plain split-on-newline. Renderer dispatches via outputType field stamped by FileReadTool / WebFetchTool, with a heuristic fallback for unstamped tools
  • Nested tool calls — when Agent or ParallelAgents spawns inner tool calls (Read, Bash, Edit), the children render indented under their spawning parent. ParallelAgents shows per-task Task wrapper rows so child calls group by task instead of rendering flat under the bundled parent. Depth-3 indent limit with … (N more levels) collapse marker
  • Multi-line input wrap glyph — every non-last line of a multi-line input ends with a dim continuation marker so the wrap is visually obvious
  • Permission prompts — bordered box with risk coloring, bold colored Yes/No/Diff keys, syntax-highlighted inline diffs
  • Status line — model name, token count, cost, context usage bar (customizable via config)
  • Context warning — yellow alert when context window exceeds 75%
  • Native terminal scrollbar — completed messages flow into scrollback; use your terminal's scrollbar and search
  • Multi-line input — Alt+Enter for newlines; paste detection auto-inserts newlines
  • Autocomplete — slash commands and file paths with descriptions; Tab to cycle
  • File path autocomplete — Tab-completes paths with [dir]/[file] indicators
  • Session browser — /browse to interactively browse and resume past sessions
  • Companion mascot — animated Cybergotchi in the footer (toggle with /companion off|on)

Themes

oh --light                    # light theme for bright terminals
/theme light                  # switch mid-session (saved automatically)
/theme dark                   # switch back

Theme preference is saved to .oh/config.yaml and persists across sessions.

Custom Status Line

Customize the status bar format in .oh/config.yaml:

statusLineFormat: '{model} │ {tokens} │ {cost} │ {ctx}'

Available variables: {model}, {tokens} (input↑ output↓), {cost} ($X.XXXX), {ctx} (context usage bar). Empty sections are automatically collapsed.
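The empty-section collapse can be sketched as a small Python helper (illustrative only, not the harness's renderer; it assumes │ is the section separator, as in the example format above):

```python
import re

def render_status_line(fmt: str, values: dict) -> str:
    """Substitute {model}/{tokens}/{cost}/{ctx} and drop empty sections."""
    rendered = []
    for seg in fmt.split("│"):
        out = re.sub(r"\{(\w+)\}", lambda m: values.get(m.group(1), ""), seg).strip()
        if out:  # empty sections collapse away, along with their separator
            rendered.append(out)
    return " │ ".join(rendered)
```

With {ctx} empty, the trailing separator disappears rather than leaving a dangling bar.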

Tools (44)

Tool                Risk     Description

Core
Bash                high     Execute shell commands with live streaming output (AST safety analysis)
PowerShell          high     Execute PowerShell commands (Windows-native scripting)
Read                low      Read files with line ranges, PDF support
ImageRead           low      Read images/PDFs for multimodal analysis
Write               medium   Create or overwrite files
Edit                medium   Search-and-replace edits
MultiEdit           medium   Atomic multi-file edits (all succeed or none)
Glob                low      Find files by pattern
Grep                low      Regex content search with context lines
LS                  low      List directory contents with sizes

Web
WebFetch            medium   Fetch URL content (SSRF-protected)
WebSearch           medium   Search the web
ExaSearch           medium   Neural web search via Exa (requires EXA_API_KEY)
RemoteTrigger       high     HTTP requests to webhooks/APIs

Tasks
TaskCreate          low      Create structured tasks
TaskUpdate          low      Update task status
TaskList            low      List all tasks
TaskGet             low      Get task details
TaskStop            low      Stop a running task
TaskOutput          low      Get task output
TodoWrite           low      Manage session task checklist (Claude Code-compatible)

Agents
Agent               medium   Spawn a sub-agent (with role specialization)
ParallelAgent       medium   Dispatch multiple agents with DAG dependencies
SendMessage         low      Agent-to-agent peer messaging
AskUser             low      Ask user a question with options

Scheduling
CronCreate          medium   Schedule recurring tasks
CronDelete          medium   Remove scheduled tasks
CronList            low      List all scheduled tasks
ScheduleWakeup      low      Self-pace the next /loop iteration (cache-aware)

Planning
EnterPlanMode       low      Enter structured planning mode
ExitPlanMode        low      Exit planning mode

Pipelines
Pipeline            medium   Run a sequence of tasks with output passed between steps

Code Intelligence
Diagnostics         low      LSP-based code diagnostics
NotebookEdit        medium   Edit Jupyter notebooks

Memory & Discovery
Memory              low      Save/list/search persistent memories
Skill               low      Invoke a skill from .oh/skills/
ToolSearch          low      Find tools by description
SessionSearch       low      Search prior sessions for relevant context

MCP
ListMcpResources    low      List resources from connected MCP servers
ReadMcpResource     low      Read a specific MCP resource by URI

Git Worktrees
EnterWorktree       medium   Create isolated git worktree
ExitWorktree        medium   Remove a git worktree

Process
KillProcess         high     Stop processes by PID or name
Monitor             medium   Run a background command and stream each output line back to the agent

Low-risk read-only tools auto-approve. Medium- and high-risk tools require confirmation in ask mode. Use --trust or --auto to skip prompts.
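As a rough sketch, the default approval decision reduces to a small predicate (simplified: deny and plan actually block rather than prompt, and acceptEdits is per-tool; those cases are omitted here):

```python
def requires_prompt(risk: str, mode: str) -> bool:
    """True if this tool call should show a confirmation prompt."""
    # trust / auto / bypassPermissions skip prompts entirely
    if mode in ("trust", "auto", "bypassPermissions"):
        return False
    # in ask mode, only medium- and high-risk tools prompt
    return risk in ("medium", "high")
```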

Slash Commands

Over 80 commands are registered. The most-used ones are grouped below; see /help in-session for the full list. Aliases: /q → exit, /h → help, /c → commit, /m → model, /s → status.

Session:

Command          Description
/clear           Clear conversation history
/compact         Compress conversation to free context
/export          Export conversation to markdown
/copy [n]        Copy the Nth-last assistant response to the system clipboard
/history [n]     List recent sessions; /history search <term> to search
/browse          Interactive session browser with preview
/resume <id>     Resume a saved session
/fork            Fork current session

Git:

Command          Description
/diff            Show uncommitted git changes
/undo            Undo last AI commit
/commit [msg]    Create a git commit
/log             Show recent git commits

Info:

Command          Description
/help            Show all available commands (categorized)
/cost            Show session cost and token usage
/status          Show model, mode, git branch, MCP servers
/config          Show configuration
/files           List files in context
/model <name>    Switch model mid-session
/memory          View and search memories
/doctor          Run diagnostic health checks
/hooks           List loaded hooks grouped by event
/reload-plugins  Hot-reload plugins, skills, hooks, and MCP server connections without restarting the session

Settings:

Command              Description
/theme dark|light    Switch theme (saved to config)
/vim                 Toggle Vim mode
/companion off|on    Toggle companion visibility
/keys                Show keyboard shortcuts
/keybindings         Open ~/.oh/keybindings.json in $EDITOR (creates a starter file if missing)

AI:

Command          Description
/plan <task>     Enter plan mode
/review          Review recent code changes
/summarize       Summarize the current conversation
/recap           One-sentence recap of the session (lighter than /summarize)

Pet:

Command          Description
/cybergotchi     Feed, pet, rest, status, rename, or reset your companion

Permission Modes

Control how aggressively OpenHarness auto-approves tool calls:

Mode               Flag                                   Behavior
ask                --permission-mode ask                  Prompt for medium/high-risk operations (default)
trust              --trust                                Auto-approve everything
deny               --deny                                 Only allow low-risk read-only operations
acceptEdits        --permission-mode acceptEdits          Auto-approve file edits, ask for Bash/WebFetch/Agent
plan               --permission-mode plan                 Read-only mode — block all write operations
auto               --auto                                 Auto-approve all, block dangerous bash (AST-analyzed)
bypassPermissions  --permission-mode bypassPermissions    Approve everything unconditionally (CI only)

Bash commands are analyzed by a lightweight AST parser that detects destructive patterns (rm -rf, git push --force, curl | bash, etc.) and adjusts risk level accordingly.
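A toy version of that escalation can be written with regex patterns instead of a real AST walk (the pattern list below is illustrative, not the harness's actual rule set):

```python
import re

# Illustrative destructive-command patterns; the real analysis walks a parsed AST.
DESTRUCTIVE_PATTERNS = [
    r"\brm\s+-[a-z]*r[a-z]*f\b",      # rm -rf
    r"\brm\s+-[a-z]*f[a-z]*r\b",      # rm -fr
    r"\bgit\s+push\b.*--force\b",     # force push
    r"\bcurl\b[^|]*\|\s*(ba)?sh\b",   # curl ... | bash
]

def is_destructive(command: str) -> bool:
    """True if the shell command matches a known-destructive pattern."""
    return any(re.search(p, command) for p in DESTRUCTIVE_PATTERNS)
```

A string match like this is easy to fool with quoting and subshells, which is why the harness parses the command into an AST first.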

Set permanently in .oh/config.yaml: permissionMode: 'acceptEdits'

Hooks

Run shell scripts automatically at key session events by adding a hooks block to .oh/config.yaml:

hooks:
  - event: sessionStart
    command: "echo 'Session started' >> ~/.oh/session.log"

  - event: preToolUse
    command: "scripts/check-tool.sh"
    match: Bash   # optional: only trigger for this tool name

  - event: postToolUse
    command: "scripts/after-tool.sh"

  - event: sessionEnd
    command: "scripts/cleanup.sh"

Event types (27 total — matches Claude Code's stable surface):

Event                 When it fires                                                                        Can block?
sessionStart          Session begins
sessionEnd            Session ends
turnStart             Top-level agent turn begins (after user prompt accepted)
turnStop              Top-level agent turn ends (mirrors Claude Code's Stop)
userPromptSubmit      Before user prompt reaches the LLM                                                   yes — decision: deny
userPromptExpansion   Slash command produces an expanded prompt (audit trail)
preToolUse            Before each tool call                                                                yes — exit code 1 / decision: deny
postToolUse           After successful tool execution
postToolUseFailure    After tool throws or returns isError: true
postToolBatch         Once after a turn's full set of tool calls all resolve, before the next model call
permissionRequest     When a tool needs approval (between preToolUse and the prompt)                       yes — decision: allow|deny|ask
permissionDenied      When a tool call is denied (hook / user / headless / policy)
fileChanged           After a tool modifies a file
cwdChanged            After working directory changes
subagentStart         A sub-agent is spawned
subagentStop          A sub-agent completes
preCompact            Before conversation compaction
postCompact           After conversation compaction
configChange          .oh/config.yaml is modified during the session
notification          A notification is dispatched
taskCreated           TaskCreate persists a new task
taskCompleted         TaskUpdate transitions a task to completed
worktreeCreate        EnterWorktreeTool creates an isolated git worktree
worktreeRemove        ExitWorktreeTool removes a git worktree
elicitation           An MCP server requests user input via elicitation/create                             yes — decision: allow|deny
elicitationResult     After the elicitation response has been decided (audit trail)
instructionsLoaded    loadRulesAsPrompt rebuilt the system prompt with rules in scope

Set disableAllHooks: true in .oh/config.yaml to globally disable hook execution while keeping definitions on disk for auditability.

Live introspection: run /hooks in-session to see exactly which hooks are loaded, grouped by event.

Environment variables available to hook scripts:

Variable                            Description
OH_EVENT                            Event type (sessionStart, preToolUse, etc.)
OH_TOOL_NAME                        Name of the tool being called (tool events only)
OH_TOOL_ARGS                        JSON-encoded tool arguments (tool events only)
OH_TOOL_OUTPUT                      JSON-encoded tool output (postToolUse only)
OH_TOOL_INPUT_JSON                  Full JSON tool input (tool events only)
OH_SESSION_ID / OH_MODEL / OH_PROVIDER / OH_PERMISSION_MODE    Current session context
OH_COST / OH_TOKENS                 Running cost and token totals
OH_FILE_PATH                        Path that changed (fileChanged only)
OH_NEW_CWD                          New working directory (cwdChanged only)
OH_TURN_NUMBER / OH_TURN_REASON     Turn boundary context (turnStart / turnStop)

Use match to restrict a hook to a specific tool name (e.g., match: Bash only triggers for the Bash tool). Substring, glob (Cron*), and /regex/flags patterns are all supported.
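The three match forms can be sketched as one dispatch function (a hypothetical helper for illustration; the harness's exact precedence rules may differ):

```python
import fnmatch
import re

def hook_matches(pattern: str, tool_name: str) -> bool:
    # /regex/flags form, e.g. /^task/i
    m = re.fullmatch(r"/(.*)/([a-z]*)", pattern)
    if m:
        flags = re.IGNORECASE if "i" in m.group(2) else 0
        return re.search(m.group(1), tool_name, flags) is not None
    # glob form, e.g. Cron*
    if any(ch in pattern for ch in "*?["):
        return fnmatch.fnmatchcase(tool_name, pattern)
    # plain substring form
    return pattern in tool_name
```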

Set jsonIO: true on a command hook to opt into structured JSON I/O — the harness sends {event, ...context} on stdin and reads {decision, reason, hookSpecificOutput} from stdout. HTTP hooks accept the same response shape. See docs/hooks.md for the full reference.

Cybergotchi

OpenHarness ships with a Tamagotchi-style companion that lives in the side panel. It reacts to your session in real time — celebrating streaks, complaining when tools fail, and getting hungry if you ignore it.

Hatch one:

oh init        # wizard includes cybergotchi setup
/cybergotchi   # or hatch mid-session

Commands:

/cybergotchi feed      # +30 hunger
/cybergotchi pet       # +20 happiness
/cybergotchi rest      # +40 energy
/cybergotchi status    # show needs + lifetime stats
/cybergotchi rename    # give it a new name
/cybergotchi reset     # start over with a new species

Needs decay over time (hunger fastest, happiness slowest). Feed and pet your gotchi to keep it happy.

Evolution — your gotchi evolves based on lifetime milestones:

  • Stage 1 (✦ magenta): 10 sessions or 50 commits
  • Stage 2 (★ yellow + crown): 100 tasks completed or a 25-tool streak

18 species to choose from: duck, cat, owl, penguin, rabbit, turtle, snail, octopus, axolotl, cactus, mushroom, chonk, capybara, goose, and more.

MCP Servers

Connect any MCP (Model Context Protocol) server by editing .oh/config.yaml:

provider: anthropic
model: claude-sonnet-4-6
permissionMode: ask
mcpServers:
  - name: filesystem
    command: npx
    args: ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"]
  - name: github
    command: npx
    args: ["-y", "@modelcontextprotocol/server-github"]
    env:
      GITHUB_PERSONAL_ACCESS_TOKEN: ghp_...

MCP tools appear alongside built-in tools. /status shows connected servers.

MCP server prompts as slash commands — servers that expose prompts/list (e.g., GitHub, Sentry, Linear) get their prompts surfaced as /<server>:<prompt> slash commands automatically. Arguments use a key=value syntax with quoting:

/github:summarize-pr repo=acme/widget pr=42
/sentry:triage-issue issue=ABC-123 severity="high priority"

Required arguments declared by the prompt template surface a usage error if missing (no model call). Run /reload-plugins to re-discover prompts after editing your MCP config.

Remote MCP servers (HTTP / SSE)

mcpServers:
  - name: linear
    type: http
    url: https://mcp.linear.app/mcp
    headers:
      Authorization: "Bearer ${LINEAR_API_KEY}"

See docs/mcp-servers.md for the full reference. See docs/mcp-servers.md for OAuth 2.1 setup (auto-triggered on 401; /mcp-login and /mcp-logout commands available).

MCP Server Registry — browse and install from a curated catalog:

/mcp-registry              # browse all available servers
/mcp-registry github       # show install config for a specific server
/mcp-registry database     # search by category

Categories: filesystem, git, database, api, search, productivity, dev-tools, ai.

Git Integration

OpenHarness auto-commits AI edits in git repos:

oh: Edit src/app.ts                    # auto-committed with "oh:" prefix
oh: Write tests/app.test.ts
  • Every AI file change is committed automatically
  • /undo reverts the last AI commit (only OH commits, never yours)
  • /diff shows what changed
  • Your dirty files are safe — committed separately before AI edits

Checkpoints & Rewind

Every file modification is automatically checkpointed before execution. If something goes wrong:

/rewind           # restore files from the last checkpoint
/undo             # revert the last AI git commit

Checkpoints are stored in .oh/checkpoints/ and cover FileWrite, FileEdit, and Bash commands that modify files.

Verification Loops

After every file edit (Edit, Write, MultiEdit), openHarness automatically runs language-appropriate lint/typecheck commands and feeds the results back into the agent context. This is the single highest-impact harness engineering pattern — research shows 2-3x quality improvement from automated feedback.

Auto-detection — if your project has tsconfig.json, .eslintrc*, pyproject.toml, go.mod, or Cargo.toml, verification rules are detected automatically. No configuration needed.

Custom rules via .oh/config.yaml:

verification:
  enabled: true       # default: true (auto-detect)
  mode: warn          # 'warn' appends to output, 'block' marks as error
  rules:
    - extensions: [".ts", ".tsx"]
      lint: "npx tsc --noEmit 2>&1 | head -20"
      timeout: 15000
    - extensions: [".py"]
      lint: "ruff check {file} 2>&1 | head -10"

The agent sees [Verification passed] or [Verification FAILED] with the linter output after each edit, enabling self-correction.

Memory Consolidation

On session exit, openHarness automatically prunes stale memories using temporal decay:

  • Memories not accessed in 30+ days lose 0.1 relevance per 30-day period
  • Memories below 0.1 relevance are permanently deleted
  • Updated relevance scores are persisted to memory files

This keeps the memory system lean and relevant. Configure in .oh/config.yaml:

memory:
  consolidateOnExit: true   # default: true

Scheduled Tasks (Cron)

Create recurring tasks that run automatically in the background:

# Via slash commands
/cron list                    # show all scheduled tasks
/cron create "check-tests"    # create a new task (interactive)
/cron delete <id>             # remove a task

Schedule syntax: every 5m, every 2h, every 1d

The cron executor checks every 60 seconds for due tasks and runs them via sub-queries. Results are stored in ~/.oh/crons/history/.
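The schedule syntax parses into an interval in seconds; a minimal sketch (hypothetical helper, supporting only the three documented forms):

```python
import re

UNIT_SECONDS = {"m": 60, "h": 3600, "d": 86400}

def parse_schedule(spec: str) -> int:
    """Parse 'every 5m' / 'every 2h' / 'every 1d' into seconds."""
    m = re.fullmatch(r"every\s+(\d+)([mhd])", spec.strip())
    if not m:
        raise ValueError(f"unsupported schedule: {spec!r}")
    return int(m.group(1)) * UNIT_SECONDS[m.group(2)]
```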

Agent Roles

Dispatch specialized sub-agents for focused tasks:

/roles            # list all available roles

Role              Description                                                                Tools
code-reviewer     Find bugs, security issues, style problems                                 Read-only
test-writer       Generate unit and integration tests                                        Read + Write
docs-writer       Write documentation and comments                                           Read + Write + Edit
debugger          Systematic bug investigation                                               Read-only + Bash
refactorer        Simplify code without changing behavior                                    All file tools + Bash
security-auditor  OWASP, injection, secrets, CVE scanning                                    Read-only + Bash
evaluator         Evaluate code quality and run tests (read-only)                            Read-only + Bash + Diagnostics
planner           Design step-by-step implementation plans                                   Read-only + Bash
architect         Analyze architecture and design structural changes (hands off to editor)   Read-only
editor            Apply an architect's plan as code edits, no re-planning                    Read + Edit + Write + MultiEdit + Bash
migrator          Systematic codebase migrations and upgrades                                All file tools + Bash

Each role restricts the sub-agent to only its suggested tools. You can also pass allowed_tools explicitly:

Agent({ subagent_type: 'evaluator', prompt: 'Run all tests and report results' })
Agent({ allowed_tools: ['Read', 'Grep'], prompt: 'Search for all TODO comments' })

Architect → Editor (cost-saving multi-file edits)

For larger changes that span multiple files, dispatch a two-pass architect → editor workflow. The architect (powerful model) reads the codebase and outputs a structured plan; the editor (fast model) applies it mechanically without re-planning. When modelRouter is configured, OH automatically routes the architect role to your powerful tier and the editor role to your fast tier — typical cost reduction is 30-50% on multi-file edits versus running both passes on the powerful model.

Agent({ subagent_type: 'architect', prompt: 'Plan a migration from option A to option B across src/' })
# Hand the resulting plan to:
Agent({ subagent_type: 'editor', prompt: '<paste plan>' })

Sub-agent permission isolation

Each Agent call accepts a permission_mode override that narrows the parent's permission mode (never loosens it). Useful when running in trust and you want a subagent's review/audit pass to stay strictly read-only:

Agent({ subagent_type: 'code-reviewer', prompt: '...', permission_mode: 'plan' })
Agent({ subagent_type: 'security-auditor', prompt: '...', permission_mode: 'deny' })

If a less-restrictive mode is requested (e.g. parent is ask, subagent requests trust), the harness silently clamps to the parent — a model can never use a sub-agent to escape user-approval gates.
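The clamp can be sketched with an explicit restrictiveness ordering (the ordering below is this sketch's assumption, inferred from the Permission Modes table):

```python
# Least to most permissive -- an illustrative ordering, not the harness's source.
ORDER = ["deny", "plan", "ask", "acceptEdits", "auto", "trust", "bypassPermissions"]

def clamp(parent: str, requested: str) -> str:
    """A sub-agent may narrow the parent's permission mode, never loosen it."""
    if ORDER.index(requested) > ORDER.index(parent):
        return parent  # silently clamp to the parent's mode
    return requested
```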

Read-only roles default to plan automatically. code-reviewer, evaluator, security-auditor, architect, and planner ship with permissionMode: 'plan' — spawn them under any parent and they're statically read-only, no permission_mode override needed. Markdown-defined agents in .oh/agents/*.md can set their own default with permissionMode: plan (or permission-mode: plan) frontmatter.

Headless Mode

Run a single prompt without interactive UI — perfect for CI/CD and scripting:

# Chat command with -p flag (recommended)
oh -p "fix the failing tests" --model ollama/llama3 --trust
oh -p "review src/query.ts" --auto --output-format json

# Run command (alternative)
oh run "fix the failing tests" --model ollama/llama3 --trust
oh run "add error handling to api.ts" --json    # JSON output

# Pipe stdin
cat error.log | oh run "what's wrong here?"
git diff | oh run "review these changes"

# Hard cap on session cost — agent halts at the threshold with reason: "budget_exceeded"
oh run "review the diff" --model claude-sonnet-4-6 --max-budget-usd 0.50
oh session --model gpt-4o --max-budget-usd 5

CLI flags for CI / SDK use

--bare
    Skip optional startup work (project detection, plugins, memory, skills, MCP). System prompt is just the tool-use baseline. Faster startup on repos with many CLAUDE.md / RULES.md files.
--debug [categories]
    Enable categorized debug logs. --debug alone enables all; --debug mcp,hooks filters. Falls back to the OH_DEBUG env var.
--debug-file <path>
    Append debug lines to a file instead of stderr. Falls back to OH_DEBUG_FILE.
--mcp-config <path>
    Load MCP servers from an external JSON file (in addition to .oh/config.yaml).
--strict-mcp-config
    With --mcp-config, ignore .oh/config.yaml MCP servers entirely.
--system-prompt-file <path> / --append-system-prompt-file <path>
    File-path variants of --system-prompt / --append-system-prompt.
--no-session-persistence
    Skip writing the session record to ~/.oh/sessions/ for ephemeral CI runs.
--fallback-model <model>
    Fallback used when the primary fails with a retriable error. REPLACES .oh/config.yaml fallbackProviders for this run.
--permission-prompt-tool <mcp_tool>
    Delegate tool-permission decisions to a configured MCP tool (e.g. mcp__myperm__check).
--init / --init-only
    Run the interactive setup wizard before / instead of the command.

All flags work on both oh run and oh session. See oh run --help and oh session --help for the full surface.

Structured output with --json-schema

Constrain the model's output to a JSON Schema. Useful for CI scripts that parse model output programmatically without regex heuristics:

oh -p "output {\"ok\": true, \"count\": 3} as JSON" \
  --trust \
  --json-schema '{"type":"object","properties":{"ok":{"type":"boolean"},"count":{"type":"integer"}},"required":["ok","count"]}'

Behavior:

  • stdout: the validated JSON (single line), only when it passes the schema.
  • stderr: structured errors on failure, plus the raw model output for debugging.
  • Exit codes: 0 valid, 2 malformed schema, 3 model output was not JSON, 4 JSON didn't match the schema.

Supported keywords: type, properties, required, items, enum. For richer validation, pipe the output through a dedicated validator.
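A minimal validator for exactly that keyword subset might look like this (a sketch, not openHarness's implementation):

```python
TYPES = {"object": dict, "array": list, "string": str,
         "integer": int, "number": (int, float), "boolean": bool}

def validate(schema: dict, value) -> bool:
    """Check value against the type/properties/required/items/enum subset."""
    t = schema.get("type")
    if t == "integer" and isinstance(value, bool):
        return False  # bool is an int subclass in Python; reject explicitly
    if t and not isinstance(value, TYPES[t]):
        return False
    if "enum" in schema and value not in schema["enum"]:
        return False
    if isinstance(value, dict):
        if any(k not in value for k in schema.get("required", [])):
            return False
        if any(k in value and not validate(s, value[k])
               for k, s in schema.get("properties", {}).items()):
            return False
    if isinstance(value, list) and "items" in schema:
        return all(validate(schema["items"], v) for v in value)
    return True
```

For anything beyond this subset (formats, bounds, composition keywords), pipe the output through a dedicated validator as noted above.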

GitHub Action for PR Review

OpenHarness includes a built-in GitHub Action for automated code review:

# .github/workflows/ai-review.yml
on:
  pull_request:
    types: [opened, synchronize]

jobs:
  review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - uses: ./.github/actions/review
        with:
          model: 'claude-sonnet-4-6'
          anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}

Exit code 0 on success, 1 on failure.

Providers

# Local (free, no API key needed)
oh --model ollama/llama3
oh --model ollama/qwen2.5:7b

# Cloud
OPENAI_API_KEY=sk-... oh --model gpt-4o
ANTHROPIC_API_KEY=sk-ant-... oh --model claude-sonnet-4-6
OPENROUTER_API_KEY=sk-or-... oh --model openrouter/meta-llama/llama-3-70b

# llama.cpp / GGUF
oh --model llamacpp/my-model

# LM Studio
oh --model lmstudio/my-model

llama.cpp / GGUF (local, no Ollama needed)

Run GGUF models directly via llama-server, without the overhead of Ollama. Often faster for large models.

Prerequisites:

Start llama-server:

llama-server --model ./your-model.gguf --port 8080 --alias my-model

Configure via oh init:

  • Run oh init and select "llama.cpp / GGUF" when prompted

Or configure manually in .oh/config.yaml:

provider: llamacpp
model: my-model
baseUrl: http://localhost:8080
permissionMode: ask

Run:

oh
oh --model llamacpp/my-model
oh models                    # list available models

ACP (Agent Client Protocol)

Speak Agent Client Protocol over stdin/stdout so editors that support ACP — Zed, JetBrains via the ACP plugin, Cline, OpenCode — can drive openHarness as the underlying agent. No bespoke IDE extension required:

oh acp                                          # uses provider/model from .oh/config.yaml
oh acp --provider anthropic --model claude-sonnet-4-6

Configure your editor's ACP integration to launch oh acp as the agent command. The session-update events (text chunks, tool calls, tool results) are translated automatically from openHarness's stream protocol; permission prompts currently use openHarness's own flow rather than the ACP requestPermission path (filed for follow-up). The @agentclientprotocol/sdk package is an optionalDependency — if it didn't install, oh acp exits with a clear install hint rather than silently failing.

Auth

Provider-agnostic credential management. Local LLMs (Ollama / llama.cpp / LM Studio) need no auth — configure them via oh init.

oh auth login [provider] [--key <value>]   # store API key for a provider
oh auth logout [provider]                   # clear stored API key
oh auth status                              # show stored providers + env-var overrides

[provider] defaults to your configured default provider. --key supplies the value inline; otherwise OH prompts (TTY) or reads from stdin (piped).

Script-based key resolution (apiKeyHelper)

Avoid storing keys in plaintext / the encrypted store by plugging in a helper script (1Password, pass, vault, cloud secret manager). The configured command runs at credential-fetch time with OH_PROVIDER set, and its trimmed stdout becomes the key.

# .oh/config.yaml
apiKeyHelper: 'op read "op://Personal/Anthropic/key"'

Resolution priority: env var → encrypted store → apiKeyHelper → legacy plaintext config.

Update

oh update                    # detects how OH was installed (npm-global / npx / local clone) and prints the right upgrade command

Configuration Hierarchy

Config is loaded in layers (later overrides earlier):

  1. Global ~/.oh/config.yaml — default provider, model, theme for all projects
  2. Project .oh/config.yaml — project-specific settings
  3. Local .oh/config.local.yaml — personal overrides (gitignored)
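Conceptually the layering is a key-by-key merge where later layers win (a sketch; whether nested maps merge deeply or replace wholesale is not specified here):

```python
def load_config(*layers: dict) -> dict:
    """Merge config layers; later layers override earlier ones key by key."""
    merged: dict = {}
    for layer in layers:
        merged.update(layer)  # shallow merge sketch
    return merged
```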

Set your default provider once globally:

# ~/.oh/config.yaml
provider: ollama
model: llama3
permissionMode: ask
theme: dark
language: zh-CN        # optional — respond in this language (code stays as-is)
outputStyle: default   # optional — "default", "explanatory", "learning", or a custom name

Then per-project configs only need what's different:

# .oh/config.yaml
model: codellama   # override just the model

Output Styles

Swap the agent's personality without touching its core instructions. Built-ins:

  • default — standard software engineering assistant (no preface)
  • explanatory — adds an ## Insights section after each task explaining why the agent made its choices
  • learning — leaves 1–3 TODO(human) markers at strategic points so you write the instructive parts yourself

Create your own styles as markdown files with YAML frontmatter. Save to .oh/output-styles/<name>.md (project) or ~/.oh/output-styles/<name>.md (user). Project shadows user, which shadows built-in.

---
name: code-review
description: Focused code review mode
---

Review rigorously. For every function, ask: is the logic correct, is error handling complete, are there edge cases ignored?

Activate with outputStyle: code-review in .oh/config.yaml.

Project Rules

Create .oh/RULES.md in any repo (or run oh init):

- Always run tests after changes
- Use strict TypeScript
- Never commit to main directly

Rules load automatically into every session.

openHarness also reads any of the following project-instruction files if present (additive, parent-first):

  • CLAUDE.md (Anthropic convention) — and hierarchical CLAUDE.md from parent dirs, plus ~/.claude/CLAUDE.md for user-global
  • AGENTS.md (agents.md cross-tool standard, used by Codex / Cursor / Copilot / Cline / Aider) — same parent-first walk
  • CLAUDE.local.md (gitignored personal overrides)

If a repo has AGENTS.md already configured for another agent, openHarness picks it up unchanged — no migration step needed.

Skills & Plugins

Skills

Skills are markdown files with YAML frontmatter that add reusable behaviors:

---
name: deploy
description: Deploy the application to production
trigger: deploy
tools: [Bash, Read]
---

Run the deploy script with health checks...

Locations (searched in order):

  1. .oh/skills/ — project-level skills
  2. ~/.oh/skills/ — global skills (available in all projects)

Skills auto-trigger when the user's message contains the trigger keyword, or can be invoked explicitly with /skill deploy.
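The trigger mechanics can be sketched as below. The matching rules here (whole-word, case-insensitive, /skill for explicit invocation) are an illustrative reading of the behavior described above, not the exact OpenHarness implementation:

```typescript
// A skill's frontmatter, already parsed into an object
// (field names match the deploy example above).
interface Skill {
  name: string;
  trigger: string;
}

// Return the skills a message activates: either an explicit /skill <name>
// invocation, or any skill whose trigger keyword appears as a whole word.
function matchSkills(message: string, skills: Skill[]): Skill[] {
  const explicit = message.match(/^\/skill\s+(\S+)/);
  if (explicit) return skills.filter((s) => s.name === explicit[1]);
  const words = message.toLowerCase().split(/\W+/);
  return skills.filter((s) => words.includes(s.trigger.toLowerCase()));
}
```

Whole-word matching means "please deploy to prod" triggers the deploy skill while "redeploy everything" does not.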

Plugins

Plugins are npm packages that bundle skills, hooks, and MCP servers:

{
  "name": "my-openharness-plugin",
  "version": "1.0.0",
  "skills": ["skills/deploy.md", "skills/review.md"],
  "hooks": {
    "sessionStart": "scripts/setup.sh"
  },
  "mcpServers": [
    { "name": "my-api", "command": "npx", "args": ["-y", "@my-org/mcp-server"] }
  ]
}

Save as openharness-plugin.json in your npm package root. Install with npm install, and OpenHarness discovers it automatically from node_modules/.
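The discovery step can be sketched as follows. This is a hypothetical illustration of scanning installed packages for a manifest; the file reader is injected so the logic stays self-contained, whereas the real implementation reads node_modules/ directly:

```typescript
// Minimal view of the manifest shape shown above.
interface PluginManifest {
  name: string;
  skills?: string[];
}

// Scan each installed package directory for an openharness-plugin.json.
// readManifest returns the file's contents, or null if the package has none.
function discoverPlugins(
  packageDirs: string[],
  readManifest: (dir: string) => string | null,
): PluginManifest[] {
  const plugins: PluginManifest[] = [];
  for (const dir of packageDirs) {
    const raw = readManifest(dir);
    if (raw === null) continue; // not a plugin package
    try {
      plugins.push(JSON.parse(raw) as PluginManifest);
    } catch {
      // A malformed manifest is skipped rather than crashing startup.
    }
  }
  return plugins;
}
```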

Evals

oh evals runs SWE-bench-Lite-compatible evaluations against any provider, locally, with mandatory cost caps. Useful for measuring real-world bug-fix performance instead of synthetic benchmarks.

# Run a custom pack with a $5 total cap, 2 parallel agents
oh evals run my-pack --max-cost-usd 5 --concurrency 2

# Run a specific instance
oh evals run my-pack --max-cost-usd 1 --instance django__django-11551

# Random sample of 3
oh evals run my-pack --max-cost-usd 2 --sample 3

# Resume a partial run that hit its cost cap
oh evals run my-pack --max-cost-usd 10 --resume 2026-05-05T14-30-00

# List installed packs
oh evals list-packs

# Show summary of a past run
oh evals show 2026-05-05T14-30-00

Output lives at ~/.oh/evals/runs/<run-id>/:

  • results.json — full per-task data with cost, turns, duration, tests_status, error_message.
  • predictions.json — submittable to the SWE-bench leaderboard at https://www.swebench.com/.
  • transcripts/<instance_id>.jsonl — verbatim subprocess stream-json output per task.

A pluggable pack contract (pack.json + instances.jsonl + fixtures/<id>/) lets you author packs against any test suite. The scripts/build-evals-pack.mjs helper bakes a SWE-bench-Lite-compatible repo at a given base_commit into a fixture; see CONTRIBUTING.md.

A bundled swe-bench-lite-mini pack (10 cherry-picked instances, ready to run out of the box) ships in v2.40.2.
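The mandatory cost cap amounts to a budget check between instances: once accumulated spend reaches --max-cost-usd, the remaining instances are left for a later --resume run. A simplified sketch with hypothetical names (the real runner also handles concurrency, sampling, and transcripts):

```typescript
interface EvalResult {
  instanceId: string;
  costUsd: number;
}

// Run instances in order until the budget is exhausted; anything not yet
// started is returned so a --resume run can pick it up.
function runWithCostCap(
  instances: string[],
  maxCostUsd: number,
  runOne: (id: string) => EvalResult,
): { results: EvalResult[]; remaining: string[] } {
  const results: EvalResult[] = [];
  let spent = 0;
  for (let i = 0; i < instances.length; i++) {
    if (spent >= maxCostUsd) {
      return { results, remaining: instances.slice(i) }; // cap hit
    }
    const r = runOne(instances[i]);
    results.push(r);
    spent += r.costUsd;
  }
  return { results, remaining: [] };
}
```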

How It Works

graph LR
    User[User Input] --> REPL[REPL Loop]
    REPL --> Query[Query Engine]
    Query --> Provider[LLM Provider]
    Provider --> LLM[Ollama / OpenAI / Anthropic]
    LLM --> Tools[Tool Execution]
    Tools --> Permissions{Permission Check}
    Permissions -->|Approved| Execute[Run Tool]
    Permissions -->|Blocked| Deny[Deny & Report]
    Execute --> Response[Stream Response]
    Response --> REPL

FAQ

Does it work offline? Yes. Use Ollama with a local model — no internet or API key needed.

How much does it cost? Free. OpenHarness is MIT licensed. You bring your own API key (BYOK) for cloud models, or use Ollama for free.

Is it safe? Yes. 7 permission modes control what tools can do. Bash commands are analyzed by an AST parser that blocks destructive patterns (rm -rf, curl | bash, etc.). Every file change is checkpointed and reversible with /rewind.

Can I use it in CI/CD? Yes. Use oh -p "prompt" --auto for headless execution, or the built-in GitHub Action for PR reviews.

Does it support my language/framework? Yes. OpenHarness is language-agnostic — it reads, writes, and executes code in any language. Syntax highlighting covers 20+ languages.

How does it compare to Claude Code? ~95% feature parity for CLI use cases. Main advantage: works with ANY LLM (not just Anthropic) and is MIT-licensed. See Why OpenHarness? above.

Install

Requires Node.js 18+.

# From npm
npm install -g @zhijiewang/openharness

# From source
git clone https://github.com/zhijiewong/openharness.git
cd openharness
npm install && npm install -g .

Development

npm install
npx tsx src/main.tsx              # run in dev mode
npx tsc --noEmit                  # type check
npm test                          # run tests

Adding a tool

Create src/tools/YourTool/index.ts implementing the Tool interface with a Zod input schema, then register it in src/tools.ts.
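A hedged sketch of the general shape such a tool might take. The real Tool interface and its Zod schema live in the OpenHarness source; the hand-rolled validator below only stands in for Zod so the snippet has no dependencies, and run() is synchronous here although real tools are likely async:

```typescript
interface EchoInput {
  text: string;
}

// Illustrative stand-in for the project's Tool interface.
interface Tool<I> {
  name: string;
  description: string;
  validate: (raw: unknown) => I; // the real project uses a Zod input schema
  run: (input: I) => string;
}

const echoTool: Tool<EchoInput> = {
  name: "Echo",
  description: "Echo the validated input text back to the agent",
  validate: (raw) => {
    const candidate = raw as Partial<EchoInput>;
    if (typeof candidate?.text !== "string") {
      throw new Error("text must be a string");
    }
    return { text: candidate.text };
  },
  run: (input) => input.text,
};
```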

Adding a provider

Create src/providers/yourprovider.ts implementing the Provider interface, then add a case in src/providers/index.ts.
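A hypothetical sketch of that registry pattern: each provider implements a common interface, and a switch picks one by the model prefix (as in ollama/qwen2.5:7b). The interface members here are illustrative, not the real Provider API:

```typescript
interface Provider {
  name: string;
  complete: (prompt: string) => string;
}

// A toy provider standing in for Ollama / OpenAI / Anthropic.
const mockProvider: Provider = {
  name: "mock",
  complete: (prompt) => `mock reply to: ${prompt}`,
};

// Dispatch on the model-string prefix, mirroring the "add a case" step above.
function providerFor(model: string): Provider {
  switch (model.split("/")[0]) {
    case "mock":
      return mockProvider;
    default:
      throw new Error(`unknown provider for model ${model}`);
  }
}
```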

Contributing

See CONTRIBUTING.md.

Community

Join the OpenHarness community to get help, share your workflows, and discuss the future of AI coding agents!

  • 🟣 Discord: Join our Discord Server to chat with developers and get real-time support.
  • 🔵 Feishu / Lark: scan the QR code below to collaborate with the community.

Feishu Group QR Code

  • 🟢 WeChat: scan the QR code below to join our WeChat group.

WeChat Group QR Code

License

MIT