Write-enabled task

April 22, 2026 ยท View on GitHub

Claw Code Agent logo

Claw Code Agent

A Python reimplementation of the Claude Code agent architecture โ€” local models, full control, zero dependencies.

Python 3.10+ GitHub vLLM Qwen3-Coder Zero Dependencies Alpha License


๐Ÿ“ข What's New

April 2026 โ€” Major Update

FeatureDetails
๐Ÿ†•Interactive Chat ModeNew agent-chat command โ€” multi-turn REPL with /exit to quit
๐Ÿ†•Streaming OutputToken-by-token streaming with --stream flag
๐Ÿ†•Plugin RuntimeFull manifest-based plugin system โ€” hooks, tool aliases, virtual tools, tool blocking
๐Ÿ†•Nested Agent DelegationDelegate subtasks to child agents with dependency-aware topological batching
๐Ÿ†•Agent ManagerLineage tracking, group membership, batch summaries for nested agents
๐Ÿ†•Custom Agent ProfilesDiscover local markdown-defined agents from ~/.claude/agents and ./.claude/agents and use them through the Agent tool
๐Ÿ†•Cost Tracking & BudgetsToken budgets, cost budgets, tool-call limits, model-call limits, session-turn limits
๐Ÿ†•Structured OutputJSON schema response mode with --response-schema-file
๐Ÿ†•Context CompactionAuto-snip, auto-compact, and reactive compaction on prompt-too-long errors
๐Ÿ†•File History ReplayJournaling of file edits with snapshot IDs, replay summaries on session resume
๐Ÿ†•Truncation ContinuationAutomatic continuation when model response is cut off (finish_reason=length)
๐Ÿ†•Ollama SupportWorks out of the box with Ollama's OpenAI-compatible API
๐Ÿ†•LiteLLM Proxy SupportRoute through LiteLLM Proxy to any provider
๐Ÿ†•OpenRouter SupportCloud API gateway โ€” access OpenAI, Anthropic, Google models via one endpoint
๐Ÿ†•Query EngineRuntime event counters, transcript summaries, orchestration reports
๐Ÿ†•Remote RuntimeManifest-backed local remote profiles, connect/disconnect state, and remote CLI/slash flows
๐Ÿ†•Hook & Policy RuntimeLocal .claw-policy.json / hook manifests with trust reporting, safe env, tool blocking, and budget overrides
๐Ÿ†•Task & Plan RuntimePersistent local tasks and plans with plan-to-task sync and dependency-aware task execution
๐Ÿ†•MCP TransportReal stdio MCP transport for initialize, resource listing/reading, and tool listing/calling
๐Ÿ†•Search RuntimeProvider-backed web_search with local manifests, activation state, and /search flows
๐Ÿ†•Config & Account RuntimeLocal config/settings mutation plus manifest-backed account profiles and login/logout state
๐Ÿ†•Ask-User RuntimeQueued or interactive local ask-user flow with history, slash commands, and agent tool support
๐Ÿ†•Team RuntimePersisted local teams and message history with team/message tools and slash/CLI inspection
๐Ÿ†•Notebook Edit ToolNative .ipynb cell editing through the real agent tool registry
๐Ÿ†•Workflow RuntimeManifest-backed local workflows with workflow tools, slash commands, and run history
๐Ÿ†•Remote Trigger RuntimeLocal remote triggers with create/update/run flows similar to the npm remote trigger surface
๐Ÿ†•Worktree RuntimeManaged git worktrees with mid-session cwd switching, slash commands, and CLI flows
๐Ÿ†•Tokenizer-Aware ContextCached tokenizer backends with heuristic fallback for /context, /status, and compaction
๐Ÿ†•Prompt Budget PreflightPreflight prompt-length validation, token-budget reporting, and auto-compact/context collapse before backend failures
๐Ÿ†•LSP RuntimeLocal LSP-style code intelligence for definitions, references, hover, symbols, call hierarchy, and diagnostics
๐Ÿ†•Local Web GUIBrowser-based chat UI via python -m src.gui โ€” modern dark theme, slash command palette, session browser, settings panel
๐Ÿ†•Pasted-Content RefsPastes โ‰ฅ500 chars into the GUI composer collapse to [Pasted text #N +M lines] chips and re-expand server-side before the agent runs
๐Ÿ†•GUI Runtime KnobsSettings panel exposes temperature, per-turn timeout, streaming toggle, and max-turns โ€” all round-tripped live through /api/state
๐Ÿ†•GUI Budgets & LimitsAdvanced settings disclosure for every BudgetConfig field: cost ceiling, token budgets, tool/model call caps, delegated task cap, session turn cap โ€” blank input clears the limit
๐Ÿ†•GUI System Prompt & SchemaCustom / append / override system prompts and a structured-output JSON schema editor (with strict toggle) live-editable in the settings panel
๐Ÿ†•GUI Context ManagementAuto-snip / auto-compact thresholds, compact-preserve count, CLAUDE.md discovery toggle, and additional working directories โ€” all editable from the settings panel and the new --auto-snip-threshold / --auto-compact-threshold / --add-dir flags
๐Ÿ†•GUI Tasks ViewBrowse, create, start, complete, and cancel local tasks from a new Tasks tab; mutations call straight into TaskRuntime so completing a task auto-unblocks dependents just like the slash-command path
๐Ÿ†•GUI Plan ViewEdit the local porting plan (steps + explanation + per-step status/priority) from a new Plan tab; saves go through PlanRuntime.update_plan and optionally sync to the task list
๐Ÿ†•GUI Memory ViewBrowse, edit, create, and delete the discovered CLAUDE.md / .claude/rules/*.md memory files from a new Memory tab; writes are sandboxed to the workspace + ~/.claude
๐Ÿ†•GUI File History ViewNew History tab aggregates file_history entries from every saved session (newest first) โ€” one row per shell run / file edit / nested agent call with snapshot ids and changed paths
๐Ÿ†•GUI Background SessionsNew Background tab lists detached agent-bg runs (running/exited/completed/failed), shows live logs, and lets you kill a running session โ€” same BackgroundSessionRuntime the CLI uses
๐Ÿ†•GUI Worktree ViewNew Worktree tab โ€” show status & history, create a managed git worktree (auto-switches the agent's cwd), and exit it (keep or remove); state survives reload via WorktreeRuntime
๐Ÿ†•GUI Skills MarketplaceNew Skills tab โ€” card grid of every bundled skill with description, when-to-use, aliases, and allowed tools; "Use in chat" button drops the invocation into the composer
๐Ÿ†•GUI Accounts ViewNew Accounts tab โ€” discover profiles from .claude/account.json, log in by name or with an ephemeral identity, view login/logout history; persists into AccountRuntime state
๐Ÿ†•GUI Remote ProfilesNew Remote tab โ€” discover remote/SSH/teleport/direct-connect/deep-link profiles from .claw-remote.json etc., connect by name or ephemeral target, view connect/disconnect history
๐Ÿ†•GUI MCP ServersNew MCP tab โ€” list discovered servers/resources/tools from .claw-mcp.json/.mcp.json, read inline + stdio resources, call tools with custom JSON args; "Probe stdio servers" toggle controls subprocess cost
๐Ÿ†•GUI Plugins ViewNew Plugins tab โ€” list manifests from .claw-plugin/plugin.json, .codex-plugin/plugin.json, and plugins/*/plugin.json with their tools, virtual tools, aliases, blocks, and lifecycle hooks
๐Ÿ†•GUI Ask-User QueueNew Ask tab โ€” preload answers (exact or contains match), browse the queue and history, and clear past entries; the agent's Ask tool consumes them straight from .port_sessions/ask_user_runtime.json
๐Ÿ†•GUI Workflows ViewNew Workflows tab โ€” list discovered workflow definitions from .claw-workflows.json, trigger a recorded run with custom JSON arguments, browse run history
๐Ÿ†•GUI Search ViewNew Search tab โ€” discover providers from .claw-search.json/.claude/search.json, activate one, and run live SearXNG/Brave/Tavily queries straight from the browser
๐Ÿ†•GUI Remote TriggersNew Triggers tab โ€” list/create/run remote triggers (manifest-defined or local), record run history; mirrors RemoteTriggerRuntime exactly
๐Ÿ†•GUI Teams ViewNew Teams tab โ€” create teams with members, send messages between them, view full message history; persisted via TeamRuntime
๐Ÿ†•GUI Diagnostics TabNew Diag tab โ€” render the existing markdown reports (summary, manifest, parity-audit, setup-report, command-graph, tool-pool, bootstrap-graph) on demand without shelling out
๐Ÿ†•Daemon CommandsLocal daemon start/ps/logs/attach/kill wrapper over background agent sessions
๐Ÿ†•Background SessionsLocal agent-bg, agent-ps, agent-logs, agent-attach, and agent-kill flows
๐Ÿ†•Testing GuideComprehensive TESTING_GUIDE.md with commands for every feature
๐Ÿ†•Parity ChecklistFull PARITY_CHECKLIST.md tracking implementation status vs npm source

๐Ÿ“– About

This repository reimplements the Claude Code npm agent architecture entirely in Python, designed to run with local open-source models via an OpenAI-compatible API server.

Built on the public porting workspace from instructkr/claw-code, the active development lives at HarnessLab/claw-code-agent.

Goal: Not to ship the original npm source, but to reimplement the full agent flow in Python โ€” prompt assembly, context building, slash commands, tool calling, session persistence, and local model execution.

Zero external dependencies โ€” just Python's standard library.

Claw Code Agent demo


โœจ Key Features

FeatureDescription
๐Ÿค– Agent LoopFull agentic coding loop with tool calling and iterative reasoning
๐Ÿ’ฌ Interactive ChatMulti-turn REPL via agent-chat with session continuity
๐Ÿ–ฅ๏ธ Local Web GUIBrowser-based chat UI launched with python -m src.gui โ€” sessions browser, slash command palette, live settings
๐Ÿงฐ Core ToolsFile read / write / edit, glob search, grep search, shell execution
๐Ÿ”Œ Plugin RuntimeManifest-based plugins with hooks, aliases, virtual tools, and tool blocking
๐Ÿช† Nested DelegationDelegate subtasks to child agents with dependency-aware topological batching
๐Ÿงฉ Custom AgentsLoad local agent profiles from ~/.claude/agents and ./.claude/agents, inspect them via /agents, and delegate with subagent_type
๐Ÿ“ก StreamingToken-by-token streaming output with --stream
๐Ÿ’ฌ Slash CommandsLocal commands for context, config, account, search, MCP, remote, tasks, plan, hooks, and model control
๐ŸŒ Remote RuntimeManifest-backed remote profiles with local remote-mode, ssh-mode, teleport-mode, and connect/disconnect state
๐Ÿงญ Task & Plan RuntimePersistent tasks and plans with sync, next-task selection, and blocked/unblocked state
๐Ÿ›ฐ๏ธ MCP RuntimeLocal MCP manifests plus real stdio MCP transport for resources and tools
๐Ÿ”Ž Search RuntimeProvider-backed web_search plus provider activation and status reporting
โš™๏ธ Config & Account RuntimeLocal config mutation, settings inspection, account profiles, and login/logout state
๐Ÿ™‹ Ask-User RuntimeQueued answer or interactive user-question flow with history tracking
๐Ÿ‘ฅ Team RuntimePersisted local teams plus message history, handoff notes, and collaboration metadata
๐Ÿ““ Notebook EditingNative Jupyter notebook cell editing through notebook_edit
๐Ÿชต Worktree RuntimeManaged git worktrees with worktree_enter, worktree_exit, and live cwd switching
๐Ÿงญ Workflow RuntimeManifest-backed workflows with slash commands, CLI inspection, and recorded runs
โฐ Remote TriggersLocal remote triggers with create/update/run flows and npm-style trigger actions
๐Ÿช Hook & Policy RuntimeTrust reporting, safe env, managed settings, tool blocking, and budget overrides
๐Ÿง  LSP Code IntelligenceLocal LSP-style definitions, references, hover, symbols, diagnostics, and call hierarchy
๐Ÿง  Context EngineAutomatic context building with CLAUDE.md discovery, compaction, and snipping
๐Ÿ”ข Tokenizer-Aware AccountingModel-aware token counting with cached tokenizer backends and fallback heuristics
๐Ÿ“ Prompt BudgetingSoft/hard prompt-window checks, token-budget reports, and preflight context collapse
๐Ÿ”„ Session PersistenceSave and resume agent sessions with file-history replay
๐Ÿ—‚๏ธ Background Sessionsagent-bg and local daemon wrappers for background runs, logs, attach, and kill
๐Ÿ’ฐ Cost & Budget ControlToken budgets, cost limits, tool-call caps, model-call caps
๐Ÿ“‹ Structured OutputJSON schema response mode for programmatic use
๐Ÿ” Permission SystemGranular control: --allow-write, --allow-shell, --unsafe
๐Ÿ—๏ธ OpenAI-CompatibleWorks with vLLM, Ollama, LiteLLM Proxy, OpenRouter โ€” any OpenAI-compatible API
๐Ÿ‰ Qwen3-CoderFirst-class support for Qwen3-Coder-30B-A3B-Instruct via vLLM
๐Ÿ“ฆ Zero DependenciesPure Python standard library โ€” nothing to install

๐Ÿ“‹ Roadmap

๐Ÿ“š Documentation

DocumentDescription
TESTING_GUIDE.mdStep-by-step commands to verify every feature
PARITY_CHECKLIST.mdFull implementation status vs the npm source

โœ… Done

  • Python CLI agent loop
  • Interactive chat mode (agent-chat) with multi-turn REPL
  • OpenAI-compatible local model backend
  • Qwen3-Coder support through vLLM with qwen3_xml tool parser
  • Ollama, LiteLLM Proxy, and OpenRouter backends
  • Core tools: list_dir, read_file, write_file, edit_file, glob_search, grep_search, bash
  • Context building and /context-style usage reporting
  • Slash commands: /help, /context, /context-raw, /token-budget, /prompt, /permissions, /model, /tools, /agents, /memory, /status, /clear
  • Session persistence and agent-resume flow
  • Permission system (read-only, write, shell, unsafe tiers)
  • Streaming token-by-token assistant output
  • Truncated-response continuation flow
  • Auto-snip and auto-compact context reduction
  • Reactive compaction retry on prompt-too-long errors
  • Preflight prompt-length validation and token-budget reporting
  • Preflight auto-compact/context collapse before backend prompt-too-long failures
  • Cost tracking and usage budget enforcement
  • Token, tool-call, model-call, and session-turn budgets
  • Structured output / JSON schema response mode
  • File history journaling with snapshot IDs and replay summaries
  • Nested agent delegation with dependency-aware topological batching
  • Agent manager with lineage tracking and group membership
  • Filesystem-backed custom agent profiles with built-in/user/project precedence
  • Local custom-agent create/update/delete flows via CLI and /agents
  • Local daemon-style background command family
  • Local background session workflows: agent-bg, agent-ps, agent-logs, agent-attach, agent-kill
  • Local remote runtime: manifest discovery, profile listing, connect/disconnect persistence, and CLI/slash flows
  • Local hook and policy runtime with trust reporting, safe env, tool blocking, and budget overrides
  • Local config runtime: config discovery, effective settings, source inspection, and config mutation
  • Local LSP runtime: definitions, references, hover, symbols, diagnostics, and call hierarchy
  • Local account runtime: profile discovery, login/logout state, and account CLI/slash flows
  • Local ask-user runtime: queued answers, history, and ask-user CLI/slash flows
  • Local team runtime: persisted teams, team messages, and team CLI/slash flows
  • Local search runtime with provider discovery, activation, and provider-backed web_search
  • Local MCP runtime: manifest resources, stdio transport, MCP resources, and MCP tool calls
  • Local task and plan runtimes with plan sync and dependency-aware task execution
  • Notebook edit tool in the real Python tool registry
  • Local workflow runtime with workflow list/get/run tools and CLI/slash flows
  • Local remote trigger runtime with create/update/run flows and CLI/slash inspection
  • Local managed git worktree runtime with live cwd switching and worktree CLI/slash flows
  • Local web GUI (FastAPI + vanilla JS SPA) with chat, sessions browser, slash command palette, and live settings (python -m src.gui)
  • Tokenizer-aware context accounting with cached tokenizer backends and heuristic fallback
  • Plugin runtime: manifest discovery, hooks, aliases, virtual tools, tool blocking
  • Plugin lifecycle hooks: resume, persist, delegate phases
  • Plugin session-state persistence and resume restoration
  • Query engine facade driving the real Python runtime
  • Compaction metadata with lineage IDs and revision summaries
  • Extended runtime tools: web_fetch, web_search, tool_search, sleep
  • Unit tests for the Python runtime
  • pyproject.toml packaging with setuptools

๐Ÿ”ฒ In Progress

  • Full MCP parity beyond the current stdio transport and local manifest/resource/tool support
  • Full slash-command parity with npm runtime
  • Full interactive REPL / TUI behavior
  • Full tokenizer/chat-message framing parity beyond the current tokenizer-aware accounting
  • Hooks system parity
  • Real remote transport/runtime parity beyond the current local remote-profile runtime
  • Voice and VIM modes
  • Editor and platform integrations
  • Background and team features

๐Ÿ—๏ธ Architecture

claw-code/
โ”œโ”€โ”€ README.md
โ”œโ”€โ”€ TESTING_GUIDE.md              # How to test every feature
โ”œโ”€โ”€ PARITY_CHECKLIST.md           # Implementation status vs npm source
โ”œโ”€โ”€ pyproject.toml
โ”œโ”€โ”€ .gitignore
โ”œโ”€โ”€ images/
โ”‚   โ””โ”€โ”€ logo.png
โ”œโ”€โ”€ src/                          # Python implementation
โ”‚   โ”œโ”€โ”€ main.py                   # CLI entry point & argument parsing
โ”‚   โ”œโ”€โ”€ agent_runtime.py          # Core agent loop (LocalCodingAgent)
โ”‚   โ”œโ”€โ”€ agent_tools.py            # Tool definitions & execution engine
โ”‚   โ”œโ”€โ”€ agent_prompting.py        # System prompt assembly
โ”‚   โ”œโ”€โ”€ agent_registry.py         # Built-in + filesystem-backed custom agent discovery
โ”‚   โ”œโ”€โ”€ agent_context.py          # Context building & CLAUDE.md discovery
โ”‚   โ”œโ”€โ”€ agent_context_usage.py    # Context usage estimation & reporting
โ”‚   โ”œโ”€โ”€ agent_session.py          # Session state management
โ”‚   โ”œโ”€โ”€ agent_slash_commands.py   # Local slash command processing
โ”‚   โ”œโ”€โ”€ agent_manager.py          # Nested agent lineage & group tracking
โ”‚   โ”œโ”€โ”€ agent_types.py            # Shared dataclasses & type definitions
โ”‚   โ”œโ”€โ”€ openai_compat.py          # OpenAI-compatible API client (streaming)
โ”‚   โ”œโ”€โ”€ plugin_runtime.py         # Plugin manifest, hooks, aliases, virtual tools
โ”‚   โ”œโ”€โ”€ agent_plugin_cache.py     # Plugin discovery & prompt injection cache
โ”‚   โ”œโ”€โ”€ session_store.py          # Session serialization & persistence
โ”‚   โ”œโ”€โ”€ transcript.py             # Transcript block export & mutation tracking
โ”‚   โ”œโ”€โ”€ query_engine.py           # Query engine facade & runtime orchestration
โ”‚   โ”œโ”€โ”€ mcp_runtime.py            # Local MCP discovery and stdio MCP transport
โ”‚   โ”œโ”€โ”€ search_runtime.py         # Search providers and provider-backed web_search
โ”‚   โ”œโ”€โ”€ remote_runtime.py         # Local remote profiles, connect/disconnect state, remote CLI support
โ”‚   โ”œโ”€โ”€ background_runtime.py     # Local background sessions and daemon support
โ”‚   โ”œโ”€โ”€ account_runtime.py        # Local account profiles, login/logout state, account CLI support
โ”‚   โ”œโ”€โ”€ ask_user_runtime.py       # Local ask-user queued answers and interaction history
โ”‚   โ”œโ”€โ”€ config_runtime.py         # Local workspace config/settings discovery and mutation
โ”‚   โ”œโ”€โ”€ lsp_runtime.py            # Local LSP-style code intelligence and diagnostics
โ”‚   โ”œโ”€โ”€ token_budget.py           # Prompt-window budgeting and preflight prompt-length validation
โ”‚   โ”œโ”€โ”€ plan_runtime.py           # Persistent plan runtime and plan sync
โ”‚   โ”œโ”€โ”€ task_runtime.py           # Persistent task runtime and task execution
โ”‚   โ”œโ”€โ”€ task.py                   # Task state model and task dataclasses
โ”‚   โ”œโ”€โ”€ team_runtime.py           # Local teams, messages, and collaboration metadata
โ”‚   โ”œโ”€โ”€ workflow_runtime.py       # Local workflow manifests and recorded workflow runs
โ”‚   โ”œโ”€โ”€ remote_trigger_runtime.py # Local remote trigger manifests and trigger run history
โ”‚   โ”œโ”€โ”€ worktree_runtime.py       # Managed git worktree sessions and cwd switching
โ”‚   โ”œโ”€โ”€ hook_policy.py            # Hook/policy manifests, trust, and safe env handling
โ”‚   โ”œโ”€โ”€ tokenizer_runtime.py      # Tokenizer-aware context accounting backends
โ”‚   โ”œโ”€โ”€ permissions.py            # Tool permission filtering
โ”‚   โ”œโ”€โ”€ cost_tracker.py           # Cost & budget enforcement
โ”‚   โ”œโ”€โ”€ commands.py               # Mirrored command inventory
โ”‚   โ”œโ”€โ”€ tools.py                  # Mirrored tool inventory
โ”‚   โ”œโ”€โ”€ runtime.py                # Mirrored runtime facade
โ”‚   โ”œโ”€โ”€ reference_data/           # Mirrored inventory snapshots
โ”‚   โ””โ”€โ”€ gui/                      # Local web GUI (FastAPI + vanilla JS SPA)
โ”‚       โ”œโ”€โ”€ __main__.py           # `python -m src.gui` entry point
โ”‚       โ”œโ”€โ”€ server.py             # FastAPI app and JSON endpoints
โ”‚       โ””โ”€โ”€ static/               # index.html, app.css, app.js
โ””โ”€โ”€ tests/                        # Unit tests
    โ”œโ”€โ”€ test_agent_runtime.py
    โ”œโ”€โ”€ test_agent_context.py
    โ”œโ”€โ”€ test_agent_context_usage.py
    โ”œโ”€โ”€ test_agent_prompting.py
    โ”œโ”€โ”€ test_agent_slash_commands.py
    โ”œโ”€โ”€ test_main.py
    โ”œโ”€โ”€ test_query_engine_runtime.py
    โ””โ”€โ”€ test_porting_workspace.py

๐Ÿ“ฆ Requirements

RequirementDetails
๐Ÿ Python3.10 or higher
๐Ÿ“š DependenciesNone โ€” pure Python standard library
๐Ÿ–ฅ๏ธ Model ServervLLM, Ollama, LiteLLM Proxy, or OpenRouter, with tool calling support
๐Ÿง  ModelQwen/Qwen3-Coder-30B-A3B-Instruct (recommended)

๐Ÿš€ Quick Start

1. Start vLLM with Qwen3-Coder

vLLM must be started with automatic tool choice enabled. Use the qwen3_xml parser for Qwen3-Coder tool calling:

python -m vllm.entrypoints.openai.api_server \
  --model Qwen/Qwen3-Coder-30B-A3B-Instruct \
  --host 127.0.0.1 \
  --port 8000 \
  --enable-auto-tool-choice \
  --tool-call-parser qwen3_xml

Verify the server is running:

curl http://127.0.0.1:8000/v1/models

๐Ÿ“š References: vLLM Tool Calling Docs ยท OpenAI-Compatible Server

Optional: Use Ollama Instead of vLLM

claw-code-agent can also work with Ollama because the runtime targets an OpenAI-compatible API. Use a model that supports tool calling well.

Example:

ollama serve
ollama pull qwen3

Then configure:

export OPENAI_BASE_URL=http://127.0.0.1:11434/v1
export OPENAI_API_KEY=ollama
export OPENAI_MODEL=qwen3

Notes:

  • prefer tool-capable models such as qwen3
  • plain chat-only models are not enough for full agent behavior
  • Ollama does not use the vLLM parser flags shown above

๐Ÿ“š References: Ollama OpenAI Compatibility ยท Ollama Tool Calling

Optional: Use LiteLLM Proxy

claw-code-agent can also work through LiteLLM Proxy because the runtime targets an OpenAI-compatible chat completions API. The routed model still needs to support tool calling for full agent behavior.

Quick start example:

pip install 'litellm[proxy]'
litellm --model ollama/qwen3

LiteLLM Proxy runs on port 4000 by default. Then configure:

export OPENAI_BASE_URL=http://127.0.0.1:4000
export OPENAI_API_KEY=anything
export OPENAI_MODEL=ollama/qwen3

Notes:

  • LiteLLM Proxy gives you an OpenAI-style gateway in front of many providers
  • tool use still depends on the underlying routed model and provider behavior
  • if you configure a LiteLLM master key, use that instead of anything

๐Ÿ“š References: LiteLLM Docs ยท LiteLLM Proxy Quick Start

Optional: Use OpenRouter

claw-code-agent can also work with OpenRouter, a cloud API gateway that provides access to models from OpenAI, Anthropic, Google, Meta, and others through a single OpenAI-compatible endpoint. No local model server required.

Configure:

export OPENAI_BASE_URL=https://openrouter.ai/api/v1
export OPENAI_API_KEY=sk-or-v1-your-key-here
export OPENAI_MODEL=openai/gpt-4o-mini

Notes:

  • sign up at openrouter.ai and create an API key under Keys
  • model names use the provider/model format (e.g. anthropic/claude-sonnet-4, openai/gpt-4o, google/gemini-2.5-pro)
  • tool calling support varies by model โ€” check the model list for capabilities
  • this sends your conversation (including file contents and shell output) to OpenRouter and the upstream provider โ€” do not use with repos containing secrets or sensitive data

๐Ÿ“š References: OpenRouter Docs ยท Supported Models ยท API Keys

2. Configure Environment

export OPENAI_BASE_URL=http://127.0.0.1:8000/v1
export OPENAI_API_KEY=local-token
export OPENAI_MODEL=Qwen/Qwen3-Coder-30B-A3B-Instruct

Use Another Model With vLLM

If you want to try another model, keep the same vLLM server setup and change the --model value when you launch vLLM.

Example:

python -m vllm.entrypoints.openai.api_server \
  --model your-model-name \
  --host 127.0.0.1 \
  --port 8000 \
  --enable-auto-tool-choice \
  --tool-call-parser your_parser

Then update:

export OPENAI_MODEL=your-model-name

Notes:

  • the documented path in this repository is vLLM
  • the model must support tool calling well enough for agent use
  • some model families require a different --tool-call-parser
  • slash commands such as /help, /context, and /tools are local and do not require the model server

3. Run the Agent

# Read-only question
python3 -m src.main agent \
  "Read src/agent_runtime.py and summarize how the loop works." \
  --cwd .

# Write-enabled task
python3 -m src.main agent \
  "Create TEST_QWEN_AGENT.md with one line: test ok" \
  --cwd . --allow-write

# Shell-enabled task
python3 -m src.main agent \
  "Run pwd and ls src, then summarize the result." \
  --cwd . --allow-shell

# Interactive chat mode
python3 -m src.main agent-chat --cwd .

# Streaming output
python3 -m src.main agent \
  "Explain the current architecture." \
  --cwd . --stream

๐Ÿ› ๏ธ Usage

Agent Commands

CommandDescription
agent <prompt>Run the agent with a prompt
agent-chat [prompt]Start interactive multi-turn chat mode
agent-bg <prompt>Run the agent in a local background session
agent-psList local background sessions
agent-logs <id>Show background session logs
agent-attach <id>Show the current background output snapshot
agent-kill <id>Stop a background session
daemon <subcommand>Daemon-style wrapper over local background sessions
agent-promptShow the assembled system prompt
agent-contextShow estimated context usage
agent-context-rawShow the raw context snapshot
token-budgetShow prompt-window budget, reserves, and soft/hard input limits
agents [agent_type]List active local agent definitions or show one agent profile
agents-create <agent_type>Create a project or user agent definition markdown file
agents-update <agent_type>Update an existing project or user agent definition
agents-delete <agent_type>Delete an existing project or user agent definition
agent-resume <id> <prompt>Resume a saved session

Runtime Utility Commands

CommandDescription
search-status / search-providers / search-activate / searchInspect and use the local search runtime
mcp-status / mcp-resources / mcp-resource / mcp-tools / mcp-call-toolInspect and use the local MCP runtime
remote-status / remote-profiles / remote-disconnectInspect local remote runtime state
remote-mode / ssh-mode / teleport-mode / direct-connect-mode / deep-link-modeActivate local remote runtime modes
config-status / config-effective / config-source / config-get / config-setInspect and mutate local config/settings
account-status / account-profiles / account-login / account-logoutInspect and mutate local account state

CLI Flags

FlagDescription
--cwd <path>Set the workspace directory
--model <name>Override the model name
--base-url <url>Override the API base URL
--allow-writeAllow the agent to modify files
--allow-shellAllow the agent to execute shell commands
--unsafeAllow destructive shell operations
--streamEnable token-by-token streaming output
--show-transcriptPrint the full message transcript
--scratchpad-root <path>Override the scratchpad directory
--system-prompt <text>Set a custom system prompt
--append-system-prompt <text>Append to the system prompt
--override-system-prompt <text>Replace the generated system prompt
--add-dir <path>Add extra directories to context

Budget & Limit Flags

FlagDescription
--max-total-tokens <n>Total token budget
--max-input-tokens <n>Input token budget
--max-output-tokens <n>Output token budget
--max-reasoning-tokens <n>Reasoning token budget
--max-budget-usd <n>Maximum cost in USD
--max-tool-calls <n>Maximum tool calls per run
--max-delegated-tasks <n>Maximum delegated subtasks
--max-model-calls <n>Maximum model API calls
--max-session-turns <n>Maximum session turns
--input-cost-per-million <n>Input token pricing
--output-cost-per-million <n>Output token pricing

Context Control Flags

FlagDescription
--auto-snip-threshold <n>Auto-snip older messages at this token count
--auto-compact-threshold <n>Auto-compact at this token count
--compact-preserve-messages <n>Messages to preserve during compaction
--disable-claude-mdDisable CLAUDE.md discovery

Structured Output Flags

FlagDescription
--response-schema-file <path>JSON schema file for structured output
--response-schema-name <name>Schema name identifier
--response-schema-strictEnforce strict schema validation

Slash Commands

These are handled locally before the model loop:

CommandAliasesDescription
/help/commandsShow built-in slash commands
/context/usageShow estimated session context usage
/context-raw/envShow raw environment & context snapshot
/token-budget/budgetShow prompt-window budget, reserves, and soft/hard input limits
/mcpโ€”Show MCP runtime status, tools, or a single MCP tool
/resourcesโ€”List MCP resources
/resourceโ€”Read an MCP resource by URI
/searchโ€”Show search status, providers, activate a provider, or run a search
/remoteโ€”Show local remote status or activate a target
/remotesโ€”List local remote profiles
/sshโ€”Activate an SSH-style remote profile
/teleportโ€”Activate a teleport-style remote profile
/direct-connectโ€”Activate a direct-connect remote profile
/deep-linkโ€”Activate a deep-link remote profile
/disconnect/remote-disconnectDisconnect the active remote runtime target
/accountโ€”Show account runtime status or profiles
/loginโ€”Activate a local account profile or identity
/logoutโ€”Clear the active account session
/config/settingsInspect effective config, sources, or a single config value
/plan/plannerShow the local plan runtime state
/tasks/todoShow the local task list
/taskโ€”Show a task by id
/task-next/next-taskShow the next actionable tasks
/prompt/system-promptRender the effective system prompt
/hooks/policyShow local hook/policy manifests
/trustโ€”Show trust mode, managed settings, and safe env values
/permissionsโ€”Show active tool permission mode
/modelโ€”Show or update the active model
/toolsโ€”List registered tools with permission status
/agentsโ€”List, show, create, update, or delete local agent definitions
/memoryโ€”Show loaded CLAUDE.md memory bundle
/status/sessionShow runtime/session status summary
/clearโ€”Clear ephemeral runtime state
python3 -m src.main agent "/help"
python3 -m src.main agent "/context" --cwd .
python3 -m src.main agent "/token-budget" --cwd .
python3 -m src.main agent "/tools" --cwd .
python3 -m src.main agent "/agents" --cwd .
python3 -m src.main agent "/status" --cwd .

Custom Agent Definitions

Custom agent profiles can live in either of these directories:

  • ./.claude/agents/*.md
  • ~/.claude/agents/*.md

Project agents override user agents, and user agents override built-ins when the agent_type matches.

Example agent file:

---
name: reviewer
description: "Review implementation changes carefully."
tools: read_file, grep_search
model: Qwen/Qwen3-Coder-30B-A3B-Instruct
initialPrompt: Start by identifying the highest-risk files.
---

Inspect code changes and summarize correctness risks, regressions, and missing tests.

Inspect the loaded profiles:

python3 -m src.main agents --cwd .
python3 -m src.main agents reviewer --cwd .
python3 -m src.main agent "/agents" --cwd .
python3 -m src.main agent "/agents show reviewer" --cwd .

Create, update, or delete agent files from the CLI:

python3 -m src.main agents-create reviewer \
  --cwd . \
  --description "Review implementation changes carefully." \
  --prompt "Inspect code changes and summarize risks." \
  --tools read_file,grep_search \
  --model Qwen/Qwen3-Coder-30B-A3B-Instruct

python3 -m src.main agents-update reviewer \
  --cwd . \
  --description "Review implementation changes and tests carefully." \
  --prompt "Focus on regressions, missing tests, and risky diffs."

python3 -m src.main agents-delete reviewer --cwd . --source project

Or use the local slash command management forms:

python3 -m src.main agent "/agents create reviewer :: Review implementation changes carefully. :: Inspect code changes and summarize risks." --cwd .
python3 -m src.main agent "/agents update reviewer Updated review description :: Focus on regressions and missing tests." --cwd .
python3 -m src.main agent "/agents delete reviewer" --cwd .

Utility Commands

python3 -m src.main summary            # Workspace summary
python3 -m src.main manifest           # Workspace manifest
python3 -m src.main commands --limit 10 # Command inventory
python3 -m src.main tools --limit 10    # Tool inventory

๐Ÿ”ง Built-in Tools

The runtime currently includes core and extended tools:

ToolDescriptionPermission
list_dirList files and directories๐ŸŸข Always
read_fileRead file contents (with line ranges)๐ŸŸข Always
write_fileWrite or create files๐ŸŸก --allow-write
edit_fileEdit files via exact string matching๐ŸŸก --allow-write
glob_searchFind files by glob pattern๐ŸŸข Always
grep_searchSearch file contents by regex๐ŸŸข Always
bashExecute shell commands๐Ÿ”ด --allow-shell
web_fetchFetch local or remote text content by URL๐ŸŸข Always
search_status / search_list_providers / search_activate_provider / web_searchSearch runtime status and provider-backed web search๐ŸŸข Always
tool_searchSearch the current Python tool registry๐ŸŸข Always
sleepBounded local wait tool๐ŸŸข Always
config_list / config_get / config_setInspect and mutate local workspace configconfig_set is ๐ŸŸก --allow-write
account_status / account_list_profiles / account_login / account_logoutInspect and mutate local account state๐ŸŸข Always
remote_status / remote_list_profiles / remote_connect / remote_disconnectInspect and mutate local remote runtime state๐ŸŸข Always
mcp_list_resources / mcp_read_resource / mcp_list_tools / mcp_call_toolUse local MCP resources and transport-backed MCP tools๐ŸŸข Always
plan_get / update_plan / plan_clearInspect and mutate the local plan runtimeupdate_plan is ๐ŸŸก --allow-write
task_next / task_list / task_get / task_create / task_update / task_start / task_complete / task_block / task_cancel / todo_writePersistent local task and todo managementwrite-like task mutations are ๐ŸŸก --allow-write
delegate_agentDelegate work to nested child agents๐ŸŸข Always

๐Ÿ”Œ Plugin System

Claw Code Agent supports a manifest-based plugin runtime. Drop a plugin.json in a plugins/ subdirectory:

{
  "name": "my-plugin",
  "hooks": {
    "beforePrompt": "Inject guidance into the system prompt.",
    "afterTurn": "Run after each agent turn.",
    "onResume": "Reapply state on session resume.",
    "beforePersist": "Save state before session is saved.",
    "beforeDelegate": "Inject guidance before child agents.",
    "afterDelegate": "Process child agent results."
  },
  "toolAliases": [
    { "name": "my_read", "baseTool": "read_file", "description": "Custom read alias." }
  ],
  "virtualTools": [
    { "name": "my_tool", "description": "A virtual tool.", "responseTemplate": "result: {input}" }
  ]
}

See TESTING_GUIDE.md Section 19 for full plugin testing commands.


๐Ÿช† Nested Agent Delegation

The agent can delegate subtasks to child agents with full context carryover:

python3 -m src.main agent \
  "Delegate a subtask to inspect src/agent_runtime.py and return a summary." \
  --cwd . --show-transcript

Features:

  • Sequential and parallel subtask execution
  • Dependency-aware topological batching
  • Child-session save and resume
  • Agent manager lineage tracking

See TESTING_GUIDE.md Section 20 for delegation testing commands.


๐Ÿ–ฅ๏ธ Local Web GUI

If the terminal isn't your thing, launch the bundled browser GUI:

python3 -m src.gui --cwd . --allow-write --allow-shell

Your default browser opens to http://127.0.0.1:8765 with a modern dark-themed chat UI.

FlagDescription
--host <addr>Bind address (default 127.0.0.1)
--port <n>Port to listen on (default 8765)
--cwd <path>Workspace directory the agent operates in
--model <name>Override the model name
--base-url <url>Override the OpenAI-compatible API base URL
--api-key <key>API key for the model server
--session-dir <path>Where saved sessions live
--allow-writeAllow file write/edit tools
--allow-shellAllow shell execution
--temperature <f>Sampling temperature (default 0.0)
--timeout-seconds <f>Per-turn model timeout in seconds (default 120)
--streamEnable streaming model responses
--max-turns <n>Per-run turn limit (default 12)
--max-budget-usd <f>Abort the run if total cost exceeds this
--max-total-tokens <n>Token budget across prompt + completion
--max-input-tokens <n>Input-token cap per call
--max-output-tokens <n>Output-token cap per call
--max-reasoning-tokens <n>Reasoning-token cap per call
--max-tool-calls <n>Hard cap on tool invocations per run
--max-model-calls <n>Hard cap on model invocations per run
--max-delegated-tasks <n>Cap on nested delegated agents
--max-session-turns <n>Cap across resumed sessions
--system-prompt <s>Replace the rendered system prompt body
--append-system-prompt <s>Append text to the rendered system prompt
--override-system-prompt <s>Skip the default system prompt entirely and use this
--response-schema-file <path>Load a structured-output schema from a JSON file
--response-schema-name <s>Name the schema (default response)
--response-schema-strictReject responses that don't match the schema
--auto-snip-threshold <n>Token threshold above which old messages are auto-snipped
--auto-compact-threshold <n>Token threshold above which the conversation is auto-compacted
--compact-preserve-messages <n>Number of recent messages preserved during a compact (default 4)
--disable-claude-mdSkip discovery of CLAUDE.md files
--add-dir <path>Additional working directory the agent may operate in (repeatable)
--no-browserDon't auto-open a browser tab

Every budget flag above is also editable at runtime through the Budgets & limits disclosure in the settings panel โ€” leave a field blank to clear the limit, type a number to set it.

The GUI surfaces:

  • multi-turn chat with tool-call cards (collapsible JSON args + results)
  • saved sessions sidebar with one-click resume
  • slash command and skill pickers (/ and โ˜… buttons, or Cmd/Ctrl+K)
  • live settings panel (model, base URL, working dir, permissions)
  • usage / cost meta in the composer footer
  • pasted-content collapsing โ€” see below
  • runtime knobs: temperature, timeout, streaming toggle, max turns
  • a Tasks tab in the topbar โ€” list / create / start / complete / cancel against .port_sessions/task_runtime.json

Paste large content

Paste anything โ‰ฅ500 characters into the composer (a logfile, a stack trace, an entire file) and the GUI replaces it with a short reference like [Pasted text #1 +42 lines], plus a chip above the textarea showing ๐Ÿ“Ž [Pasted text #1] ยท 42 lines ยท 1894 chars ยท โœ•.

  • The reference stays editable โ€” type around it, delete it, or duplicate it; whatever survives at send-time is what gets expanded.
  • The full content is held in the browser only and shipped with the next /api/chat request as pasted_contents.
  • The server re-splices the original text back in before the agent runs, so the model sees the full payload โ€” never the placeholder.
  • The chip's โœ• button drops both the content stash and any inline ref so it can't accidentally come along.
  • The stash clears after every successful send and when you click + New chat.

Note: The GUI uses FastAPI and Uvicorn under the hood. These get installed automatically if you install the package via pip install -e .. The core Python agent runtime itself remains dependency-free.


๐Ÿ”„ Session Persistence

Each agent run automatically saves a resumable session:

session_id=4f2c8c6f9c0e4d7c9c7b1b2a3d4e5f67
session_path=.port_sessions/agent/4f2c8c6f...

Resume a previous session:

python3 -m src.main agent-resume \
  4f2c8c6f9c0e4d7c9c7b1b2a3d4e5f67 \
  "Continue the previous task and finish the missing parts."

Resume directly into interactive chat:

python3 -m src.main agent-chat \
  --resume-session-id <session-id> \
  --cwd .

Inspect saved sessions:

ls -lt .port_sessions/agent

Note: Run agent-resume from the same claw-code/ directory where the session was created. A resumed session continues from the saved transcript, not from scratch.


๐Ÿงช Testing

Run the full test suite:

python3 -m unittest discover -s tests -v

Smoke tests:

python3 -m src.main agent "/help"
python3 -m src.main agent-context --cwd .
python3 -m src.main agent \
  "Read src/agent_session.py and summarize the message flow." \
  --cwd .

๐Ÿ“š Full testing guide: See TESTING_GUIDE.md for step-by-step commands covering the full implemented runtime surface.


๐Ÿ” Permission Model

Claw Code Agent uses a tiered permission system to keep the agent safe by default:

TierCapabilityFlag Required
Read-onlyList, read, glob, grepNone (default)
Write+ file creation and editing--allow-write
Shell+ shell command execution--allow-shell
Unsafe+ destructive shell operations--unsafe

๐Ÿ”Ž Parity Status

The full implementation checklist tracking parity against the npm src lives in PARITY_CHECKLIST.md.

It covers: core runtime, CLI modes, prompt assembly, context/memory, slash commands, tools, permissions, plugins, MCP, REPL/TUI, remote features, editor integrations, and internal subsystems.


โš ๏ธ Disclaimer

  • This repository is a Python reimplementation inspired by the Claude Code npm architecture.
  • It does not ship the original npm source.
  • It is not affiliated with or endorsed by Anthropic.

Built with ๐Ÿ Python ยท Powered by ๐Ÿ‰ HarnessLab Team.