Claude Code Prompt Improver

June 3, 2026 · View on GitHub

Intelligent prompt optimization for Claude Code. It injects the right context at the right moment - at prompt submit, tool use, and subagent start - so Claude has what it needs before it acts. The goal is a better first output, so you spend fewer turns correcting it.

Prompt improvement here means improving the whole path from your prompt to Claude's output, not just rewriting the words you typed. Clarifying a vague prompt is one way to do that. Supplying a constraint you would otherwise have had to add by hand after a bad first attempt is another. Both land the output sooner.

Demo

What It Does

A small set of targeted nudges fire only when they apply, each one supplying context that would otherwise cost a correction round-trip:

NudgeFires whenSupplies
improveevery prompt (flagship)clarity check; asks 1-6 grounded questions only when the prompt is genuinely vague
approach-assessmenta request looks non-trivial (implement, refactor, migrate, multi-file)choose how to carry it out - subagent, heavier orchestration, or just do it - and pass a spawned subagent the context it needs
workflowa request looks like a multi-step workflowplan-first and per-stage model-routing guidance
output-readabilitythe response will be a substantial deliverablelead with the conclusion, prefer sections and tables, keep it terse
ask-user-questiona request hides a decision that is genuinely yours (a fork, a real tradeoff, missing requirements)ask via the AskUserQuestion tool with concrete options so you can think critically; research first when context is thin; default on minor or reversible choices
plan-modeevery prompt (alongside improve)assess whether the task is complex enough to warrant a plan reviewed before any code; enter plan mode if so, otherwise proceed
planentering plan modeterse, readable plan: file-path anchors, no decision history; re-read for flaws before presenting
background-execa long-running command (dev server, watcher, tail) is about to runrun it in the background, poll only the output that matters
subagent-routinga research or planning subagent startsfavor breadth over depth, return conclusions not raw dumps

Two nudges evaluate every prompt. improve checks clarity:

  • For clear prompts: proceeds immediately (zero skill overhead)
  • For vague prompts: invokes the prompt-improver skill to create a research plan, gather context, and ask 1-6 grounded questions, then proceeds with the clarification

plan-mode runs alongside it, judging whether the task is complex enough to plan before acting - it self-cancels on anything trivial.

The other seven fire only when they apply. The keyword-gated ones (workflow, approach-assessment, output-readability, ask-user-question, background-exec) lead with a condition ("If this is X... if not, ignore"), so a false fire is dismissed cheaply; the exact-gated ones (plan on plan-mode entry, subagent-routing on a research agent) only fire when the condition is already certain.

Result: Better outcomes on the first try, without back-and-forth.

v0.4.0 Update: Skill-based architecture with hook-level evaluation achieves 31% token reduction. Clear prompts have zero skill overhead, vague prompts get comprehensive research and questioning via the skill.

How It Works

sequenceDiagram
    participant User
    participant Hook
    participant Claude
    participant Skill
    participant Explore
    participant Project

    User->>Hook: "fix the bug"
    Hook->>Claude: Evaluation prompt (~189 tokens)
    Claude->>Claude: Evaluate using conversation history
    alt Vague prompt
        Claude->>Skill: Invoke prompt-improver skill
        Skill-->>Claude: Research and question guidance
        Claude->>Claude: Create research plan (TodoWrite)
        Claude->>Explore: Dispatch research (Glob, Grep, Web, multi-file Read)
        Explore->>Project: Execute search and reads
        Project-->>Explore: Raw results
        Explore-->>Claude: Synthesized findings
        Claude->>Claude: Synthesize, mine history, run git/Bash if needed
        Claude->>User: Ask grounded questions (1-6)
        User->>Claude: Answer
        Claude->>Claude: Execute original request with answers
    else Clear prompt
        Claude->>Claude: Proceed immediately (no skill load)
    end

Installation

Requirements: Claude Code 2.0.22+ (uses AskUserQuestion tool for targeted clarifying questions)

1. Add the marketplace:

claude plugin marketplace add severity1/severity1-marketplace

2. Install the plugin:

claude plugin install prompt-improver@severity1-marketplace

3. Restart Claude Code

Verify installation with /plugin command. You should see the prompt-improver plugin listed.

1. Clone the repository:

git clone https://github.com/severity1/claude-code-prompt-improver.git
cd claude-code-prompt-improver

2. Add the local marketplace:

claude plugin marketplace add /absolute/path/to/claude-code-prompt-improver/.dev-marketplace/.claude-plugin/marketplace.json

Replace /absolute/path/to/ with the actual path where you cloned the repository.

3. Install the plugin:

claude plugin install prompt-improver@local-dev

4. Restart Claude Code

Verify installation with /plugin command. You should see "1 plugin available, 1 already installed".

Option 3: Manual Installation

1. Copy the engine, rules, builtins, and nudges:

mkdir -p ~/.claude/hooks/prompt-improver/scripts
cp scripts/engine.py scripts/rules.py scripts/nudge_builtins.py ~/.claude/hooks/prompt-improver/scripts/
cp -r nudges ~/.claude/hooks/prompt-improver/nudges
chmod +x ~/.claude/hooks/prompt-improver/scripts/engine.py

The engine resolves nudges/ relative to its own location and loads it recursively (nudges/<EventName>/*.json), so copy the whole nudges/ tree and keep scripts/ and nudges/ siblings under the same parent.

2. Update ~/.claude/settings.json:

{
  "hooks": {
    "UserPromptSubmit": [
      {
        "hooks": [
          {
            "type": "command",
            "command": "python3 ~/.claude/hooks/prompt-improver/scripts/engine.py UserPromptSubmit"
          }
        ]
      }
    ],
    "PreToolUse": [
      {
        "matcher": "EnterPlanMode|Bash",
        "hooks": [
          {
            "type": "command",
            "command": "python3 ~/.claude/hooks/prompt-improver/scripts/engine.py PreToolUse"
          }
        ]
      }
    ],
    "SubagentStart": [
      {
        "hooks": [
          {
            "type": "command",
            "command": "python3 ~/.claude/hooks/prompt-improver/scripts/engine.py SubagentStart"
          }
        ]
      }
    ]
  }
}

Usage

Normal use:

claude "fix the bug"      # Hook evaluates, may ask questions
claude "add tests"        # Hook evaluates, may ask questions

Bypass prefixes:

claude "* add dark mode"                    # * = skip evaluation
claude "/help"                              # / = slash commands bypass
claude "# remember to use rg over grep"     # # = memorize bypass

Vague prompt:

$ claude "fix the error"

Claude asks:

Which error needs fixing?
  ○ TypeError in src/components/Map.tsx (recent change)
  ○ API timeout in src/services/osmService.ts
  ○ Other (paste error message)

You select an option, Claude proceeds with full context.

Clear prompt:

$ claude "Fix TypeError in src/components/Map.tsx line 127 where mapboxgl.Map constructor is missing container option"

Claude proceeds immediately without questions.

Design Philosophy

  • Improve the prompt-to-output path - the goal is a right first output. Clarifying a vague prompt is one way; injecting a constraint the user would otherwise add by hand after a bad attempt is another. Both count.
  • Rarely intervene - most prompts pass through unchanged; each nudge fires only when it applies.
  • Fire wide, self-cancel cheap - a missed nudge costs a full correction loop, a false fire costs a few tokens Claude ignores. That asymmetry justifies high-recall gates, so every nudge leads with a condition ("If this is X... if not, ignore") and dismisses itself when it does not fit.
  • Trust user intent - only ask when genuinely unclear; never block.
  • Max 1-6 questions - enough for complex scenarios, still focused.
  • Transparent - injected context is visible in the conversation.

Architecture

Declarative hook engine driven by a JSON nudge registry. One engine dispatches every hook event; each capability is a data row in nudges/*.json, not a separate script. Adding an inject-context nudge is a single JSON file with zero Python changes.

Engine (scripts/engine.py) - Event Dispatcher:

  • Invoked as engine.py <EventName> (one entry per event in hooks.json)
  • Reads stdin once, runs the event's rules, merges inject_context fragments by priority with a blank-line join, emits one hookSpecificOutput object
  • Exits 0 on every path - a missing/unknown event or an event with no rules is a clean no-op that never reads stdin
  • One bad rule is isolated in a try/except so it cannot suppress the others

Rules (scripts/rules.py) - Loader + Validation:

  • Loads and validates nudges/*.json; invalid rows are skipped with a stderr note (loading never raises into the engine)
  • validate_rule enforces: required id/event, known event, action XOR handler, action type legal for the event, compilable regexes, allowlisted builtin/handler names, unique id
  • Owns the event->capability matrix (v1: inject_context on UserPromptSubmit, PreToolUse, SubagentStart)
  • Regexes are compiled once per dispatched event, not at file load

Builtins (scripts/nudge_builtins.py) - Escape Hatch:

  • Two allowlist dicts referenced by string name only: HANDLERS (improve, workflow) and MATCHERS (saved_workflow_exists)
  • A rule with "handler": "improve" runs the named handler, which owns its full fragment including bypass logic; a rule with "criteria": {"builtin": "saved_workflow_exists"} runs the named matcher
  • No eval/importlib/getattr-on-path: an unknown name is a load-time skip, never an arbitrary import
  • Named nudge_builtins (not builtins) because the stdlib builtins module is loaded before any user code and would permanently shadow a local builtins.py

Nudges (nudges/*.json) - The Registry:

  • improve - checks whether a submitted prompt is clear enough to act on, and asks for clarification only when it is genuinely vague.
  • approach-assessment - when a request looks non-trivial, raises how to carry it out (a subagent, heavier orchestration, or just doing it) and reminds that a spawned subagent needs its context passed explicitly.
  • workflow - when a request looks like a multi-step workflow, suggests planning before running and routing each stage to an appropriately sized model.
  • ask-user-question - when a request hides a decision that is genuinely the user's (a fork, a real tradeoff, missing requirements), routes it through the AskUserQuestion tool with concrete options, grounds the questions in research when context is thin, and defaults on minor or reversible choices.
  • plan-mode - evaluates every prompt: judges whether the task is complex, multi-step, ambiguous, or architectural enough to warrant a plan reviewed before any code is written, and enters plan mode if so. Self-cancels on trivial work. Owns "whether to plan at all"; approach-assessment owns "which approach".
  • plan - when entering plan mode, encourages a clean, readable plan: terse steps, file-path anchors, no decision history, and a re-read for flaws before presenting.
  • subagent-routing - when a research or planning subagent starts, encourages breadth over depth and lean, conclusion-first reporting.
  • background-exec - when a long-running command (dev server, watcher, tail) is about to run, suggests running it in the background and polling only the output that matters.
  • output-readability - when the response will be a substantial deliverable (report, review, summary, analysis), encourages a human-readable result: lead with the conclusion, prefer sections and tables, keep it terse.

Known limitation (workflow nudge): the keyword filter biases toward recall, so a non-launch mention of "workflow" (e.g. "fix the CI workflow file") may inject inert guidance. The guidance leads with a conditional guard ("If this prompt will run as a dynamic workflow... if not, ignore") so the model self-cancels false positives. No hook can see the post-prompt workflow decision.

Skill (skills/prompt-improver/) - Research & Question Logic:

  • SKILL.md: Research and question workflow (~170 lines)
    • Assumes prompt already determined vague by the engine's improve nudge
    • 4-phase process: Research → Questions → Clarify → Execute
    • Links to reference files for progressive disclosure
  • references/: Detailed guides loaded on-demand
    • question-patterns.md: Question templates (200-300 lines)
    • research-strategies.md: Context gathering (300-400 lines)
    • examples.md: Real transformations (200-300 lines)

How to Add a Nudge

A new inject-context nudge is a single JSON file under nudges/ - no Python changes.

1. Drop a file in the event's subdirectory. Nudges live in nudges/<EventName>/, one folder per hook event, and files are named <priority>-<id>.json so each folder sorts in merge order:

nudges/
  UserPromptSubmit/
    00-improve.json
    10-approach-assessment.json
    20-workflow.json
    30-output-readability.json
    40-ask-user-question.json
    50-plan-mode.json
  PreToolUse/
    00-plan.json
    10-background-exec.json
  SubagentStart/
    00-subagent-routing.json

For example, nudges/UserPromptSubmit/20-kubernetes.json:

{
  "id": "kubernetes",
  "event": "UserPromptSubmit",
  "description": "What this nudge does and why (never emitted - JSON has no comments).",
  "criteria": {
    "match": ["\\bkubernetes\\b"],
    "flags": ["ignorecase"],
    "non_slash": true
  },
  "action": {
    "type": "inject_context",
    "text": [
      "If this task touches Kubernetes manifests: validate with kubeconform before applying.",
      "If not, ignore this guidance."
    ]
  },
  "priority": 20
}

The parent directory is authoritative: a rule whose event field does not match its folder is skipped with a stderr note (so a file in PreToolUse/ cannot claim "event": "UserPromptSubmit"). The filename's priority prefix is cosmetic - the engine sorts fragments by the JSON priority field, not the filename, so keep the two in sync. Priority is event-scoped: it only orders rules that share an event (e.g. improve at 0 and workflow at 20 both merge on UserPromptSubmit; plan at 0 never competes with them). Pad the prefix to match your widest priority (2 digits is safe for the gap convention 0/10/20/.../50; a priority >= 100 sorts wrong against 2-digit names).

2. That's it. The engine recursively auto-loads every nudges/**/*.json on the next prompt. No hooks.json edit is needed unless the nudge targets a new event (each event needs one dispatch entry in hooks.json and one new nudges/<EventName>/ folder).

Schema reference:

FieldRequiredMeaning
idyesUnique rule id
eventyesUserPromptSubmit, PreToolUse, or SubagentStart
descriptionnoIntent note, never emitted
actionone of{ "type": "inject_context", "text": [lines], "append_when": [{ "match": [regex], "text": [lines] }] }
handlerone ofString naming a callable in nudge_builtins.HANDLERS (escape hatch)
criterianomatch/exclude (regex arrays), match_target (prompt|tool_name|agent_type|command; command reads nested tool_input.command), non_slash, flags, builtin
bypassnodefault (suppress on */#/empty for prompt targets) or none
prioritynoMerge order (lower first); default 100

Provide exactly one of action or handler. text is an array of lines joined with newlines at load (clean multiline diffs). append_when models a conditional clause declaratively (used by the ultracode clause). Invalid rows are skipped with a stderr note; the engine still exits 0.

Flow for Clear Prompts:

  1. Hook wraps with evaluation prompt (~189 tokens)
  2. Claude evaluates: prompt is clear
  3. Claude proceeds immediately (no skill invocation)
  4. Total overhead: ~189 tokens

Flow for Vague Prompts:

  1. Hook wraps with evaluation prompt (~189 tokens)
  2. Claude evaluates: prompt is vague
  3. Claude invokes prompt-improver skill
  4. Skill loads research/question guidance
  5. Claude creates research plan, gathers context, asks questions
  6. Total overhead: ~189 tokens + skill load

Progressive Disclosure Benefits:

  • Clear prompts: Never load skill (zero skill overhead)
  • Vague prompts: Only load skill and relevant reference files
  • Detailed guidance available without bloating all prompts
  • Zero context penalty for unused reference materials

Research dispatch model:

  • Glob, Grep, WebSearch, WebFetch, and multi-file Read route through Task/Explore (Haiku-based, separate context window)
  • Main context handles history mining, single-file Reads of user-named files, Bash/git commands, synthesis, and questions
  • Bash stays in main context because Explore agents cannot run shell commands
  • Every Explore dispatch carries explicit conversation context (file paths, errors, prior decisions) since Explore has no access to prior turns
  • Net effect: search noise lives in cheap subagent tokens, decisions live in main-context tokens

Manual Skill Invocation: You can also invoke the skill manually without the hook:

Use the prompt-improver skill to research and clarify: "add authentication"

Token Overhead

v0.4.0 Update: 31% reduction through hook-level evaluation

  • Per prompt (v0.4.0): ~189 tokens (evaluation prompt)
  • Per prompt (v0.3.x): ~275 tokens (embedded evaluation logic)
  • Reduction: ~86 tokens saved per prompt (31% decrease)
  • 30-message session: ~5.7k tokens (~2.8% of 200k context, down from 4.1%)
  • Trade-off: Minimal overhead for better first-attempt results

Clear prompts benefit:

  • Evaluation happens in hook (~189 tokens)
  • Claude proceeds immediately (no skill load)
  • Zero skill overhead for clear prompts

Vague prompts:

  • Evaluation in hook (~189 tokens)
  • Skill loads only when needed for research/questions
  • Progressive disclosure: reference files load on-demand

FAQ

Does this work on all prompts? Yes, unless you use bypass prefixes (*, /, #).

Will it slow me down? Only slightly when it asks questions. Faster overall due to better context.

Will I get bombarded with questions? No. It rarely intervenes, passes through most prompts, and asks max 1-6 questions.

Can I customize behavior? It adapts automatically using conversation history, dynamic research planning, and CLAUDE.md.

What if I don't want improvement? Use * prefix: claude "* your prompt here"

License

MIT