README.md

May 1, 2026 · View on GitHub

Bug Hunter — AI-powered adversarial code review and security vulnerability scanner for Claude Code, Cursor, Codex, Copilot, Windsurf, and Kiro

Bug Hunter

AI code review that argues with itself — adversarial multi-agent bug finding, security scanning, and auto-fix for any coding agent.

npm version MIT License 113 tests passing Node.js >= 18

Install · Quick Start · How It Works · What It Finds · Auto-Fix · CLI Reference


Bug Hunter is an open-source AI code review and security vulnerability scanner that works as a skill/plugin for AI coding agents. Three AI agents — a Hunter, a Skeptic, and a Referee — independently analyze your code in an adversarial pipeline. The Hunter finds bugs. The Skeptic tries to disprove them. The Referee delivers the final verdict. Only bugs that survive all three stages make the report, eliminating the false positive overload that plagues other AI code review tools.

It then auto-fixes confirmed bugs with a safe canary rollout pipeline — git branching, test baselines, per-fix commits, automatic rollback on failure, and post-fix re-scanning.


Install

npx skills add codexstar69/bug-hunter

Or via npm:

npm install -g @codexstar/bug-hunter
bug-hunter install     # auto-detects your IDE/agent
bug-hunter doctor      # verify environment

Or clone:

git clone https://github.com/codexstar69/bug-hunter.git ~/.agents/skills/bug-hunter

Requirements: Node.js 18+ recommended. Core pipeline works without it.

Compatible with: Claude Code, Cursor, Codex CLI, Windsurf, Kiro, Copilot, Opencode, Pi — or any AI agent that can read files and run shell commands.


Bug Hunter PR review workflow — pull request scope analysis, security checks, threat-model context, and final verdict

Quick Start

/bug-hunter                      # scan project, auto-fix confirmed bugs
/bug-hunter src/                 # scan a specific directory
/bug-hunter --scan-only src/     # report only, no code changes
/bug-hunter --pr                 # review the current pull request
/bug-hunter --pr-security        # PR security review + threat model + CVEs
/bug-hunter --deps --threat-model # full security audit

How Adversarial AI Code Review Works

Bug Hunter adversarial AI code review pipeline — triage, recon, deep scan, skeptic challenge, referee verdict, auto-fix with canary rollout

Triage  → Recon → Hunter → Skeptic → Referee → Fix Plan → Fixer → Verify
  (<2s)              ↕ doc verify    ↕ doc verify
  1. Triage — classifies every file by risk in <2 seconds, zero AI tokens
  2. Recon — maps tech stack, auth mechanisms, attack surfaces
  3. Hunter — deep behavioral scan for logic errors, security vulnerabilities, race conditions
  4. Skeptic — adversarial review that tries to disprove every finding with counter-evidence
  5. Referee — independent final judge, re-reads code, delivers CVSS-scored verdicts
  6. Fixer — applies canary-first patches with per-fix rollback safety

Claims are verified against official library documentation via Context Hub + Context7 before any agent asserts framework behavior.

Bug Hunter documentation verification — agents check official library docs via Context Hub and Context7 API before making claims about framework behavior

Adversarial AI code review debate — Hunter finds bugs, Skeptic challenges with counter-evidence, Referee delivers independent verdict

AgentRewarded ForPenalized For
HunterConfirmed real bugsFalse positives
SkepticDisproving false positivesMissing real bugs (2x penalty)
RefereeAccurate independent verdictsBlind trust in either side

Bugs and Vulnerabilities Detected

Runtime behavioral bugs only — not style, naming, or TODOs:

  • Security — SQL injection, XSS, command injection, path traversal, IDOR, auth bypass, SSRF
  • Logic — wrong comparisons, off-by-one, inverted conditions, unreachable branches
  • Concurrency — race conditions, TOCTOU, deadlocks
  • Error handling — swallowed exceptions, unhandled promise rejections
  • Data integrity — silent truncation, encoding corruption, resource leaks
  • API contracts — type mismatches, incorrect callback signatures

Every security finding gets STRIDE classification, CWE ID, and CVSS 3.1 scoring with proof-of-concept payloads.

Bug Hunter security finding card — bug ID, severity badge, STRIDE and CWE classification, CVSS 3.1 score, reachability rating, and proof of concept payload

Supported Languages and Frameworks

Languages: TypeScript, JavaScript, Python, Go, Rust, Java, Kotlin, Ruby, PHP

Frameworks: Express, Next.js, Django, Flask, FastAPI, Gin, Spring Boot, Rails, Laravel — and any framework with docs in Context7.


Safe Auto-Fix with Canary Rollout

Bug Hunter strategic fix planning — confidence gating, canary rollout, per-fix verification, and automatic rollback safety

When bugs are confirmed, the Fixer engineers patches — not just edits:

  1. Git branch — dedicated fix branch with restore point
  2. Test baseline — captures passing tests before any edits
  3. Strategy — classifies each bug: safe-autofix / manual-review / larger-refactor / architectural
  4. Confidence gate — only auto-fixes bugs confirmed at >=75% confidence
  5. Canary rollout — top 1-3 Critical bugs fixed first; if tests break, pipeline halts
  6. Per-fix checkpoint — each fix committed individually; failures auto-revert
  7. Post-fix re-scan — catches bugs the Fixer itself introduced

--plan-only to see the strategy. --dry-run to preview diffs. --safe to approve each fix.


Security Scanning Features

Bug Hunter bundled security pack — commit security scan, enterprise security review, STRIDE threat model generation, and vulnerability validation

FlagCapability
--threat-modelSTRIDE threat model generation
--depsDependency CVE scanning (npm, pip, go, cargo, bun) with reachability analysis
--pr-securityPR-scoped security review with threat model + CVE context
--security-reviewEnterprise security audit workflow
--validate-securityExploitability validation with CVSS scoring

Bundled security skills: commit-security-scan, security-review, threat-model-generation, vulnerability-validation.


CLI Flags

FlagBehavior
(no flags)Scan + auto-fix confirmed bugs
src/ or file.tsScan specific path
--scan-only / --reviewReport only, no edits
--fix --approve / --safeApprove each fix interactively
--plan-only / --planGenerate fix strategy without editing
--dry-run / --previewPreview fixes as diffs
-b branchScan branch diff vs main
--pr / --pr 123 / --pr recentReview a pull request
--stagedScan staged files (pre-commit hook)
--depsDependency CVE scan
--threat-modelSTRIDE threat model
--no-loopSingle-pass scan (loop is on by default)
--autonomousZero-intervention auto-fix

All flags compose: /bug-hunter --deps --threat-model --fix src/


Output Files

Bug Hunter machine-readable output — findings JSON, skeptic challenges, referee verdicts, fix strategy, fix plan, and CI/CD automation

Every run creates .bug-hunter/ (add to .gitignore):

FilePurpose
findings.jsonMachine-readable findings (CI/CD gating, dashboards)
report.mdHuman-readable report
referee.jsonFinal verdicts with CVSS scores and PoC payloads
fix-strategy.jsonRemediation classification
fix-plan.jsonCanary rollout execution plan
fix-report.jsonFix results
triage.jsonFile risk classification
threat-model.mdSTRIDE threat model
dep-findings.jsonDependency CVE results

Self-Test

Ships with 6 planted bugs and 113 regression tests:

/bug-hunter test-fixture/     # validate pipeline
npm test                      # run test suite

Project Structure

bug-hunter/
├── SKILL.md              # Pipeline orchestration
├── bin/bug-hunter         # CLI installer
├── skills/                # 10 agent skills (hunter, skeptic, referee, fixer, recon, + 5 more)
├── modes/                 # Execution strategies (single-file → large-codebase)
├── schemas/               # JSON artifact contracts
├── scripts/               # Node.js helpers + 113 tests
├── templates/             # Subagent dispatch template
└── test-fixture/          # 6 planted bugs for validation

License

MIT