
June 9, 2026

ChunkHound

Local-first codebase intelligence


Your AI assistant searches code but doesn't understand it. ChunkHound researches your codebase, extracting architecture, patterns, and institutional knowledge at any scale. It integrates with your assistant via MCP.

Features

  • cAST Algorithm - Research-backed semantic code chunking
  • Multi-Hop Semantic Search - Discovers interconnected code relationships beyond direct matches
  • Semantic search - Natural language queries like "find authentication code"
  • Regex search - Pattern matching without API keys
  • Local-first - Your code stays on your machine
  • 32 languages with structured parsing
    • Programming (via Tree-sitter): Python, JavaScript, TypeScript, JSX, TSX, Java, Kotlin, Groovy, C, C++, C#, Go, Rust, Haskell, Swift, Bash, MATLAB, Makefile, Objective-C, PHP, Dart, Lua, Vue, Svelte, Zig
    • Configuration: JSON, YAML, TOML, HCL, Markdown
    • Text-based (custom parsers): Text files, PDF
  • MCP integration - Works with Claude, VS Code, Cursor, Windsurf, Zed, and other MCP clients
  • Real-time indexing - Automatic file watching, smart diffs, seamless branch switching, and explicit backend selection (watchdog, watchman, polling)

Documentation

Visit chunkhound.ai for documentation.

Requirements

Installation

# Install uv if needed
curl -LsSf https://astral.sh/uv/install.sh | sh

# Install ChunkHound
uv tool install chunkhound

Quick Start

  1. Create .chunkhound.json in the project root
{
  "embedding": {
    "provider": "voyageai",
    "api_key": "your-voyageai-key"
  },
  "llm": {
    "provider": "claude-code-cli"
  }
}

Note: Use "codex-cli" instead if you prefer Codex. Both work equally well and require no API key.
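
To avoid pasting the key into the file by hand, the same configuration could be generated from an environment variable. This is a minimal sketch: the helper name `write_chunkhound_config` and the `VOYAGE_API_KEY` variable name are assumptions for illustration, not something ChunkHound defines. The JSON keys mirror the example above.

```python
import json
import os
from pathlib import Path


def write_chunkhound_config(project_root: str, llm_provider: str = "claude-code-cli") -> Path:
    """Write .chunkhound.json with the embedding key taken from the environment."""
    # VOYAGE_API_KEY is an assumed variable name chosen for this sketch.
    api_key = os.environ.get("VOYAGE_API_KEY")
    if not api_key:
        raise RuntimeError("set VOYAGE_API_KEY before generating the config")

    # Same structure as the hand-written example in step 1.
    config = {
        "embedding": {"provider": "voyageai", "api_key": api_key},
        "llm": {"provider": llm_provider},
    }
    path = Path(project_root) / ".chunkhound.json"
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return path
```

Passing `llm_provider="codex-cli"` produces the Codex variant mentioned in the note. Since the file then contains a secret, it is worth keeping it out of version control.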

  2. Index your codebase
chunkhound index
  3. Search changed code in recent commits
# Last N commits
chunkhound search "authentication changes" --last-n 20

# Changes introduced by a single commit (diffed against its parent; root commits are diffed against the empty tree)
chunkhound search "database migration" --commit-hash abc1234

# Custom git range
chunkhound search "API changes" --commit-range v2.0..HEAD

# Deep research over recent changes
chunkhound research "what changed in the auth module?" --last-n 50

The --vector-source option controls search scope: diff (default; changed code only), both (merges diff results with the indexed database), or db (ignores the diff and uses the indexed database only).
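
When these searches are scripted, it can help to assemble the command line in one place. The helper below is a hypothetical sketch that uses only the flags shown above (`--last-n`, `--commit-hash`, `--commit-range`, `--vector-source`). It assumes the flag value is passed as a separate argument and conservatively allows one git scope per call, since this README does not say whether scopes can be combined.

```python
import shlex


def build_search_command(query, *, last_n=None, commit_hash=None,
                         commit_range=None, vector_source=None):
    """Assemble argv for `chunkhound search` with at most one git scope."""
    scopes = [
        ("--last-n", last_n),
        ("--commit-hash", commit_hash),
        ("--commit-range", commit_range),
    ]
    chosen = [(flag, value) for flag, value in scopes if value is not None]
    if len(chosen) > 1:
        # Assumption of this sketch: scopes are treated as mutually exclusive.
        raise ValueError("pick a single git scope")

    argv = ["chunkhound", "search", query]
    for flag, value in chosen:
        argv += [flag, str(value)]

    if vector_source is not None:
        if vector_source not in ("diff", "both", "db"):
            raise ValueError(f"unknown vector source: {vector_source}")
        argv += ["--vector-source", vector_source]
    return argv


def as_shell(argv):
    """Render argv as a copy-pasteable shell command."""
    return shlex.join(argv)
```

The returned list can be handed to `subprocess.run(argv, check=True)`, which avoids shell quoting problems with queries that contain spaces.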

For configuration, IDE setup, and advanced usage, see the documentation.

Why ChunkHound?

| Approach | Capability | Scale | Maintenance |
|---|---|---|---|
| Keyword Search | Exact matching | Fast | None |
| Traditional RAG | Semantic search | Scales | Re-index files |
| Knowledge Graphs | Relationship queries | Expensive | Continuous sync |
| ChunkHound | Semantic + Regex + Code Research | Automatic | Incremental + realtime |

Ideal for:

  • Large monorepos with cross-team dependencies
  • Security-sensitive codebases (local-only, no cloud)
  • Multi-language projects needing consistent search
  • Offline/air-gapped development environments

License

MIT