How Probe Compares to Other Code Search Tools

March 8, 2026

Probe occupies a unique position in the code search landscape. This page explains how it differs from other approaches and when each tool is the right choice.

Code search tools generally fall into three camps:

  1. Text-based (grep, ripgrep) -- fast regex matching, no code understanding
  2. Embedding-based (grepai, Octocode) -- vector similarity search, requires indexing + embedding model
  3. AST-aware + keyword (Probe) -- structural code understanding with boolean keyword search, zero setup

Probe takes the third path. It uses tree-sitter to understand code structure (returning complete functions, classes, and structs), combined with Elasticsearch-style boolean queries and BM25 ranking. No indexing, no embedding model, no external services.
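To make the BM25 ranking mentioned above concrete, here is a minimal sketch of the standard Okapi BM25 formula. It is not Probe's actual ranking code: the whitespace tokenization, the parameters k1=1.2 and b=0.75, and the three sample "documents" are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each tokenized document against the query with Okapi BM25."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs

    # Document frequency: how many documents contain each term.
    df = Counter()
    for d in docs:
        df.update(set(d))

    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if tf[term] == 0:
                continue
            # Rare terms get a higher inverse document frequency.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates and is normalized by document length.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Hypothetical code blocks, already split into tokens.
docs = [
    "fn verify_credentials user password".split(),
    "fn render_page template context".split(),
    "fn login user verify_credentials session".split(),
]
scores = bm25_scores(["verify_credentials", "login"], docs)
ranked = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
print(ranked)
```

The block that matches both query terms ranks first, the block that matches one term ranks second, and the block with no matching term scores zero.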

Detailed Comparison

Probe vs Embedding-Based Search (grepai, Octocode)

Tools like grepai and Octocode convert code into vector embeddings and use cosine similarity to find semantically similar code.
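As a rough illustration of the cosine-similarity lookup described above, the sketch below compares a query vector against two stored vectors. The three-dimensional vectors and the chunk names are made up for the example; real embedding models produce vectors with hundreds or thousands of dimensions, and this is not code from grepai or Octocode.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical embeddings: one for the query, one per indexed chunk.
query_vec = [0.9, 0.1, 0.3]
chunks = {
    "verify_credentials": [0.8, 0.2, 0.4],
    "render_page": [0.1, 0.9, 0.2],
}

# The chunk whose vector points in the most similar direction wins.
best = max(chunks, key=lambda name: cosine_similarity(query_vec, chunks[name]))
print(best)
```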

Their advantage: Natural language queries work without exact keyword matches. Searching "authentication flow" can find code named verify_credentials.

Why Probe doesn't need embeddings: When an AI agent uses Probe, the LLM is the semantic layer. It translates natural language into precise keyword queries:

User: "find the authentication logic"
  -> LLM generates: probe search "verify_credentials OR authenticate OR login OR auth_handler"
  -> Probe: SIMD-accelerated matching, complete AST blocks, milliseconds
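
The query-generation step in the flow above could be sketched like this. The helper name `build_probe_command` is hypothetical, and the candidate terms are hard-coded here where an agent would take them from the LLM's output. The only Probe syntax used is the `probe search "<query>"` form shown above.

```python
import shlex


def build_probe_command(candidate_terms):
    """Join candidate identifiers into one OR query for `probe search`."""
    # Drop duplicates while keeping the order the terms were suggested in.
    unique_terms = []
    for term in candidate_terms:
        if term not in unique_terms:
            unique_terms.append(term)
    query = " OR ".join(unique_terms)
    return ["probe", "search", query]


# Terms an LLM might propose for "find the authentication logic".
terms = ["verify_credentials", "authenticate", "login", "auth_handler", "login"]
argv = build_probe_command(terms)

# shlex.join quotes the query so it survives as a single shell argument.
print(shlex.join(argv))
```

Returning an argument list rather than a single string means the command can be passed straight to a process launcher without shell-quoting problems.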

| | Embedding tools | Probe |
|---|---|---|
| Setup | Minutes (indexing + embedding API) | Zero |
| Result unit | ~512-char text chunks (can split mid-function) | Complete AST blocks (functions, classes, structs) |
| External deps | Ollama, OpenAI, or cloud embedding API | None |
| Search latency | 100ms+ (embedding + vector lookup) | Milliseconds (SIMD pattern matching) |
| Determinism | Varies with model/index state | Same query = same results |
| Index maintenance | Re-index on code changes (or risk stale results) | No index (always current) |
| Best for | Human users typing natural language | AI agents generating precise boolean queries |
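
The difference in result unit can be shown with a small sketch. Probe uses tree-sitter; the example below substitutes Python's standard-library `ast` module as a stand-in, and uses a 100-character chunk size instead of ~512 so the effect is visible on a tiny sample. The sample source and both helper functions are assumptions for illustration.

```python
import ast

SOURCE = '''\
def verify_credentials(user, password):
    """Check a password against the stored hash."""
    stored = lookup_hash(user)
    return stored is not None and stored == hash_password(password)


def render_page(template, context):
    return template.format(**context)
'''


def fixed_size_chunks(text, size):
    """Split text every `size` characters, ignoring code structure."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def function_blocks(text):
    """Return each top-level function as one complete block of source."""
    tree = ast.parse(text)
    return [
        ast.get_source_segment(text, node)
        for node in tree.body
        if isinstance(node, ast.FunctionDef)
    ]


chunks = fixed_size_chunks(SOURCE, 100)
blocks = function_blocks(SOURCE)

print(repr(chunks[0][-30:]))  # the first chunk stops partway through the function
print(blocks[0])              # the first block is the whole function
```

Each entry in `blocks` parses on its own as valid code, while the first fixed-size chunk ends before the function's `return` statement.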

Probe vs Code Knowledge Graphs (Stakgraph, ABCoder)

Tools like Stakgraph and ABCoder build structural representations of codebases -- call graphs, dependency edges, type hierarchies.

Their advantage: They answer structural questions: "Who calls this function?", "What implements this interface?", "What's the shortest path between these two modules?"
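
A question like "Who calls this function?" amounts to a reverse lookup over call edges. The sketch below builds such edges for a tiny sample using Python's standard-library `ast` module. It only tracks direct calls by plain name and is far simpler than what Stakgraph or ABCoder do; the sample source and the `build_callers` helper are hypothetical.

```python
import ast
from collections import defaultdict

SOURCE = '''\
def hash_password(p):
    return p[::-1]

def verify_credentials(user, password):
    return hash_password(password) == user

def login(user, password):
    return verify_credentials(user, password)
'''


def build_callers(text):
    """Map each called function name to the set of functions that call it."""
    callers = defaultdict(set)
    for func in ast.parse(text).body:
        if not isinstance(func, ast.FunctionDef):
            continue
        for node in ast.walk(func):
            # Only direct calls by plain name are tracked in this sketch.
            if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
                callers[node.func.id].add(func.name)
    return callers


callers = build_callers(SOURCE)
print(sorted(callers["verify_credentials"]))
```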

Probe's advantage: Zero setup, instant search, AST-aware output optimized for LLM consumption.

| | Graph tools | Probe |
|---|---|---|
| Call graph | Yes (function-level edges) | Planned (via LSP integration) |
| Dependency analysis | Yes (typed relationships) | Not yet |
| Code search | Limited (node name lookup) | Full-featured (boolean queries, BM25, ranking) |
| Setup | Heavy (Neo4j, batch parsing, LSP servers) | Zero |
| Token awareness | Limited | Built-in (--max-tokens, session dedup) |
| Real-time | Requires rebuild on changes | Always current (stateless) |

These tools are complementary. Probe finds code; graph tools map relationships.

Probe vs LSP-Based Tools (Crabviz)

Crabviz uses the Language Server Protocol to build interactive call graph visualizations in VS Code.

Its advantage: Works with any language that has an LSP server (60+), and produces interactive SVG visualizations.

Probe's advantage: Works outside VS Code, has search capabilities, and integrates with AI agents.

| | Crabviz | Probe |
|---|---|---|
| Environment | VS Code only | CLI, MCP, SDK, any editor |
| Call graph | Yes (via LSP) | Planned |
| Search | None | Full-featured |
| AI integration | None | Full agent loop + MCP |
| Visualization | Interactive SVG with pan/zoom | Text-based outline format |

Probe vs grep/ripgrep

| | grep/ripgrep | Probe |
|---|---|---|
| Speed | Fast | Fast (uses ripgrep + SIMD) |
| Code understanding | None (text only) | AST-aware (tree-sitter) |
| Result unit | Lines | Complete functions/classes |
| Query language | Regex | Elasticsearch-style boolean |
| Ranking | None (file order) | BM25, TF-IDF, Hybrid |
| AI integration | None | MCP, SDK, built-in agent |
| Token limits | None | --max-tokens, session dedup |

Probe's Design Philosophy

  1. Zero setup, instant results. No indexing, no embedding models, no databases. Clone a repo, search immediately.

  2. The LLM is the semantic layer. Instead of building an embedding index for natural language queries, Probe gives the LLM a powerful query language and lets it generate precise searches. This is faster, cheaper, and more deterministic.

  3. Code is code, not text. Every result is a complete AST block -- a full function, class, or struct. Never a broken text chunk that splits a function in half.

  4. Token-aware by design. --max-tokens enforces budgets. Session dedup prevents repeating previously returned blocks. Output formats are optimized for LLM consumption.

  5. Deterministic and reproducible. No model variance, no stale indexes, no non-deterministic similarity scores. Same query always returns the same results.
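
The token budget and session dedup described in point 4 can be sketched as a simple selection loop. This is an illustration of the idea, not Probe's implementation: the `select_blocks` helper, the block identifiers, and the whitespace-based token estimate are all assumptions.

```python
def select_blocks(ranked_blocks, max_tokens, seen):
    """Take blocks in rank order, skipping ones already returned this session."""
    selected = []
    used = 0
    for block_id, text in ranked_blocks:
        if block_id in seen:
            continue  # session dedup: this block was returned earlier
        cost = len(text.split())  # crude token estimate (assumption)
        if used + cost > max_tokens:
            break  # budget reached: stop rather than exceed the limit
        selected.append(block_id)
        seen.add(block_id)
        used += cost
    return selected


# Hypothetical ranked results: (block id, block text).
ranked = [
    ("auth.rs:verify_credentials", "fn verify_credentials user password"),
    ("auth.rs:login", "fn login user"),
    ("web.rs:render_page", "fn render_page template context extra"),
]

session = set()
first = select_blocks(ranked, max_tokens=7, seen=session)
second = select_blocks(ranked, max_tokens=7, seen=session)
print(first)
print(second)
```

The second call with the same session skips the two blocks already returned, so the budget is spent on the next unseen result.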

When to Use What

| Scenario | Best tool |
|---|---|
| AI agent needs code context, any repo, instantly | Probe |
| Human searching code with natural language, doesn't know the terms | Embedding tool (grepai, Octocode) |
| "Who calls this function?" / "What implements this interface?" | Graph tool (Stakgraph) or Probe with LSP (coming soon) |
| Visualize call graph of a module | Crabviz |
| Give an LLM structured code context with minimal tokens | Probe or ABCoder |
| AI-assisted git workflow (commit, review, release) | Octocode |
| Simple text search in terminal | ripgrep |