CodeMap
January 31, 2026
A lightweight index that makes LLM code exploration cheaper, not smarter.
CodeMap does not try to understand your code, infer architecture, or decide what's relevant. That job belongs to the LLM.
CodeMap exists for one reason:
To make each step of an LLM's reasoning over a codebase cost fewer tokens.
Quick Start • How It Works • Commands • Claude Plugin • Comparison

The Problem
LLMs explore codebases iteratively. They:
- Think about what they need
- Read some code
- Think again
- Read more code
- Repeat
The problem is that reading code is expensive.
Without help, an LLM often has to:
- Read entire files
- Re-read the same files after context resets
- Pull in large chunks "just in case"
This quickly leads to massive token usage, even when the LLM needs only a small part of each file.
The Insight
LLMs don't need less reasoning. They need cheaper reads.
If you make each "read code" step smaller and more precise, the same reasoning process becomes dramatically cheaper.
The bottleneck is not intelligence; it's I/O cost.
That's what CodeMap fixes.
What CodeMap Is (and Is Not)
What CodeMap is
- A structural index of your codebase
- A fast way to locate symbols and their exact line ranges
- A tool that lets an LLM jump directly to relevant snippets
- A cost-reduction layer for iterative LLM reasoning
What CodeMap is not
- Not a semantic analyzer
- Not an architecture inference engine
- Not a replacement for LSPs
- Not an agent
- Not "smart"
CodeMap does not decide what code matters. It only makes it cheaper to read the code the LLM decides to look at.
How This Changes LLM Code Exploration
Without CodeMap
LLM thinks
→ reads 5 full files (~30K tokens)
→ thinks
→ reads 3 more full files (~18K tokens)
Total: ~48K tokens
With CodeMap
LLM thinks
→ queries symbols → reads 5 targeted snippets (~5K tokens)
→ thinks
→ queries again → reads 3 more snippets (~3K tokens)
Total: ~8K tokens
Same reasoning. Same conclusions. ~83% fewer tokens.
The LLM can always escalate: snippet → larger slice → full file. CodeMap never blocks access; it just makes precision cheap.
Measured Impact
The savings compound across a session:
| Scenario | Without CodeMap | With CodeMap | Savings |
|---|---|---|---|
| Single class lookup | 1,700 tokens | 1,000 tokens | 41% |
| 10-file refactor | 51,000 tokens | 11,600 tokens | 77% |
| 50-turn coding session | 70,000 tokens | 21,000 tokens | 70% |
It's not about any single lookup. It's about making every lookup cheaper and letting those savings multiply.
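The Savings column follows directly from the two token counts in each row. A minimal sketch of that arithmetic, using the figures from the table above (the function name `savings_percent` is illustrative and not part of CodeMap):

```python
def savings_percent(without_tokens: int, with_tokens: int) -> float:
    """Return the percentage of tokens saved relative to the baseline."""
    return 100 * (without_tokens - with_tokens) / without_tokens


# Token counts copied from the table above.
scenarios = {
    "Single class lookup": (1_700, 1_000),
    "10-file refactor": (51_000, 11_600),
    "50-turn coding session": (70_000, 21_000),
}

for name, (without, with_codemap) in scenarios.items():
    print(f"{name}: {savings_percent(without, with_codemap):.1f}% saved")
```

The same formula gives the ~83% figure for the 48K versus 8K walkthrough earlier.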
Quick Start
Install via pip or uv
pip install git+https://github.com/AZidan/codemap.git
uv tool install codemap --from https://github.com/AZidan/codemap.git
Use
codemap init .
codemap watch . & # Keep index updated in background
codemap find "ClassName"
# → src/file.py:15-89 [class] ClassName
# Now the LLM reads only lines 15-89 instead of the entire file
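To make the "read only lines 15-89" step concrete, here is a rough sketch of how a caller might turn a result line into a targeted read. The helpers `parse_location` and `read_lines` are hypothetical (they are not part of the CodeMap CLI), and the sketch assumes line ranges are 1-based and inclusive:

```python
import re
from pathlib import Path

# Matches the leading "path:start-end" part of a `codemap find` result line.
LOCATION_RE = re.compile(r"^(?P<path>.+?):(?P<start>\d+)-(?P<end>\d+)\b")


def parse_location(result_line: str) -> tuple[str, int, int]:
    """Split 'src/file.py:15-89 [class] ClassName' into (path, start, end)."""
    match = LOCATION_RE.match(result_line.strip())
    if match is None:
        raise ValueError(f"unrecognized result line: {result_line!r}")
    return match["path"], int(match["start"]), int(match["end"])


def read_lines(path: str, start: int, end: int) -> str:
    """Return lines start..end of a file (assumed 1-based, inclusive)."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return "\n".join(lines[start - 1 : end])
```

A caller would then run `read_lines(*parse_location(line))` for each result it wants to inspect.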
How It Works
- CodeMap scans your repository and builds a symbol index
- Each symbol is mapped to:
  - File path
  - Start line / end line
  - Type (function, class, method, etc.)
  - Signature and docstring (optional)
- The index is stored locally under .codemap/
- An LLM (or human) can:
  - Search for symbols by name
  - Read only the exact lines needed
  - Check if files changed without re-reading them
  - Repeat as part of its reasoning loop
No embeddings. No inference. No opinions.
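The symbol-to-line-range mapping can be illustrated with Python's standard `ast` module, which the Supported Languages table lists as the Python parser. This is only a sketch of the idea, not CodeMap's actual indexer: the function name `index_source` and the dictionary layout are assumptions, and it only descends through class and function bodies.

```python
import ast


def index_source(source: str) -> list[dict]:
    """Map each class, function, and method to its 1-based line range."""
    symbols: list[dict] = []

    def visit(node: ast.AST, inside_class: bool) -> None:
        for child in ast.iter_child_nodes(node):
            if isinstance(child, ast.ClassDef):
                kind = "class"
            elif isinstance(child, ast.AsyncFunctionDef):
                kind = "async_method" if inside_class else "async_function"
            elif isinstance(child, ast.FunctionDef):
                kind = "method" if inside_class else "function"
            else:
                continue  # Sketch only: skip everything that is not a definition.
            symbols.append({
                "name": child.name,
                "type": kind,
                "lines": [child.lineno, child.end_lineno],
                "docstring": ast.get_docstring(child),
            })
            # Descend so methods (and nested definitions) are indexed too.
            visit(child, inside_class=isinstance(child, ast.ClassDef))

    visit(ast.parse(source), inside_class=False)
    return symbols


example = '''class UserService:
    """Handles user operations"""

    def get_user(self, user_id):
        return user_id

async def main():
    pass
'''

for symbol in index_source(example):
    print(symbol["name"], symbol["type"], symbol["lines"])
```

Nothing here interprets the code: the parser reports where each definition starts and ends, and that is all the index needs.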
Commands
codemap init [PATH]
Build the index for a directory.
codemap init # Index current directory
codemap init ./src # Index specific directory
codemap init -l python # Only Python files
codemap init -e "**/tests/**" # Exclude patterns
codemap find QUERY
Find symbols by name (case-insensitive substring match).
codemap find "UserService" # Find by name
codemap find "process" --type method # Filter by type
codemap find "handle" --type function # Functions only
Output:
src/services/user.py:15-89 [class] UserService
src/services/user.py:20-45 [method] process_request
Fuzzy Search
Use --fuzzy (-f) for broader matching when exact/substring search isn't enough. Fuzzy search adds:
- Word-level matching: splits on spaces, hyphens, and underscores
- Filename matching: searches file names in addition to symbols
- Docstring matching: searches symbol documentation
- Typo tolerance: finds close matches using similarity scoring
Results are ranked by match quality (exact > substring > word overlap > fuzzy similarity).
codemap find "user service" --fuzzy # Word-level match
codemap find "pricng" --fuzzy # Typo tolerance
codemap find "monetization" --fuzzy # Search docstrings
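The ranking order described above (exact > substring > word overlap > fuzzy similarity) could be sketched as follows. This is an illustration under stated assumptions, not CodeMap's scoring code: it uses `difflib.SequenceMatcher` from the standard library for similarity, an arbitrary 0.7 threshold, and hypothetical function names, and it matches symbol names only (no file names or docstrings).

```python
import re
from difflib import SequenceMatcher

# Rank tiers, best first, following the order stated above.
EXACT, SUBSTRING, WORD_OVERLAP, FUZZY = 0, 1, 2, 3


def split_words(text: str) -> set[str]:
    """Lowercase and split on spaces, hyphens, and underscores."""
    return {word for word in re.split(r"[\s\-_]+", text.lower()) if word}


def rank(query: str, name: str, threshold: float = 0.7):
    """Return (tier, -similarity) for a match, or None if nothing matches."""
    q, n = query.lower(), name.lower()
    similarity = SequenceMatcher(None, q, n).ratio()
    if q == n:
        return (EXACT, -similarity)
    if q in n:
        return (SUBSTRING, -similarity)
    if split_words(query) & split_words(name):
        return (WORD_OVERLAP, -similarity)
    if similarity >= threshold:  # Assumed cutoff for typo tolerance.
        return (FUZZY, -similarity)
    return None


def fuzzy_find(query: str, names: list[str]) -> list[str]:
    """Return matching names, best match first."""
    scored = [(rank(query, name), name) for name in names]
    return [name for _, name in sorted(s for s in scored if s[0] is not None)]
```

Because the tier is the first element of the sort key, a weak exact-tier match never ranks below a strong fuzzy one, which keeps results deterministic.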
codemap show FILE
Display file structure with symbols and line ranges.
codemap show src/services/user.py
Output:
File: src/services/user.py (hash: a3f2b8c1d4e5)
Lines: 542
Language: python
Symbols:
- UserService [class] L15-189
(self, config: Config)
# Handles user operations
- __init__ [method] L20-35
- get_user [method] L37-98
(self, user_id: int) -> User
- create_user [async_method] L100-145
(self, data: dict) -> User
codemap validate [FILE]
Check whether indexed files have changed, without re-reading them.
codemap validate # Check all files
codemap validate src/main.py # Check specific file
Output:
Stale entries (2):
- src/utils/helpers.py
- src/models/user.py
Run 'codemap update --all' to refresh
This is where hash-based staleness detection saves tokens. The LLM can check if a file changed without paying to read it again.
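A hash-based staleness check can be sketched in a few lines. The project structure notes SHA256 file hashing; everything else here is an assumption for illustration, including the function names, the `indexed_hashes` mapping, and truncating the digest to 12 hex characters (guessed from the example hashes shown in this README).

```python
import hashlib
from pathlib import Path


def file_hash(path: Path, length: int = 12) -> str:
    """Return a truncated SHA-256 hex digest of the file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()[:length]


def stale_files(root: Path, indexed_hashes: dict[str, str]) -> list[str]:
    """List indexed files whose current hash differs or that no longer exist."""
    stale = []
    for relative_path, stored_hash in indexed_hashes.items():
        path = root / relative_path
        if not path.is_file() or file_hash(path) != stored_hash:
            stale.append(relative_path)
    return sorted(stale)
```

The comparison happens on disk, so the LLM receives only the short list of stale paths rather than the file contents.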
codemap update [FILE] [--all]
Update the index for changed files.
codemap update src/main.py # Update single file
codemap update --all # Update all stale files
codemap watch [PATH]
Watch for file changes and update index in real-time.
codemap watch # Watch current directory
codemap watch ./src # Watch specific directory
codemap watch -d 1.0 # 1 second debounce
codemap watch -q # Quiet mode
Output:
Watching /path/to/project for changes...
Press Ctrl+C to stop
[14:30:15] Updated main.py (2 symbols changed)
[14:30:22] Updated utils.py
[14:31:05] Added new_module.py (3 symbols)
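The -d option sets a debounce interval, so a burst of saves to one file triggers a single index update. This README does not describe how CodeMap implements it; the class below is a hypothetical sketch of one common approach, with timestamps passed in explicitly to keep it easy to test.

```python
class Debouncer:
    """Collect change events and release a path once it has been quiet."""

    def __init__(self, delay: float = 1.0) -> None:
        self.delay = delay
        self._last_event: dict[str, float] = {}

    def record(self, path: str, now: float) -> None:
        """Note that `path` changed at time `now` (in seconds)."""
        self._last_event[path] = now

    def ready(self, now: float) -> list[str]:
        """Return and forget paths with no events for at least `delay` seconds."""
        quiet = sorted(
            path for path, seen in self._last_event.items()
            if now - seen >= self.delay
        )
        for path in quiet:
            del self._last_event[path]
        return quiet
```

In a real watcher, `now` would typically come from `time.monotonic()`, and each path returned by `ready()` would be re-indexed once.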
codemap stats
Show statistics about the index.
codemap stats
Output:
CodeMap Statistics
========================================
Root: /path/to/project
Total files: 47
Total symbols: 382
Files by language:
python: 35
typescript: 10
javascript: 2
Symbols by type:
method: 245
function: 67
class: 42
async_method: 13
codemap install-hooks
Install git pre-commit hook for automatic updates.
codemap install-hooks
Claude Code Plugin
The plugin teaches Claude Code to use CodeMap automatically.
Installation
# Add the marketplace
claude plugin marketplace add AZidan/codemap
# Install the plugin
claude plugin install codemap
What Changes
Once installed, Claude will:
- Use codemap find to locate symbols instead of scanning files
- Read only the relevant line ranges instead of full files
- Use codemap validate to check staleness before re-reading
- Auto-install the CLI if not present
The LLM's reasoning doesn't change; each step just gets cheaper.
Manual Skill Installation
# Copy skill to your project
cp -r .claude/skills/codemap /path/to/your/project/.claude/skills/
See plugin/README.md for detailed documentation.
Installation
Claude Code (Recommended)
claude plugin marketplace add AZidan/codemap
claude plugin install codemap
pip Install
# Basic (Python only)
pip install git+https://github.com/AZidan/codemap.git
# With TypeScript/JavaScript support
pip install "codemap[treesitter] @ git+https://github.com/AZidan/codemap.git"
# All languages + watch mode
pip install "codemap[all] @ git+https://github.com/AZidan/codemap.git"
uv Install
# Basic (Python only)
uv tool install codemap --from https://github.com/AZidan/codemap.git
# With TypeScript/JavaScript support
uv tool install codemap --from https://github.com/AZidan/codemap.git --with codemap[treesitter]
# All languages + watch mode
uv tool install codemap --from https://github.com/AZidan/codemap.git --with codemap[all]
From Source
git clone https://github.com/azidan/codemap.git
cd codemap
pip install -e ".[all]"
Supported Languages
| Language | Parser | Install | Symbol Types |
|---|---|---|---|
| Python | stdlib ast | (included) | class, function, method, async_function, async_method |
| TypeScript | tree-sitter | see below | class, function, method, interface, type, enum |
| JavaScript | tree-sitter | see below | class, function, method, async_function, async_method |
| Kotlin | tree-sitter | see below | class, interface, function, method, object |
| Swift | tree-sitter | see below | class, struct, protocol, enum, function, method |
| PHP | tree-sitter | see below | class, interface, trait, enum, function, method |
| Go | tree-sitter | see below | function, method, struct, interface, type |
| Java | tree-sitter | see below | class, interface, enum, method |
| C# | tree-sitter | see below | class, interface, struct, enum, method, property |
| Rust | tree-sitter | see below | function, struct, enum, trait, impl, module |
| C | tree-sitter | see below | function, struct, enum, typedef |
| C++ | tree-sitter | see below | class, struct, function, method, namespace, enum, template |
| HTML | tree-sitter | see below | element (semantic), id |
| CSS | tree-sitter | see below | selector (class, id, element), media, keyframe |
| Markdown | regex | (included) | section (H2), subsection (H3), subsubsection (H4) |
| YAML | pyyaml | (included) | key, section, list |
# Install with specific language support
pip install "codemap[treesitter] @ git+https://github.com/AZidan/codemap.git" # TS/JS
pip install "codemap[kotlin] @ git+https://github.com/AZidan/codemap.git" # Kotlin
pip install "codemap[swift] @ git+https://github.com/AZidan/codemap.git" # Swift
pip install "codemap[php] @ git+https://github.com/AZidan/codemap.git" # PHP
pip install "codemap[go] @ git+https://github.com/AZidan/codemap.git" # Go
pip install "codemap[java] @ git+https://github.com/AZidan/codemap.git" # Java
pip install "codemap[csharp] @ git+https://github.com/AZidan/codemap.git" # C#
pip install "codemap[rust] @ git+https://github.com/AZidan/codemap.git" # Rust
pip install "codemap[c] @ git+https://github.com/AZidan/codemap.git" # C
pip install "codemap[cpp] @ git+https://github.com/AZidan/codemap.git" # C++
pip install "codemap[html] @ git+https://github.com/AZidan/codemap.git" # HTML
pip install "codemap[css] @ git+https://github.com/AZidan/codemap.git" # CSS
# Install all languages
pip install "codemap[languages] @ git+https://github.com/AZidan/codemap.git"
Language support is intentionally modular and extensible.
Configuration
Automatic .gitignore Support
CodeMap automatically respects your .gitignore file. Patterns from .gitignore are applied during indexing, so directories like node_modules/, .venv/, and dist/ are excluded without any configuration.
Custom Configuration
Create a .codemaprc file in your project root for additional options:
# Languages to index
languages:
- python
- typescript
- javascript
- php
# Additional patterns to exclude (on top of .gitignore)
exclude:
- "**/migrations/**"
- "**/fixtures/**"
# Patterns to include (optional)
include:
- "src/**"
- "lib/**"
# Disable .gitignore support if needed (default: true)
respect_gitignore: false
# Truncate long docstrings
max_docstring_length: 150
# Output directory (default: .codemap)
output: .codemap
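How include and exclude patterns combine can be sketched with the standard `fnmatch` module. This is an approximation, not CodeMap's matcher: the function name `is_indexed` is hypothetical, "exclude wins over include" is an assumed precedence, and `fnmatch` lets `*` cross `/` boundaries, so it is looser than gitignore-style globbing (for example, `**/migrations/**` here does not match a top-level `migrations/` directory).

```python
from fnmatch import fnmatch

# Mirrors the include/exclude keys of the .codemaprc example above.
config = {
    "include": ["src/**", "lib/**"],
    "exclude": ["**/migrations/**", "**/fixtures/**"],
}


def is_indexed(path: str, config: dict) -> bool:
    """Apply include patterns first, then drop anything excluded."""
    include = config.get("include") or ["**"]  # No include list: accept all.
    exclude = config.get("exclude") or []
    if not any(fnmatch(path, pattern) for pattern in include):
        return False
    return not any(fnmatch(path, pattern) for pattern in exclude)


for path in ["src/app/models.py", "src/app/migrations/0001_initial.py", "docs/guide.md"]:
    print(path, is_indexed(path, config))
```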
Output Format
Directory Structure
CodeMap uses distributed per-directory indexes for scalability:
project/
├── .codemap/
│   ├── .codemap.json              # Root manifest
│   ├── _root.codemap.json         # Files in project root
│   ├── src/
│   │   ├── .codemap.json          # Files in src/
│   │   └── components/
│   │       └── .codemap.json      # Files in src/components/
│   └── tests/
│       └── .codemap.json
├── src/
│   └── ...
└── tests/
    └── ...
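Under this layout, the index file for any source file can be derived from its directory alone. A small sketch of that mapping, inferred from the tree above (the function name `index_path_for` is hypothetical, and paths are assumed to be POSIX-style and relative to the project root):

```python
from pathlib import PurePosixPath


def index_path_for(source_file: str, output_dir: str = ".codemap") -> PurePosixPath:
    """Return the index file that would hold the entry for `source_file`."""
    parent = PurePosixPath(source_file).parent
    if parent == PurePosixPath("."):
        # Files directly in the project root share one index file.
        return PurePosixPath(output_dir) / "_root.codemap.json"
    return PurePosixPath(output_dir) / parent / ".codemap.json"


print(index_path_for("main.py"))
print(index_path_for("src/components/Button.tsx"))
```

Because each directory has its own index, a change to one file only requires rewriting one small JSON file.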
Index Format
Each .codemap.json contains:
{
"version": "1.0",
"generated_at": "2025-01-12T10:30:00Z",
"directory": "src",
"files": {
"main.py": {
"hash": "a3f2b8c1d4e5",
"indexed_at": "2025-01-12T10:30:00Z",
"language": "python",
"lines": 150,
"symbols": [
{
"name": "UserService",
"type": "class",
"lines": [10, 150],
"docstring": "Handles user operations",
"children": [
{
"name": "get_user",
"type": "method",
"lines": [25, 50],
"signature": "(self, user_id: int) -> User"
}
]
}
]
}
}
}
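Given this format, a name lookup is a walk over `files`, `symbols`, and nested `children`. The sketch below uses a trimmed copy of the example index; the function names are hypothetical, and joining `directory` with the file name to form the path is an assumption about how the fields relate.

```python
import json
from typing import Iterator

# A trimmed copy of the index example above.
INDEX_JSON = """
{
  "directory": "src",
  "files": {
    "main.py": {
      "symbols": [
        {
          "name": "UserService",
          "type": "class",
          "lines": [10, 150],
          "children": [
            {"name": "get_user", "type": "method", "lines": [25, 50]}
          ]
        }
      ]
    }
  }
}
"""


def iter_symbols(symbols: list[dict]) -> Iterator[dict]:
    """Yield every symbol, descending into nested "children" lists."""
    for symbol in symbols:
        yield symbol
        yield from iter_symbols(symbol.get("children", []))


def find_in_index(index: dict, query: str) -> list[str]:
    """Return 'path:start-end [type] name' for case-insensitive substring matches."""
    results = []
    for filename, entry in index["files"].items():
        for symbol in iter_symbols(entry["symbols"]):
            if query.lower() in symbol["name"].lower():
                start, end = symbol["lines"]
                path = f"{index['directory']}/{filename}"
                results.append(f"{path}:{start}-{end} [{symbol['type']}] {symbol['name']}")
    return results


print(find_in_index(json.loads(INDEX_JSON), "user"))
```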
When CodeMap Is a Good Fit
- Large repositories where context limits matter
- Long coding sessions where savings compound
- Refactoring tasks that touch many files
- Token-sensitive workflows where API costs matter
- 200K context models where every token counts
When CodeMap Is Not the Right Tool
- Small projects that fit entirely in context anyway
- Deep semantic analysis: use LSP tools instead
- Architecture inference: CodeMap doesn't infer anything
- 1M token contexts where limits rarely matter
CodeMap is deliberately simple.
Comparison with Alternatives
| Feature | CodeMap | Aider RepoMap | Serena | RepoPrompt |
|---|---|---|---|---|
| Approach | Lookup index | Summarization | Semantic (LSP) | Context building |
| Who decides relevance | LLM | Tool (PageRank) | Tool | Tool |
| Token cost model | Per-lookup | Upfront | Per-query | Upfront |
| Line-range precision | Exact | Approximate | Full symbols | Full files |
| Hash-based staleness | β | β | β | β |
| Watch mode | β | β | β | β |
| Setup complexity | Low | Medium | High | Low |
The key difference: other tools try to predict what context matters. CodeMap lets the LLM decide, and just makes each decision cheaper to act on.
Design Philosophy
Do one thing. Do it well. Stay dumb.
CodeMap is intentionally:
- Deterministic: same query, same results
- Transparent: just file paths and line numbers
- Predictable: no inference, no surprises
It is a primitive, not a framework.
Development
# Clone the repo
git clone https://github.com/azidan/codemap.git
cd codemap
# Create virtual environment
python -m venv .venv
source .venv/bin/activate
# Install with dev dependencies
pip install -e ".[all]"
# Run tests
pytest
# Run tests with coverage
pytest --cov=codemap
# Format code
black codemap
ruff check codemap
Project Structure
codemap/
├── cli.py                      # Click CLI commands
├── core/
│   ├── indexer.py              # Main indexing orchestrator
│   ├── hasher.py               # SHA256 file hashing
│   ├── map_store.py            # Distributed JSON storage
│   └── watcher.py              # File system watcher
├── parsers/
│   ├── base.py                 # Abstract parser interface
│   ├── treesitter_base.py      # Base for tree-sitter parsers
│   ├── python_parser.py        # Python AST parser (stdlib)
│   ├── typescript_parser.py
│   ├── javascript_parser.py
│   ├── kotlin_parser.py        # Kotlin tree-sitter parser
│   ├── swift_parser.py         # Swift tree-sitter parser
│   ├── php_parser.py           # PHP tree-sitter parser
│   ├── go_parser.py
│   ├── java_parser.py
│   ├── csharp_parser.py
│   ├── rust_parser.py
│   ├── c_parser.py             # C tree-sitter parser
│   ├── cpp_parser.py           # C++ tree-sitter parser
│   ├── html_parser.py          # HTML tree-sitter parser
│   ├── css_parser.py           # CSS tree-sitter parser
│   ├── markdown_parser.py      # Markdown regex parser
│   └── yaml_parser.py          # YAML parser
├── hooks/
│   └── installer.py            # Git hook installation
└── utils/
    ├── config.py               # Configuration management
    └── file_utils.py           # File discovery utilities
Contributing
Contributions welcome! Areas where help is needed:
- New language parsers: Ruby, PHP, Scala
- MCP server mode: for non-Claude tools
- Fuzzy symbol search: codemap find "usr srv" → UserService
- VSCode extension: GUI for non-CLI users
See CONTRIBUTING.md for guidelines.
Community & Support
- Bug reports: GitHub Issues
- Feature requests: GitHub Issues
- Questions: GitHub Discussions
- Like it? Star the repo!
License
MIT License. See LICENSE for details.
Acknowledgments
- Inspired by Aider's RepoMap concept
- Built with Click for CLI
- Uses tree-sitter for multi-language parsing
CodeMap: Because the bottleneck is I/O cost, not intelligence.