CLAUDE.md
April 1, 2026 · View on GitHub
This file provides guidance to Claude Code (claude.ai/code) when working with this tree-sitter parser for the AL (Application Language) programming language used in Microsoft Dynamics 365 Business Central.
Current Status: 100% production file success rate (15,358/15,358 files), 1404 tests passing, 0 errors
Git Commit Guidelines
Always include error count in commit messages to detect regressions:
Fix XYZ pattern
[BC.History: 7 errors, 99.95% success]
Run full parse before committing: ./parse-al-parallel.sh ./BC.History/ .
Quick Reference
Essential Commands:
# Validation (run before completing any task)
./validate-grammar.sh # Quick: generation, tests, orphan/duplicate detection
./validate-grammar.sh --full # Full: includes production AL file parsing
# Standard development cycle
tree-sitter generate # Generate parser from grammar.js
tree-sitter test # Run test suite
tree-sitter test -u # Update test expectations (only if no ERRORs)
tree-sitter parse file.al -d > debug.log 2>&1 # Debug specific files
python parse_bug_finder.py file.al debug.log # Analyze parsing bugs
Common Test Options:
-i "pattern"- Include tests matching pattern-e "pattern"- Exclude tests matching pattern--file-name "test.txt"- Run specific test file-d- Show debug log-D- Generate debug graphs (log.html)
Architecture
Core Files:
grammar.js- Main grammar definition (~3,400 lines). Never editsrc/parser.c(auto-generated)src/scanner.c- External scanner for property disambiguation and preprocessor patternstest/corpus/- Test suite with AL code and expected parse trees (1,404 tests)queries/- 5 query files (highlights, locals, tags, indents, folds)
Key Design Principles (V2 architecture):
- Parse structure, don't validate — Accept any
Name = Value ;as a property. Semantic validation belongs in linters/LSP servers, not the parser - Scanner-based property disambiguation — The
PROPERTY_NAMEscanner token distinguishesidentifier =(property) fromidentifier :(variable) via 1-char lookahead - Generic property rule — ONE
propertyrule handles all simple properties (vs V1's 291 individual rules) - Generic preprocessor — ONE
preproc_conditionalrule + ~12 dedicated split-construct rules (vs V1's 63) - Named keyword nodes — 82 keywords exposed as named nodes for query matching (including
begin_keyword,end_keyword) - Stateful scanner — 1-byte depth counter tracks
#if/#endifnesting;begin/endnamed at depth 0, anonymous at depth > 0
Scanner Tokens:
| Token | Purpose |
|---|---|
PROPERTY_NAME | identifier followed by = (not :=) — property/variable disambiguation |
CONTINUE_AS_IDENTIFIER | continue followed by := — used as variable name |
PREPROC_OPEN | #if — increments depth counter |
PREPROC_CLOSE | #endif — decrements depth counter |
BEGIN_KEYWORD | begin at depth 0 — named node for queries |
END_KEYWORD | end at depth 0 — named node for queries |
PREPROC_SPLIT_BEGIN | begin at depth > 0, immediately before #endif — split detection |
PREPROC_SPLIT_END | end at depth > 0, followed by ; then #else/#endif — split detection |
Property Handling
Properties use a generic rule — no per-property validation:
property: $ => seq(
field('name', $.property_name), // PROPERTY_NAME scanner token
'=',
field('value', $._property_value),
';'
),
Complex properties (~36 rules) have unique syntax and remain as individual rules:
- CalcFormula, TableRelation, Permissions, AccessByPermission
- DataItemLink, RunPageLink, SubPageLink, ColumnFilter
- SourceTableView (and related view properties)
- Caption/ToolTip (with Locked/Comment sub-fields)
- ML properties (multilingual key=value lists)
- List properties (comma-separated identifiers)
- DecimalPlaces, OrderBy, Implementation
Adding new property support: Most properties work automatically via the generic rule. Only add a dedicated rule if the property has syntax beyond Name = Expression ;.
Keyword Architecture
80 keywords are named rules for query matching:
if_keyword: $ => kw('if'),
table_keyword: $ => kw('table'),
procedure_keyword: $ => kw('procedure'),
begin/end are named via stateful scanner — begin_keyword and end_keyword are emitted at depth 0 (outside #if blocks). At depth > 0, the scanner declines and anonymous kw('begin')/kw('end') tokens handle preprocessor-split contexts. Direct naming via grammar rules or alias() still breaks GLR backtracking — the stateful scanner is the correct approach.
Named keyword node structure — Named keyword rules (e.g., exit_keyword) wrap anonymous string children ("exit"). When tree-walking into children, you hit the anonymous string — this is expected tree-sitter behavior, not a grammar bug. The named node is the parent; the literal text is an anonymous child.
CamelCase keywords use explicit case alternatives:
controladdin_keyword: $ => prec(10, choice('controladdin', 'CONTROLADDIN', 'Controladdin', 'ControlAddIn')),
Attribute Handling
Attributes are first-class statements (Rust/C# pattern) — siblings to declarations, not nested.
[Scope('OnPrem')]
[IntegrationEvent(false, false)]
procedure MyEvent() begin end;
Parse tree: (attribute_item ...) (procedure ...) — separate nodes at the same level.
Preprocessor Handling
Generic conditionals (most cases):
preproc_conditional: $ => seq($.preproc_if, repeat($._any_content), ...)
Dedicated split-construct rules (~12, for cross-branch fragments):
preproc_split_procedure— procedure header variants in#if/#elsepreproc_split_if_statement— if-then header varies across branchespreproc_split_if_then_begin—begininside#if,endin second#ifpreproc_fragmented_else_tail— end-else-begin fragmented across#ifblockspreproc_split_declaration— object declaration split across branches- And others for case statements, fields, datasets, etc.
Testing
Test Format (test/corpus/*.txt):
========================================================================
Test Description
========================================================================
[AL source code]
------------------------------------------------------------------------
(expected_parse_tree)
Guidelines:
- Never delete test files — fix the underlying issue
- Use
tree-sitter test -uonly if no ERROR/MISSING nodes exist - Create tests for each new grammar feature
- BC.History (15,358 production files) is the real validation gate — tests are a development aid
Debugging Parse Failures
# 1. Parse with debug output
tree-sitter parse file.al -d > debug.log 2>&1
# 2. Analyze with bug finder
python parse_bug_finder.py file.al debug.log
Available tools:
parse_bug_finder.py— Correlates bugs with source code (recommended)parse_debug_analyzer.py— Full parse flow analysis (advanced)
Grammar Development
Core Principles
- Parse structure, don't validate — Accept syntactically plausible code
- snake_case for rule names
kw('word')for case-insensitive keywords (regex-based)- Use
prec.left/right/precfor precedence; avoid left recursion
Adding New Constructs
- Study AL construct (use Business Central docs MCP)
- Check for existing patterns in grammar.js
- Add/modify rules (update
src/scanner.cif needed) - Create tests
- Run
./validate-grammar.sh - Validate against BC.History
Common Issues
| Pattern | Symptom | Fix |
|---|---|---|
| Missing construct | ERROR nodes | Add rule to _body_element or relevant choice list |
| Case-sensitivity | Keywords not matching | Use kw() or explicit choice() with case variants |
| Preprocessor splits | MISSING tokens in #if contexts | Add dedicated preproc_split_* rule |
| Property syntax | Complex property fails | Add dedicated complex property rule |
| Keyword as identifier | Variable name conflicts | Add to keyword_as_identifier choice list |
Parser Metrics
Note: These metrics are approximate and may drift as the grammar evolves. Verify with wc -c src/parser.c and grep -E 'SYMBOL_COUNT|STATE_COUNT' src/parser.c if precision matters.
| Metric | Value |
|---|---|
| parser.c size | 23.5 MB |
| SYMBOL_COUNT | ~762 |
| STATE_COUNT | ~11,705 |
| grammar.js lines | ~3,400 |
| Tests | 1,404 |
| Production success | 100% (0 errors) |
| Named keywords | 82 |
| Query files | 5 (highlights, locals, tags, indents, folds) |
Documentation Resources
Available via MCP:
- business-central — AL Language syntax, objects, properties
- tree-sitter — Grammar development guide, API reference
Project docs:
docs/v2-blog-post-notes.md— V2 rewrite narrative and datadocs/superpowers/specs/— Design specs for major changes
Philosophy: No Known Limitations
Never give up on a failing pattern:
- Don't disable tests or mark issues as "known limitations"
- Research how other parsers handle similar constructs in
other-languages/ - Use
error-researchagent for systematic failure analysis - Every "impossible" pattern has been solved somewhere — find it and adapt it