tree-sitter-typst
June 28, 2026
tree-sitter-typst is a correct Tree-sitter parser for Typst. It parses
Typst's markup, code, and math syntax, including embedded code, content blocks,
equations, imports, set/show rules, closures, arrays, dictionaries, lists,
sections, labels, raw blocks, and math attachments.
Compared with earlier Typst grammars such as
uben0/tree-sitter-typst, this
project aims to be a more complete parser: it targets real Typst documents,
supports editor queries, injections, and incremental parsing, and is validated
against real-world fixtures.
The grammar is intentionally written as one self-contained grammar.js plus an
external scanner for the lexical decisions that Tree-sitter cannot express
well in pure LR grammar rules.
Generate And Test
Use Tree-sitter CLI 0.26.9 or newer:
npm install
npm run generate
npm test
npm run check
Useful focused checks:
npm run test:corpus
npm run test:queries
npm run test:incremental
npm run test:real-world
src/parser.c, src/grammar.json, and src/node-types.json are generated.
Do not hand-edit them.
The committed src/unicode_tables.h is generated scanner support data. Regenerate
it only when updating Unicode data:
npm run generate:unicode
The Unicode generator requires Python's regex module plus Unicode XID/number
data and UTR #25 MathClass data files. It searches vendor/unicode,
src/vendor/unicode, third_party/unicode, unicode, and data; set
TREE_SITTER_TYPST_UNICODE_DIR or TREE_SITTER_TYPST_MATH_CLASS for other
locations.
Root Modes
The default parser is named typst. It starts in markup mode and still parses
all three Typst modes in one grammar: markup, embedded code, and math.
The same grammar can also generate direct code and direct math root parsers for editor integrations that need those modes as injection targets:
npm run generate:variants
This writes variant grammars under:
build/typst
build/typc
build/typm
The typc and typm builds are companion root modes, not replacements for the
default parser. Raw language injections tagged typc or typm can use them
when an editor registers those parser names.
Queries
Editor queries live under queries/typst/:
- highlights.scm: markup, code, math, calls, definitions, literals, operators
- injections.scm: raw-language injections
- locals.scm: definitions, parameters, imports, and references
- tags.scm: headings, labels, functions, variables, imports, and calls
- folds.scm: foldable blocks and sections
- indents.scm: Neovim indentation captures
- images.scm: Snacks.nvim image and Typst math captures
The main highlight query follows Neovim's current tree-sitter capture
conventions. The Helix integration keeps a separate highlight query adapted to
Helix theme scopes instead of copying Neovim captures verbatim. Emacs likewise
uses separate treesit font-lock rules under editors/emacs/.
npm run test:queries compiles every query and verifies that every declared
capture is exercised by the audit fixture.
Scanner Design
The external scanner owns boundary-sensitive tokens that must coordinate with the LR parser:
- raw delimiter width and raw language/content/close scanning
- nested comments
- heading, bullet, numbered, and term markers
- list continuation and serialized list-marker indentation
- code newlines, continuation lookahead, immediate calls, and field access
- numeric tokens and unit adjacency
- markup word gaps and automatic links
- math words, text, spacing, fractions, arguments, and delimiters
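Nested comments are a typical case from the list above: a regular token rule cannot count nesting depth, so the scanner has to. The following is a minimal sketch of a depth counter over a plain C string, not the project's scanner code. The function name `nested_comment_length` is hypothetical, and treating an unterminated comment as running to the end of input is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns the byte length of the block comment that
   starts at text[0], or 0 if text does not start with a comment opener.
   Assumption: an unterminated comment extends to the end of the input. */
static size_t nested_comment_length(const char *text) {
    assert(text != NULL);
    if (text[0] != '/' || text[1] != '*') return 0;

    size_t i = 2;        /* skip the opening delimiter */
    unsigned depth = 1;  /* one comment is open */
    while (text[i] != '\0' && depth > 0) {
        if (text[i] == '/' && text[i + 1] == '*') {
            depth++;     /* a nested comment opens */
            i += 2;
        } else if (text[i] == '*' && text[i + 1] == '/') {
            depth--;     /* the innermost comment closes */
            i += 2;
        } else {
            i++;
        }
    }
    return i;
}
```

Called on `/* a /* b */ c */ rest`, the sketch stops after the second closing delimiter, because the first one only closes the inner comment.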
Scanner state is intentionally small and serialized for incremental parsing.
Broad recovery states opt out through _error_sentinel so speculative recovery
does not mutate scanner state.
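As an illustration of the two ideas above, here is a minimal, self-contained sketch of a small serialized state and a sentinel check. It only mirrors the shape of Tree-sitter's external scanner hooks (serialize writes bytes and returns a count, deserialize restores from a possibly empty buffer). The struct fields, token names, and function names are hypothetical and are not taken from this project's scanner. The sentinel check relies on Tree-sitter marking every external token as valid during error recovery.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical state; the real scanner's fields may differ. */
typedef struct {
    uint8_t raw_delimiter_width; /* backticks that opened the current raw block */
    uint8_t comment_depth;       /* nesting depth of block comments */
    uint16_t list_indent;        /* indentation of the active list marker */
} StateSketch;

enum { STATE_SKETCH_BYTES = 4 };

/* Hypothetical external tokens; the sentinel is never used by a grammar rule. */
enum {
    SKETCH_RAW_CLOSE,
    SKETCH_BLOCK_COMMENT,
    SKETCH_ERROR_SENTINEL,
    SKETCH_TOKEN_COUNT
};

/* Writes the state into buffer byte by byte and returns the byte count. */
static unsigned state_sketch_serialize(const StateSketch *state, char *buffer) {
    assert(state != NULL && buffer != NULL);
    buffer[0] = (char)state->raw_delimiter_width;
    buffer[1] = (char)state->comment_depth;
    buffer[2] = (char)(state->list_indent & 0xFF);
    buffer[3] = (char)(state->list_indent >> 8);
    return STATE_SKETCH_BYTES;
}

/* Restores the state; a short or empty buffer resets it to zeros. */
static void state_sketch_deserialize(StateSketch *state, const char *buffer,
                                     unsigned length) {
    assert(state != NULL);
    memset(state, 0, sizeof *state);
    if (length < STATE_SKETCH_BYTES) return;
    state->raw_delimiter_width = (uint8_t)buffer[0];
    state->comment_depth = (uint8_t)buffer[1];
    state->list_indent =
        (uint16_t)((uint8_t)buffer[2] | ((uint8_t)buffer[3] << 8));
}

/* If the sentinel is valid, the parser is in broad recovery: decline
   without touching the state. Otherwise handle tokens as usual. */
static bool state_sketch_scan(StateSketch *state, const bool *valid_symbols) {
    if (valid_symbols[SKETCH_ERROR_SENTINEL]) return false;
    if (valid_symbols[SKETCH_BLOCK_COMMENT]) {
        state->comment_depth++;
        return true;
    }
    return false;
}
```

Keeping the state to a few fixed-width bytes makes the round trip cheap, which matters because Tree-sitter stores a serialized copy alongside tokens so it can resume from them during incremental parsing.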
Scanner probes follow this convention: helpers that advance before deciding
must either be returned immediately by scanner_scan or handle all
same-position fallbacks internally. This avoids failed lookahead probes blocking
ordinary whitespace, newline, or recovery tokens.
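The convention can be sketched with a tiny mock lexer. Tree-sitter's lexer can advance but offers no way to step back within a scan call, so a helper that has consumed input must settle on a result itself. Everything below is hypothetical and simplified: `ProbeLexer`, `probe_heading`, and `probe_scan` are illustrative names, and the heading rule (a run of `=` followed by a space) is a reduced stand-in for the real marker logic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mock of a forward-only lexer over a C string. */
typedef struct {
    const char *input;
    size_t pos;
} ProbeLexer;

typedef enum { PROBE_NONE, PROBE_HEADING_MARKER, PROBE_TEXT } ProbeToken;

static int probe_peek(const ProbeLexer *lx) {
    return (unsigned char)lx->input[lx->pos];
}

static void probe_advance(ProbeLexer *lx) {
    if (lx->input[lx->pos] != '\0') lx->pos++;
}

/* Advances before deciding, so it handles the same-position fallback
   itself: a '=' run not followed by a space becomes text when allowed. */
static ProbeToken probe_heading(ProbeLexer *lx, bool text_valid) {
    assert(lx != NULL);
    while (probe_peek(lx) == '=') probe_advance(lx);
    if (probe_peek(lx) == ' ') return PROBE_HEADING_MARKER;
    return text_valid ? PROBE_TEXT : PROBE_NONE;
}

static ProbeToken probe_scan(ProbeLexer *lx, bool heading_valid,
                             bool text_valid) {
    if (heading_valid && probe_peek(lx) == '=') {
        /* Returned immediately: no later branch sees the moved position. */
        return probe_heading(lx, text_valid);
    }
    /* Branches that have not advanced yet can still be tried in sequence. */
    if (text_valid && probe_peek(lx) != '\0' && probe_peek(lx) != '\n') {
        while (probe_peek(lx) != '\0' && probe_peek(lx) != '\n' &&
               probe_peek(lx) != ' ') {
            probe_advance(lx);
        }
        return PROBE_TEXT;
    }
    return PROBE_NONE;
}
```

The alternative, letting `probe_scan` fall through to other branches after a failed `probe_heading`, would test those branches against a position that has already moved past the `=` run.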
Focused scanner coverage lives in:
test/scanner/scanner_test.c
test/corpus/scanner_edges.txt
Corpus And Real-World Fixtures
The corpus covers small syntax contracts, scanner edge cases, regressions, and a large synthetic Typst document:
test/corpus/
test/incremental/
test/fixtures/synthetic/
test/fixtures/real_world/
The real-world validator reports expected parser errors separately from unexpected ones:
npm run test:real-world
Benchmarks
Criterion benchmarks cover full parsing, incremental edits, and query execution over synthetic and real-world fixtures:
cargo bench
The benchmark entry point is benches/bench_main.rs.
Nix
With Nix, enter the development shell or build the C grammar package:
nix develop
nix build
Legacy commands are also supported:
nix-shell
nix-build
Neovim
Neovim support is planned through
SeniorMars/typst.nvim, a
work-in-progress plugin for using this parser and its editor queries from
Neovim.
Until that plugin is ready, the maintained queries in queries/typst/ can be
used as the source for a manual Tree-sitter setup. Tree-sitter provides syntax
parsing; Tinymist provides language-server features.
Emacs
Use Emacs 29 or newer and register the grammar with built-in treesit:
(add-to-list
 'treesit-language-source-alist
 '(typst "https://github.com/SeniorMars/tree-sitter-typst" "main"))
Then run M-x treesit-install-language-grammar RET typst RET. Configure
typst-ts-mode according to that package's current documentation.
typst-ts-mode embeds font-lock queries for another Typst grammar, so using it
unchanged with this parser can fail with node errors such as (comment).
treesit-install-language-grammar installs only the parser library, not the
compatibility Elisp file. Make tree-sitter-typst-font-lock.el available on
load-path, either from a checkout of this repository or by copying that one
file into your Emacs configuration, then install the override after
typst-ts-mode loads:
mkdir -p ~/.emacs.d/lisp
curl -L \
  -o ~/.emacs.d/lisp/tree-sitter-typst-font-lock.el \
  https://raw.githubusercontent.com/SeniorMars/tree-sitter-typst/main/editors/emacs/tree-sitter-typst-font-lock.el

(add-to-list 'load-path (expand-file-name "lisp" user-emacs-directory))
(with-eval-after-load 'typst-ts-mode
  (require 'tree-sitter-typst-font-lock)
  (tree-sitter-typst-font-lock-apply-to-typst-ts-mode))
Helix
A ready-to-copy Helix integration lives under:
editors/helix/
It includes:
- languages.toml: parser registration, file types, auto-pairs, indentation, and Tinymist language-server configuration
- queries/: Helix query files for highlights, injections, indentation, folds, locals, tags, textobjects, and rainbow brackets
Helix highlights intentionally use Helix capture conventions and theme scopes,
while the main queries/typst/highlights.scm targets Neovim conventions.
Copy or merge the integration into your Helix config:
mkdir -p ~/.config/helix/runtime/queries/typst
cp editors/helix/languages.toml ~/.config/helix/languages.toml
cp editors/helix/queries/*.scm ~/.config/helix/runtime/queries/typst/
Then build the grammar:
hx --grammar fetch
hx --grammar build
Tree-sitter provides syntax parsing; Tinymist provides language-server features.