Contributing to Tree-Sitter Language Pack
May 2, 2026 · View on GitHub
Thank you for your interest in contributing to tree-sitter-language-pack! This guide will help you get started with development.
Table of Contents
- Development Setup
- Development Workflow
- Parser Caching
- Adding Languages
- E2E Tests
- Exploring Tasks
- Code Quality
- Submitting Changes
Development Setup
Task Installation
This project uses Task for task automation and orchestration. Task is a task runner that simplifies development workflows across multiple languages and platforms.
Install Task
Choose the installation method for your platform:
macOS (Homebrew):
brew install go-task
Linux:
# Using the installer script
sh -c "$(curl --location https://taskfile.dev/install.sh)" -- -d -b ~/.local/bin
# Or via package managers:
apt install go-task # Debian/Ubuntu
pacman -S go-task # Arch
Windows:
# Using Scoop
scoop install task
# Or using Chocolatey
choco install go-task
For complete installation instructions, visit the official Task documentation.
Quick Start
After installing Task, set up your development environment:
# One-time setup - installs all dependencies
task setup
# Clone grammar sources
task clone
# Build in dev mode (a few languages, fast iteration)
task build:dev
The setup command will install Rust, Python, Node.js, Go, Java, and Elixir tooling as needed.
Development Workflow
Common Commands
# Build all crates (all languages, dynamic mode)
task build
# Build in dev mode (few languages, fast iteration)
task build:dev
# Build in release mode (optimized)
task build:release
# Run all tests
task test
# Run all checks (lint + test)
task check
# Format all code
task format
# Run all linters via prek
task lint
# Generate READMEs from templates
task generate-readme
# Update all dependencies
task update
# Clean all build artifacts
task clean
Language-Specific Tasks
Each language binding has its own namespace:
Rust:
task rust:build
task rust:test
task rust:format
task rust:lint
Python:
task python:install
task python:test
task python:format
task python:lint
Node.js:
task node:build # Build NAPI-RS native module (release)
task node:build:dev # Build in debug mode
task node:test
Go:
task go:build # Build Go bindings (requires FFI)
task go:build:ffi # Build FFI static library for Go
task go:test
task go:format
task go:lint
Java:
task java:build:ffi # Build FFI shared library for Java
task java:test
Elixir:
task elixir:build # Compile (includes Rustler NIF)
task elixir:test
task elixir:deps
Ruby:
task ruby:build # Build Ruby native extension
task ruby:test # Run Ruby tests
task ruby:format # Format Ruby code
task ruby:lint # Lint Ruby code
WebAssembly:
task wasm:build # Build WASM package (web target)
task wasm:build:bundler # Build WASM package (bundler target)
task wasm:build:node # Build WASM package (Node.js target)
task wasm:test # Run WASM tests
C:
task c:build:ffi # Build FFI library for C tests
task c:e2e:build # Build C E2E tests
task c:e2e:test # Run C E2E tests
Parser Caching
Cloning 305 tree-sitter grammar repositories is slow. The build system includes a multi-layer caching strategy to avoid redundant work.
How It Works
-
Cache manifest (
parsers/.cache_manifest.json): Tracks a SHA256-based key for each language derived from its full configuration (repo URL, revision, branch, directory, generate flag, ABI version). On subsequent runs, only languages whose configuration has changed — or whose parser files are missing from disk — are re-cloned. -
CI cache (
actions/cache@v4): All CI workflows cache theparsers/directory keyed onsources/language_definitions.json. When definitions haven't changed between runs, the clone step completes instantly. -
Stale entry cleanup: If a language is removed from
language_definitions.json, the next run deletes its parser directory and manifest entry automatically.
Environment Variables
| Variable | Default | Description |
|---|---|---|
TSLP_CACHE_DIR | <project_root>/parsers | Override the directory where compiled parser sources are stored |
TSLP_VENDOR_DIR | <project_root>/vendor | Override the directory where grammar repos are cloned |
TSLP_NO_CACHE | (unset) | Set to 1, true, or yes to force a full re-clone, ignoring the cache manifest |
Common Scenarios
# Normal clone (uses cache, only re-clones changed languages)
task clone
# Force full re-clone (deletes parsers/ and vendor/, re-clones everything)
TSLP_NO_CACHE=1 task clone
# Use a custom cache directory (useful for shared CI caches)
TSLP_CACHE_DIR=/tmp/tslp-parsers task clone
Adding Languages
License Requirements
All included grammars must be released under a permissive open-source license such as MIT, Apache-2.0, BSD-2-Clause, BSD-3-Clause, ISC, or Unlicense. We do not accept grammars licensed under GPL, AGPL, LGPL, MPL, or any other copyleft license. This policy ensures that tree-sitter-language-pack can be freely used in any project — commercial or otherwise — without imposing license obligations on downstream users.
Before proposing a new grammar, verify its license by checking the LICENSE file in the grammar repository.
Steps
-
Add a language definition to
sources/language_definitions.json:{ "language_name": { "repo": "https://github.com/tree-sitter/tree-sitter-language", "rev": "commit-hash", "branch": "main", "directory": "path/to/src", "generate": false } }Fields:
repo(required): Repository URLrev(required): Specific commit hash for reproducible buildsbranch(optional): Branch name if not "main"directory(optional): Path to src folder if not in rootgenerate(optional): Run tree-sitter generate command
-
Add a Cargo feature for the language in
crates/ts-pack-core/Cargo.toml -
Clone and build
task clone task build:dev -
Regenerate E2E smoke fixtures and test
task e2e:generate:smoke-fixtures task e2e:generate:all task test
E2E Tests
E2E tests are generated from JSON fixtures in tools/e2e-generator/fixtures/ and produce runnable test suites for each language binding.
# Generate E2E tests for all languages
task e2e:generate:all
# Generate for a specific language
task e2e:generate:rust
task e2e:generate:python
task e2e:generate:go
task e2e:generate:java
task e2e:generate:elixir
task e2e:generate:ruby
task e2e:generate:c
# Run Rust E2E tests
task e2e:test:rust
# Auto-generate smoke fixtures from language_definitions.json
task e2e:generate:smoke-fixtures
Generated test files in e2e/ should not be edited directly — modify fixtures or the generator source instead.
Exploring Tasks
# Show all available tasks
task --list
# Show all tasks including internal ones
task --list-all
Code Quality
Pre-commit Hooks
The project uses prek for pre-commit hooks:
# Install hooks
prek install
prek install --hook-type commit-msg
# Run all hooks manually
prek run --all-files
Commit Messages
We use conventional commits:
feat: add support for tree-sitter-languagefix: correct parser initialization for languagedocs: update installation instructionschore: update dependenciestest: add tests for new language
Submitting Changes
-
Create a feature branch
git checkout -b feat/add-language-support -
Make your changes and run checks locally:
task check -
Commit and push
git commit -m "feat: add support for new language" git push origin feat/add-language-support -
Create a Pull Request — link any related issues and ensure CI passes.
Maintenance Tasks
Updating Language Versions
# Update all languages to latest revisions
uv run --no-sync scripts/pin_vendors.py
# Update only missing revisions
uv run --no-sync scripts/pin_vendors.py --only-missing
# Update specific languages
uv run --no-sync scripts/pin_vendors.py --languages=python,rust,go
Version Synchronization
Version is managed in Cargo.toml workspace and synced across all manifests:
task version:sync
Questions?
- Check existing issues
- Join our Discord community
Thank you for contributing to tree-sitter-language-pack!