SQLRite developer guide
May 11, 2026 · View on GitHub
A small, hand-written guide to the SQLRite codebase — how it's structured, how the pieces fit together, why the design choices were made, and how to work on the project.
Start here
- Getting started — install toolchain, build, run the REPL, your first
CREATE TABLE - Using SQLRite — REPL flow, meta-commands, history, launch modes
- Supported SQL — canonical reference for every statement, operator, and edge case the engine executes today (plus what's not supported yet)
- Desktop app — downloads, unsigned-installer bypass steps, and the Tauri architecture
- Smoke test — step-by-step walkthrough to sanity-check REPL + desktop app after any non-trivial change
- Architecture — high-level layer diagram and module map
- Design decisions — the "why" behind the major choices
- Roadmap — what's done, what's next, and the long-term arc
Using SQLRite as a library
- Embedding — the public
Connection/Statement/RowsAPI (Phase 5a) and where the non-Rust SDKs plug in (Phase 5b – 5g) - Concurrent writes — MVCC +
BEGIN CONCURRENT— Phase 11 canonical reference: SQL surface, embedding API, SDK error mapping, REPL meta-commands, durability story, limitations. Design rationale lives in the historical plan-doc. examples/— runnable Rust quickstart (cargo run --example quickstart) + concurrent-writers retry-loop demo (cargo run --example concurrent_writers); language-specific subdirectories fill in as each 5x sub-phase lands
Phase 7 — AI-era extensions
- Ask — natural-language → SQL — the canonical reference for the
ask()feature across every product surface (REPL, desktop, Rust library, Python / Node / Go / WASM SDKs, MCP server); env vars, defaults, prompt caching, security - Ask backend proxy templates — copy-paste backend examples for the WASM SDK's split design: Cloudflare Workers, Vercel Edge, Deno Deploy, Firebase Functions, AWS Lambda, Express, pure Node
- MCP server (
sqlrite-mcp) — Phase 7h + 8e: SQLRite as a Model Context Protocol stdio server. Wiring into Claude Code / Cursor /mcp-inspector; the eight tools (list_tables,describe_table,query,execute,schema_dump,vector_search,bm25_search,ask); read-only mode; the JSON-RPC wire format
Phase 8 — Full-text search + hybrid retrieval
- FTS — full-text search + hybrid retrieval — the canonical reference for
CREATE INDEX … USING fts, thefts_match/bm25_scorescalar functions, thetry_fts_probeoptimizer hook, hybrid retrieval via raw arithmetic withvec_distance_cosine, persistence + the on-demand v4 → v5 file-format bump, and thebm25_searchMCP tool
Benchmarks
- Benchmarks — the canonical reference for the SQLRite-vs-SQLite-and-friends bench suite: how to run it (
make bench/make bench-duckdb), what the twelve workloads measure, headline numbers, the engineering debts the suite surfaced (SQLR-18 / 19 / 20 / 21), reproducing a run, and thecompare.pydiff tool
Internals
These documents go into the implementation of each subsystem.
- File format — the
.sqlriteon-disk layout, byte by byte - Pager — page cache, diffing commits, WAL-backed commits, and the checkpointer
- Storage model —
Table,Column,Row,Index, how rows are reassembled - SQL engine — parser → executor pipeline, expression evaluator, NULL handling
- Desktop app — the Tauri 2.0 + Svelte shell under
desktop/
Project state
As of May 2026, SQLRite has:
- A working SQL engine (in-memory + on-disk with a real B-Tree per table + secondary indexes, Phases 0 – 3 complete)
- WAL-backed persistence with crash-safe checkpointing, shared/exclusive lock modes, and real
BEGIN/COMMIT/ROLLBACKtransactions (Phase 4 complete) - A stable public Rust embedding API plus C FFI shim and SDKs for Python, Node.js, Go, and WASM (Phase 5 complete except the optional 5f crate-polish task)
- A Tauri 2.0 + Svelte desktop app (Phase 2.5 complete)
- AI-era extensions across the product surface (Phase 7 complete): VECTOR columns + HNSW indexes (7a-7d), JSON columns (7e), the
ask()natural-language → SQL family across the REPL / desktop / Rust / Python / Node / Go / WASM (7g.1-7g.7), and thesqlrite-mcpModel Context Protocol server (7h + 7g.8) - Full-text search + hybrid retrieval (Phase 8 complete): FTS5-style inverted index with BM25 ranking +
fts_match/bm25_scorescalar functions +try_fts_probeoptimizer hook + on-disk persistence with on-demand v4 → v5 file-format bump (8a-8c), a worked hybrid-retrieval example combining BM25 with vector cosine via raw arithmetic (8d), and abm25_searchMCP tool symmetric withvector_search(8e). Seedocs/fts.md. - SQL surface + DX follow-ups (Phase 9 complete, v0.2.0 → v0.9.1): DDL completeness —
DEFAULT,DROP TABLE/DROP INDEX,ALTER TABLE(9a); free-list + manualVACUUM(9b) + auto-VACUUM (9c);IS NULL/IS NOT NULL(9d);GROUP BY+ aggregates +DISTINCT+LIKE+IN(9e); four flavors ofJOIN— INNER, LEFT, RIGHT, FULL OUTER (9f); prepared statements +?parameter binding with a per-connection LRU plan cache (9g); HNSW probe widened to cosine + dot viaWITH (metric = …)(9h);PRAGMAdispatcher with theauto_vacuumknob (9i) - Benchmarks against SQLite + DuckDB (Phase 10 complete, SQLR-4 / SQLR-16): twelve-workload bench harness with a pluggable
Drivertrait, criterion-driven, pinned-host runs published. Seedocs/benchmarks.md. - Phase 11 (concurrent writes via MVCC +
BEGIN CONCURRENT, SQLR-22) is shipped end-to-end —ConnectionisSend + Sync;Connection::connect()mints sibling handles.sqlrite::mvccexposesMvccClock,ActiveTxRegistry,MvStore,ConcurrentTx, and theMvccCommitBatch/MvccLogRecordWAL codec. WAL header v1 → v2 persisted the clock high-water mark; v2 → v3 added typed MVCC log-record frames.PRAGMA journal_mode = mvcc;opts a database into MVCC.BEGIN CONCURRENTwrites commit-validate againstMvStore, abort withSQLRiteError::Busy, and append a typed MVCC log-record frame to the WAL — covered by the same fsync as the legacy page commit. Reopen replays those frames intoMvStoreand seedsMvccClockpast the highest committedcommit_ts. Reads viaStatement::querysee the BEGIN-time snapshot. Per-commit GC +vacuum_mvcc()bound version-chain growth. C FFI / Python / Node / Go all propagateBusy/BusySnapshotas typed retryable errors and mint sibling handles that share backing state — Go's process-level path registry (Phase 11.11c) handles cross-*sql.DBsharing too. ThesqlriteREPL ships.spawn/.use/.connsfor interactive demos; the SQLR-16 benchmark suite addsW13(concurrent writers, mostly disjoint rows) as the Phase-11 differentiator workload. The only remaining items are deferred-by-design or foundation work: indexes under MVCC (11.10) and the checkpoint-drain follow-up (parked half of 11.9). User-facing reference:docs/concurrent-writes.md; runnable example atexamples/rust/concurrent_writers.rs. Original design proposal:docs/concurrent-writes-plan.md. - A fully-automated release pipeline that ships every product to its registry on every release with one human action — Rust engine +
sqlrite-ask+sqlrite-mcpto crates.io, Python wheels to PyPI (sqlrite), Node.js + WASM to npm (@joaoh82/sqlrite+@joaoh82/sqlrite-wasm), Go module viasdk/go/v*git tag, plus C FFI tarballs, MCP binary tarballs, and unsigned desktop installers as GitHub Release assets (Phase 6 complete)
See the Roadmap for the full phase plan.
Release engineering
- Release plan — Phase 6 design doc: lockstep versioning, PR-based release flow, OIDC trusted publishing, the version-bump surface
- Release secrets runbook — one-time web-UI setup for crates.io, PyPI, npm, GitHub
releaseenvironment, andmainbranch protection scripts/— runnable tooling used by release workflows + reproducible locally (start withscripts/bump-version.sh)
Future work
- Phase 7 plan — AI-era extensions (vector column type + HNSW, JSON, NL→SQL
ask()API across REPL/library/SDKs/desktop/MCP, MCP server). Implementation complete except 7f, which deferred to Phase 8. - Phase 8 plan — Full-text search (FTS5-style BM25) + hybrid retrieval. The deferred 7f scope. All six sub-phases (8a–8f) shipped. Canonical reference:
docs/fts.md. - Benchmarks plan — design rationale + the resolved Q1–Q8 decisions for the bench suite. Historical reference; the user-facing canonical doc is
docs/benchmarks.md(above). All six sub-phases (9.1–9.6) shipped.
Conventions
- Code lives under
src/; docs live here underdocs/. - Commit messages carry a
[path]prefix describing the areas touched (seegit log). - Every non-trivial change lands with tests. The suite is run on every commit via
cargo test. - The engine is both a library (
src/lib.rs— thesqlritecrate) and a binary (src/main.rs— the REPL). External consumers should import it as a library.
If you're reading the code and a piece feels surprising, check this guide first — most non-obvious decisions are documented under Design decisions.