README.md

May 9, 2026

Tiny Lua Compiler (TLC)

An educational Lua 5.1 compiler, bytecode emitter, and VM in one Lua file

Inspired by Jamie Kyle's The Super Tiny Compiler

License: MIT Lua

Tiny Lua Compiler (TLC) is a complete Lua 5.1 compiler written in pure Lua. It tokenizes source code, builds an AST (Abstract Syntax Tree), lowers it into Lua 5.1 function prototypes, emits real Lua 5.1 bytecode, and can execute those prototypes in its own register-based VM. The whole core lives in tlc.lua.

Most compiler learning material falls into one of two buckets. On one side are toy compilers that are easy to finish but skip the parts that make real languages interesting. On the other are production compilers that are real, but so large that the main ideas get buried under architecture and history. TLC is meant to sit in the middle. It is small enough that you can read it in a weekend, but real enough to deal with lexical scoping, closures, upvalues, varargs, multiple returns, method calls, loops, tail calls, bytecode encoding, and execution.

It is not a production compiler, and it is not trying to replace the standard Lua implementation. It is an educational compiler that tries to stay honest: small enough to understand, complete enough to be worth studying.

It can compile itself

TLC can compile its own source code and run the result inside its own VM:

local tlc  = require("tlc")
local tlc2 = tlc.run(io.open("tlc.lua"):read("*a"))
tlc2.run("print('Hello from a compiler running inside itself')")

That means a compiler written in Lua is compiling a compiler written in Lua, and then the compiled compiler is running new Lua code, all without leaving the host process.

Try it

git clone https://github.com/bytexenon/Tiny-Lua-Compiler.git
cd Tiny-Lua-Compiler

# Run the code inside TLC's own VM.
lua5.1 -e "require('tlc').run(\"print('Hello from TLC!')\")"

# Compile to a binary .luac chunk and run it with the standard Lua VM.
lua5.1 -e "io.open('out.luac','wb'):write(require('tlc').compile('print(42)'))"
lua5.1 out.luac

# Run the test suite.
lua5.1 tests/test.lua

You can also use TLC as a library, at whatever level of detail you need:

local tlc = require("tlc")

-- One-liner: compile and run.
tlc.run("print('Hello from TLC!')")

-- Compile to a binary .luac chunk that the standard Lua VM can load.
local bytecode = tlc.compile("return 21 * 2")
-- io.open("out.luac", "wb"):write(bytecode) -- Save to disk if you want.

-- Walk the pipeline stage by stage.
local tokens = tlc.tokenize("local x = 1 + 2; return x")
local ast    = tlc.parseTokens(tokens)
local proto  = tlc.generate(ast)
local value  = tlc.execute(proto)

print(value) -- 3

Why this file is worth reading

The code runs in a straight line. Utilities first, then the tokenizer, the parser, the code generator, the bytecode emitter, the VM, and the public API - in that order, nothing out of place. You can trace a single source program through every stage without losing the thread.

The implementation also keeps the details that toy compilers skip. Character classification uses precomputed lookup tables. Operator matching uses a trie for longest-prefix matching - no hand-rolled lookahead. Expressions go through precedence climbing rather than a grammar rule per level. Concatenation chains are flattened into a single CONCAT. Floating-point numbers are packed to IEEE 754 by hand, without string.pack. Upvalue capture and OP_CLOSE are handled explicitly.
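
To make the operator trie concrete, here is a minimal standalone sketch of longest-prefix matching. It is not copied from tlc.lua: the operator list and the names build_trie and match_operator are assumptions made for illustration. The point it shows is that ".", ".." and "..." can share one path through the trie, so the scanner simply walks as far as the input allows and remembers the last complete operator it passed.

```lua
-- Illustrative operator set (assumed; not copied from tlc.lua).
local OPERATORS = {
  "+", "-", "*", "/", "%", "^", "#", "=", "==", "~=",
  "<", "<=", ">", ">=", ".", "..", "...",
}

-- Build a trie: each node maps a single character to a child node.
-- A node that ends a complete operator stores it in the `op` field
-- (a two-character key, so it cannot collide with a character key).
local function build_trie(operators)
  local root = {}
  for _, op in ipairs(operators) do
    local node = root
    for i = 1, #op do
      local ch = op:sub(i, i)
      node[ch] = node[ch] or {}
      node = node[ch]
    end
    node.op = op
  end
  return root
end

-- Walk the trie from `pos` and return the longest operator that matches,
-- or nil if no operator starts at that position.
local function match_operator(trie, source, pos)
  local node, longest = trie, nil
  local i = pos
  while true do
    node = node[source:sub(i, i)]
    if not node then break end
    if node.op then longest = node.op end
    i = i + 1
  end
  return longest
end

local trie = build_trie(OPERATORS)
print(match_operator(trie, "a..b", 2))   --> ..
print(match_operator(trie, "x <= y", 3)) --> <=
print(match_operator(trie, "t.x", 2))    --> .
```

Because "~" on its own is not in the list, match_operator returns nil for it even though the trie contains a "~" node on the way to "~=".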

These are not polish. They are where real compiler behavior starts to show up. Skip them and you learn the shape of compilation. Keep them and you learn how it actually works.
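
Precedence climbing, one of the details listed above, fits in a few dozen lines. The sketch below is a standalone toy, not TLC's parser: it works on a pre-split token list, handles only numbers, parentheses, and five binary operators, and the names BINARY, parse, and show are hypothetical. The left/right priority pairs are modelled on the table in lparser.c from Lua 5.1; whether tlc.lua uses the same numbers is an assumption.

```lua
-- Left/right binding priorities. A right priority lower than the left
-- one makes the operator right-associative, as with "^".
local BINARY = {
  ["+"] = { left = 6,  right = 6 },
  ["-"] = { left = 6,  right = 6 },
  ["*"] = { left = 7,  right = 7 },
  ["/"] = { left = 7,  right = 7 },
  ["^"] = { left = 10, right = 9 },
}

-- Parse a flat token list such as { 1, "+", 2, "*", 3 } into a tree.
local function parse(tokens)
  local pos = 1
  local parse_expr

  local function parse_primary()
    local tok = tokens[pos]
    pos = pos + 1
    if tok == "(" then
      local inner = parse_expr(0)
      assert(tokens[pos] == ")", "expected ')'")
      pos = pos + 1
      return inner
    end
    assert(type(tok) == "number", "expected a number")
    return tok
  end

  -- One loop handles every precedence level: keep consuming operators
  -- that bind tighter than `min_priority`.
  function parse_expr(min_priority)
    local left = parse_primary()
    while true do
      local op = tokens[pos]
      local prio = BINARY[op]
      if not prio or prio.left <= min_priority then break end
      pos = pos + 1
      local right = parse_expr(prio.right)
      left = { op = op, left, right }
    end
    return left
  end

  return parse_expr(0)
end

-- Render the tree with explicit parentheses so grouping is visible.
local function show(node)
  if type(node) ~= "table" then return tostring(node) end
  return "(" .. show(node[1]) .. " " .. node.op .. " " .. show(node[2]) .. ")"
end

print(show(parse({ 1, "+", 2, "*", 3 })))  --> (1 + (2 * 3))
print(show(parse({ 2, "^", 3, "^", 2 })))  --> (2 ^ (3 ^ 2))
print(show(parse({ 8, "-", 3, "-", 2 })))  --> ((8 - 3) - 2)
```

Adding an operator means adding one table entry rather than one more grammar function.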

What TLC covers, and what it does not

TLC covers a large enough slice of Lua 5.1 to feel real:

  • Lexical scoping, closures, upvalue capture and closing
  • Numeric and generic for, while, repeat, do, break, return
  • if / elseif / else
  • Method calls (: syntax), table constructors
  • Multiple returns, varargs (...), tail call optimization
  • Long strings, string escapes, hex numbers, scientific notation
  • Full Lua 5.1 bytecode emission - output loads in the standard VM
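
Emitting bytecode that the standard VM accepts means writing number constants as raw doubles, and Lua 5.1 has no string.pack to do it. The sketch below shows one way to pack a number into 8 little-endian IEEE 754 bytes using only arithmetic. It is an illustration, not the routine from tlc.lua: pack_double and to_hex are hypothetical names, and the little-endian, 8-byte layout assumes a typical Lua 5.1 build on a little-endian machine.

```lua
-- Pack a Lua number into 8 bytes: 1 sign bit, 11 exponent bits,
-- 52 mantissa bits, least significant byte first.
local function pack_double(value)
  local sign = 0
  if value < 0 or (value == 0 and 1 / value < 0) then  -- also catches -0
    sign = 1
    value = -value
  end

  local exponent, mantissa
  if value ~= value then               -- NaN: all-ones exponent, nonzero mantissa
    exponent, mantissa = 2047, 2 ^ 51
  elseif value == math.huge then       -- infinity: all-ones exponent, zero mantissa
    exponent, mantissa = 2047, 0
  elseif value == 0 then
    exponent, mantissa = 0, 0
  else
    -- Normalize to 1 <= value < 2; scaling by 2 is exact in binary.
    local e = 0
    while value >= 2 do value = value / 2; e = e + 1 end
    while value < 1 do value = value * 2; e = e - 1 end
    if e < -1022 then                  -- subnormal: no implicit leading 1
      exponent = 0
      mantissa = value * 2 ^ (e + 1074)
    else
      exponent = e + 1023              -- apply the exponent bias
      mantissa = (value - 1) * 2 ^ 52  -- drop the implicit leading 1
    end
  end

  local bytes = {}
  for i = 1, 6 do                      -- low 48 bits of the mantissa
    local low = mantissa % 256
    bytes[i] = string.char(low)
    mantissa = (mantissa - low) / 256
  end
  -- Byte 7: low 4 exponent bits, then the top 4 mantissa bits.
  bytes[7] = string.char((exponent % 16) * 16 + mantissa)
  -- Byte 8: sign bit, then the high 7 exponent bits.
  bytes[8] = string.char(sign * 128 + math.floor(exponent / 16))
  return table.concat(bytes)
end

-- Helper for inspecting the result.
local function to_hex(s)
  return (s:gsub(".", function(c) return string.format("%02X", c:byte()) end))
end

print(to_hex(pack_double(1)))   --> 000000000000F03F
print(to_hex(pack_double(42)))  --> 0000000000004540
print(to_hex(pack_double(-2)))  --> 00000000000000C0
```

On Lua 5.3 or later, comparing the result against string.pack("<d", x) is a quick way to check a routine like this.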

What it deliberately leaves out is just as important. There is no constant folding, and there is no debug information. Debug information here means the table that maps each instruction to a source line; without it, error messages show no line numbers, but the bytecode is otherwise correct.

The biggest omission is metamethod dispatch. In standard Lua, evaluating a + b when a is a table makes the VM look for an __add metamethod. TLC's VM skips this entirely - operators work only on native types. That removes a real feature, but it keeps the VM from becoming an object system.
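
For readers who have not used metamethods, this is the behavior in question, shown in standard Lua. The Vec table and the add_vec helper are hypothetical names for this illustration. The last part shows an explicit function call, which sidesteps operator dispatch and so should not depend on __add being honored.

```lua
-- A tiny "vector" type whose + operator is defined through __add.
local Vec = {}
Vec.__add = function(a, b)
  return setmetatable({ x = a.x + b.x }, Vec)
end

local a = setmetatable({ x = 1 }, Vec)
local b = setmetatable({ x = 2 }, Vec)

-- Standard Lua sees that `a` is a table, finds Vec.__add, and calls it.
-- A VM without metamethod dispatch has nothing to fall back on here.
local sum = a + b
print(sum.x)  --> 3

-- The same operation written as a plain function call involves no
-- operator dispatch at all.
local function add_vec(u, v)
  return { x = u.x + v.x }
end
print(add_vec(a, b).x)  --> 3
```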

The tradeoff is deliberate. TLC is trying to be a real compiler you can actually finish reading.

Correctness

The test suite compiles each case with both TLC and standard Lua, then compares the results side by side. No mock expectations - if TLC produces different output, the test fails.
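
The idea of comparing against a reference implementation can be sketched in a few lines. This is not tests/test.lua: the names outcome, differential, and reference_run are hypothetical, and the candidate is passed in as a function so the sketch stays self-contained. In real use the candidate would presumably wrap tlc.run.

```lua
local load_chunk = loadstring or load  -- loadstring on 5.1, load on 5.2+

-- Reference implementation: compile and run with the host Lua.
local function reference_run(source)
  local fn = assert(load_chunk(source))
  return fn()
end

-- Turn a pcall result into a comparable string. Error messages are
-- reduced to "error" because their wording may differ between VMs.
local function capture(ok, ...)
  if not ok then return "error" end
  local parts = { "ok" }
  for i = 1, select("#", ...) do
    local v = select(i, ...)
    parts[#parts + 1] = type(v) .. ":" .. tostring(v)
  end
  return table.concat(parts, " | ")
end

local function outcome(run, source)
  return capture(pcall(run, source))
end

-- Run every case through both implementations and collect mismatches.
local function differential(cases, reference, candidate)
  local failures = {}
  for _, source in ipairs(cases) do
    local expected = outcome(reference, source)
    local actual = outcome(candidate, source)
    if expected ~= actual then
      failures[#failures + 1] =
        { source = source, expected = expected, actual = actual }
    end
  end
  return failures
end

-- Cases return only primitive values so tostring() is stable.
local cases = {
  "return 1 + 2 * 3",
  "return 2 ^ 3 ^ 2",
  "local function f() return 1, 2, 3 end return (f())",
  "return ('a'):rep(3) .. 'b'",
  "error('boom')",
}

-- A deliberately wrong candidate, to show that mismatches are caught.
local function broken_run(source) return 0 end

print(#differential(cases, reference_run, reference_run))  --> 0
print(#differential(cases, reference_run, broken_run))     --> 5
```

There are no hand-written expected values anywhere in the case list, which is the property the paragraph above describes.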

This catches the mistakes educational compilers usually get away with: wrong operator precedence, broken closure semantics, multi-return adjustment errors, loop control flow bugs, and incorrect literal parsing, among others.

API

local tlc = require("tlc")

tlc.run(code, env?, ...?)
tlc.compile(code)
tlc.compileToProto(code)
tlc.parse(code)
tlc.tokenize(code)

tlc.parseTokens(tokens)
tlc.generate(ast)
tlc.emit(proto)
tlc.execute(proto, env?, ...?)

docs/api.md documents the public API and docs/ast.md documents the AST shape.

Where to start

Read this file for the big picture, then read tlc.lua from top to bottom. After that, docs/api.md and docs/ast.md fill in the reference material, and tests/test.lua shows the behavioral surface area.

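TLC itself runs on Lua 5.1 through 5.5. The bytecode it generates always targets Lua 5.1, so a chunk written by tlc.compile is meant to be loaded by a standard Lua 5.1 VM, whichever Lua version hosted the compiler.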
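
Standard Lua binary chunks are version-specific, and a Lua VM rejects a chunk whose header names a different version. A small check like the one below can tell whether the host VM is likely to accept a chunk before trying to load it. This is a sketch: chunk_version and host_can_load are hypothetical helpers, not part of TLC's API, and the demo uses string.dump as a stand-in for a compiled chunk. It relies on the header layout of stock Lua, where byte 5 holds the version as major * 16 + minor.

```lua
-- Return the Lua version recorded in a binary chunk header, e.g. "5.1",
-- or nil if the string does not start with the "\27Lua" signature.
local function chunk_version(chunk)
  if type(chunk) ~= "string" or chunk:sub(1, 4) ~= "\27Lua" then
    return nil
  end
  local byte = chunk:byte(5)
  if not byte then return nil end
  return string.format("%d.%d", math.floor(byte / 16), byte % 16)
end

-- True when the chunk's header version matches the running interpreter.
local function host_can_load(chunk)
  local version = chunk_version(chunk)
  return version ~= nil and ("Lua " .. version) == _VERSION
end

-- Stand-in for a compiled chunk: dump a host function.
local sample = string.dump(function() return 42 end)
print(chunk_version(sample), host_can_load(sample))

-- A Lua 5.1 header starts with "\27Lua" followed by byte 0x51 ("Q").
print(chunk_version("\27LuaQ"))  --> 5.1
```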

Contributions are welcome; see CONTRIBUTING.md. If you report a bug, please include the input code, expected behavior, actual behavior, and Lua version.

See also

  • The Super Tiny Compiler - the original inspiration; a compiler written in JavaScript in ~200 lines
  • FiOne - a Lua-in-Lua VM, more complete than TLC's but less focused on readability
  • Lua 5.1 source - the reference implementation; llex.c, lparser.c, and lvm.c are the most relevant files

License

MIT. See LICENSE.