Example Reference
June 1, 2026
The reference documentation for the included example grammars and programs.
Preamble
This page lists all included example grammars and programs. The examples are not considered part of the public interface subject to semantic versioning.
Grammars
The example grammars reside in include/tao/pegtl/example/.
abnf_abnf.hpp
Grammar for ABNF rules according to Section 4 of RFC 5234 as updated by RFC 7405.
- Extended with PEG 'and' and 'not' predicates.
- Modified to not allow C++ keywords as rule names.
abnf_core.hpp
The ABNF core rules according to Appendix B.1 of RFC 5234.
escaped.hpp
Rules for escape sequences in C and JSON strings ready for the unescape actions.
fp.hpp
A grammar for the textual representation of floating point numbers, suitable for std::stod() (without locale support).
http.hpp
HTTP 1.1 grammar according to RFC 7230.
ipv4.hpp
A grammar for IPv4 addresses; only supports four dot-separated octets, not the traditional notations with fewer dots.
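To make the "four dot-separated octets" restriction concrete, here is a minimal sketch that uses only the standard library rather than the PEGTL grammar itself. The function name `is_dotted_quad` is hypothetical, and the treatment of leading zeros is an assumption of this sketch; the actual grammar may be stricter.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Returns true if `text` consists of exactly four dot-separated decimal
// octets in the range 0-255. Hypothetical helper, not part of the PEGTL.
// Assumption: leading zeros ("01") are tolerated as long as an octet has
// at most three digits.
bool is_dotted_quad(const std::string& text)
{
    int octets = 0;
    std::size_t pos = 0;
    while (true) {
        int value = 0;
        int digits = 0;
        while (pos < text.size() && text[pos] >= '0' && text[pos] <= '9') {
            value = value * 10 + (text[pos] - '0');
            ++digits;
            ++pos;
            if (digits > 3) {
                return false;  // too many digits for one octet
            }
        }
        if (digits == 0 || value > 255) {
            return false;  // empty octet or out of range
        }
        ++octets;
        if (pos == text.size()) {
            break;  // consumed the whole input
        }
        if (text[pos] != '.') {
            return false;  // unexpected character
        }
        ++pos;  // skip the dot and expect another octet
    }
    return octets == 4;
}
```

With this sketch, `is_dotted_quad("192.168.0.1")` is accepted while a shorter notation such as `"127.1"` is rejected.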
ipv6.hpp
A grammar for IPv6 addresses including IPv4-mapped IPv6 addresses.
iri.hpp
IRI grammar according to RFC 3987.
json.hpp
JSON grammar according to RFC 7159 (UTF-8 only).
json_pointer.hpp
JSON pointer grammar according to RFC 6901 (UTF-8 only).
lua53.hpp
Grammar for the Lua 5.3 scripting language that combines lexer and parser.
proto3.hpp
Grammar for Protocol Buffers (Proto3).
semver2.hpp
Grammar for SemVer Versions 2.0.0.
uri.hpp
URI grammar according to RFC 3986.
Programs
The example programs can be found in src/example/.
abnf2pegtl.cpp
Parses ABNF (RFC 5234)-style grammars with the ABNF grammar and converts them into C++ PEGTL rules. Uses the command line arguments as files to parse. Some extensions and restrictions compared to RFC 5234:
- As we are defining PEGs, the alternations are now ordered (sor<>).
- The and- and not-predicates from PEGs have been added as & and !, respectively.
- A single LF is also accepted as line ending.
- C++ identifiers are formed by replacing the dashes in rulenames with underscores.
- Reserved identifiers (keywords, ...) are rejected.
- Numerical values must fit into the corresponding C++ data type.
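The identifier rules in the list above can be sketched as follows. This is not the code used by abnf2pegtl.cpp; the function `to_cpp_identifier` and the deliberately short list of reserved words are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <string>

// Converts an ABNF rulename into a C++ identifier by replacing dashes with
// underscores. Returns false, leaving `out` untouched, when the result is a
// reserved identifier. Hypothetical helper for illustration only.
bool to_cpp_identifier(const std::string& rulename, std::string& out)
{
    // Tiny sample of reserved words; a real converter needs the full list.
    static const std::set<std::string> reserved = {
        "class", "int", "namespace", "return", "template"
    };
    std::string id = rulename;
    std::replace(id.begin(), id.end(), '-', '_');
    if (reserved.count(id) != 0) {
        return false;  // rejected: would collide with a C++ keyword
    }
    out = id;
    return true;
}
```

For example, the rulename `rule-name` becomes `rule_name`, while a rule named `class` is rejected.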
abnf_record.cpp
Shows how to create a linearized record of a parsing run with the ABNF grammar. Uses the command line arguments as files to parse.
$ build/bin/example/abnf_record src/example/abnf.abnf
tao::pegtl::abnf::rulename@4:1(93) 'rulelist'
tao::pegtl::digit@4:19(111) '1'
tao::pegtl::abnf::rulename@4:23(115) 'rule'
...
analyze.cpp
A small example that provokes the grammar analysis to find problems.
behaviour.cpp
Generates the tables for the rule comparisons on the rule and grammars page.
calculator.cpp
A calculator with all binary operators from the C language that shows
- how to use stack-based actions to perform a calculation on-the-fly during the parsing run, and
- how to build a grammar with a run-time data structure for arbitrary binary operators with arbitrary precedence and associativity.
In addition to the binary operators, round brackets can be used to change the evaluation order. The implementation uses long integers as data type for all calculations.
$ build/src/example/calculator "2 + 3 * -7" "(2 + 3) * 7"
-19
35
In this example the grammar takes second place to the infrastructure for the actions that actually evaluate the arithmetic expressions. The basic approach is "shift-reduce", which is very close to a stack machine, a model that is often well suited to PEGTL grammar actions: some actions merely push something onto a stack, while other actions apply a function to the objects on the stack, usually reducing its size.
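The shift-reduce idea can be illustrated without the PEGTL. The following sketch is not the code from calculator.cpp: it replaces the grammar with a hand-written scanner, supports only four operators, and all names (`op_info`, `operators`, `reduce`, `evaluate`) are hypothetical. It does keep the two ingredients described above, a run-time operator table with precedence and associativity and a stack that is reduced as operators are applied.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

long add_op(long a, long b) { return a + b; }
long sub_op(long a, long b) { return a - b; }
long mul_op(long a, long b) { return a * b; }
long div_op(long a, long b)
{
    if (b == 0) throw std::runtime_error("division by zero");
    return a / b;
}

// Assumption of this sketch: a higher precedence value binds tighter.
struct op_info {
    int precedence;
    bool left_assoc;
    long (*apply)(long, long);
};

// Run-time operator table; entries could be added or changed at run time.
const std::map<char, op_info>& operators()
{
    static const std::map<char, op_info> table = {
        {'+', {1, true, add_op}}, {'-', {1, true, sub_op}},
        {'*', {2, true, mul_op}}, {'/', {2, true, div_op}}
    };
    return table;
}

// Reduce: pop one operator and two operands, push the result.
void reduce(std::vector<long>& values, std::vector<char>& ops)
{
    const char op = ops.back();
    ops.pop_back();
    if (values.size() < 2) throw std::runtime_error("missing operand");
    const long rhs = values.back();
    values.pop_back();
    const long lhs = values.back();
    values.pop_back();
    values.push_back(operators().at(op).apply(lhs, rhs));
}

long evaluate(const std::string& text)
{
    std::vector<long> values;  // operand stack
    std::vector<char> ops;     // pending operators and open brackets
    bool expect_operand = true;
    std::size_t i = 0;
    while (i < text.size()) {
        const char c = text[i];
        if (std::isspace(static_cast<unsigned char>(c))) { ++i; continue; }
        if (expect_operand) {
            if (c == '(') { ops.push_back(c); ++i; continue; }
            // Shift: push an optionally signed integer onto the stack.
            std::size_t used = 0;
            values.push_back(std::stol(text.substr(i), &used));
            i += used;
            expect_operand = false;
            continue;
        }
        if (c == ')') {
            while (!ops.empty() && ops.back() != '(') reduce(values, ops);
            if (ops.empty()) throw std::runtime_error("unbalanced ')'");
            ops.pop_back();  // discard the matching '('
            ++i;
            continue;
        }
        const auto found = operators().find(c);
        if (found == operators().end()) throw std::runtime_error("unknown operator");
        const op_info& incoming = found->second;
        // Reduce pending operators that bind at least as tightly.
        while (!ops.empty() && ops.back() != '(') {
            const op_info& top = operators().at(ops.back());
            const bool reduce_first = top.precedence > incoming.precedence
                || (top.precedence == incoming.precedence && incoming.left_assoc);
            if (!reduce_first) break;
            reduce(values, ops);
        }
        ops.push_back(c);
        ++i;
        expect_operand = true;
    }
    while (!ops.empty()) {
        if (ops.back() == '(') throw std::runtime_error("unbalanced '('");
        reduce(values, ops);
    }
    if (values.size() != 1) throw std::runtime_error("malformed expression");
    return values.back();
}
```

In this sketch, `evaluate("2 + 3 * -7")` returns -19 and `evaluate("(2 + 3) * 7")` returns 35, matching the results shown for the example program.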
chomsky_hierarchy.cpp
Examples of grammars for regular, context-free, and context-sensitive languages.
csv_1.cpp
csv_2.cpp
Two simple examples for grammars that parse different kinds of CSV-style file formats.
dispatch.cpp
A short example for the action dispatch facility.
dynamic_match.cpp
Shows a rule that uses run-time information to decide a match with a grammar similar to raw string.
expression.cpp
A work-in-progress expression evaluation example that supports prefix, infix and postfix operators as well as the ternary operator. The set of operators and their precedences can be easily adapted.
hello_world.cpp
The reverse "hello world" example from the introduction.
indent_aware.cpp
Shows one approach to implementing an indentation-aware language with a very small subset of Python.
iri_struct.cpp
Shows how to use the IRI grammar to parse an IRI into a data structure. Parses its command line arguments.
json_analyze.cpp
Performs a grammar analysis on the JSON grammar to check for problems.
json_ast.cpp
Combines the JSON grammar with the parse tree to build a generic data structure while parsing.
json_build.cpp
Extends json_parse.cpp by parsing JSON files into a generic JSON data structure.
json_count.cpp
Shows how to use a simple custom control to create some parsing statistics while parsing JSON files.
json_coverage.cpp
Combines the JSON grammar with the rule coverage to show parsing statistics.
json_parse.cpp
Shows how to use the custom error messages defined in json_errors.hpp with the JSON grammar to parse command line arguments as JSON data.
json_print_debug.cpp
Calls the print_debug() function to list the rules of the JSON grammar.
json_print_names.cpp
Calls the print_names() function to list the rules of the JSON grammar.
json_record.cpp
Shows how to combine the JSON grammar with the record facility to create a linear record of selected rule matches.
json_stream.cpp
Shows how to use the JSON grammar with an auto-discarding stream input.
json_tokens.cpp
Shows how to split JSON parsing into separate lexer and parser stages as is common when not using the PEG formalism.
json_trace.cpp
Combines the JSON grammar with the parse trace to show how to trace a parse.
lua53_analyze.cpp
Performs a grammar analysis on the Lua 5.3 grammar to check for problems.
lua53_parse.cpp
Parses Lua 5.3 source files with the combined experimental Lua grammar. Uses the command line arguments as files to parse.
modulus_match.cpp
Shows how to implement a parsing rule from scratch, in this case using the simplified calling convention. Parses its command line arguments.
$ build/bin/example/modulus_match a b c
'a' is NOT a match
'b' is NOT a match
'c' is a match
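The general shape of a rule written from scratch can be sketched as below. This is an illustration and not the code from modulus_match.cpp: `toy_input` is a hypothetical stand-in for a parser input, and the rule's behaviour (a character matches when its code modulo M equals R) is an assumption that happens to be consistent with the output above, since 'c' has the ASCII code 99.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical stand-in for a parser input; not the PEGTL input interface.
class toy_input {
public:
    explicit toy_input(std::string data) : data_(std::move(data)) {}
    bool empty() const { return pos_ == data_.size(); }
    char peek() const { return data_[pos_]; }
    void bump() { ++pos_; }

private:
    std::string data_;
    std::size_t pos_ = 0;
};

// A rule with a single static match() that receives only the input. It
// consumes one character on success and leaves the input untouched on failure.
template <unsigned M, unsigned R = 0>
struct modulus_rule {
    static_assert(M > 1, "modulus must be greater than 1");
    static_assert(R < M, "remainder must be smaller than the modulus");

    static bool match(toy_input& in)
    {
        if (!in.empty() && static_cast<unsigned char>(in.peek()) % M == R) {
            in.bump();
            return true;
        }
        return false;
    }
};

// True if every character of `text` is matched by Rule.
template <typename Rule>
bool matches_all(const std::string& text)
{
    toy_input in(text);
    while (!in.empty()) {
        if (!Rule::match(in)) {
            return false;
        }
    }
    return true;
}
```

Under these assumptions, `matches_all<modulus_rule<3>>` rejects "a" and "b" and accepts "c".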
parse_tree.cpp
An example for how to create a parse tree using <tao/pegtl/extra/parse_tree.hpp> with a simple expression grammar.
The example shows how to choose which rules will produce a parse tree node, which rules will store the content, and how to add additional transformations to the parse tree to transform it into an AST-like structure or to simplify it.
The output is in DOT format and can be converted into a graph.
$ build/src/example/parse_tree "(2*a + 3*b) / (4*n)" | dot -Tsvg -o parse_tree.svg
The above will generate an SVG file with a graphical representation of the parse tree.
proto3_analyze.cpp
Performs a grammar analysis on the Protocol Buffers (proto 3) grammar to check for problems.
proto3_parse.cpp
Shows how to parse Protocol Buffers files with the Protocol Buffers (proto 3) grammar. Uses the command line arguments as files to parse.
recover.cpp
An experiment in recovering from parse failures, see PEGTL issue 55 and the source code for a description.
s_expression.cpp
Defines and parses a simplified S-expression grammar. Also shows how to parse include files with nested parsing. Parses its command line arguments.
semver2_parse.cpp
Shows the SemVer Version grammar in action. Parses its command line arguments.
sum.cpp
Shows how to add comma-separated lists of floating-point numbers taken from std::cin.
$ echo "1, 2, 3.14159, 42e3" | build/bin/example/sum
Give me a comma separated list of numbers.
The numbers are added using the PEGTL.
Type [q or Q] to quit
parsing OK; sum = 42006.14159
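For comparison, the same computation can be sketched with the standard library alone. This is not how sum.cpp is implemented, since the example uses a PEGTL grammar and actions; the function `sum_list` is hypothetical and relies on `std::stod()` to convert each item.

```cpp
#include <cassert>
#include <cstddef>
#include <exception>
#include <sstream>
#include <string>

// Adds a comma-separated list of floating-point numbers. Returns false,
// leaving `sum` untouched, if an item is not a number. Hypothetical helper.
bool sum_list(const std::string& line, double& sum)
{
    std::istringstream items(line);
    std::string item;
    double total = 0.0;
    while (std::getline(items, item, ',')) {
        std::size_t used = 0;
        double value = 0.0;
        try {
            value = std::stod(item, &used);  // skips leading whitespace
        }
        catch (const std::exception&) {
            return false;  // not a number, or out of range
        }
        // Only whitespace may follow the number within an item.
        if (item.find_first_not_of(" \t\r\n", used) != std::string::npos) {
            return false;
        }
        total += value;
    }
    sum = total;
    return true;
}
```

Unlike a grammar, this sketch reports only success or failure and gives no position for an error.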
symbol_table.cpp
Shows how to parse and store integers in a simple symbol table. Each symbol needs to be defined before it is assigned to. Uses the command line arguments as files to parse.
$ cat /tmp/ramdisk/symbol_table.txt
def foo;
def bar;
foo = 42;
bar = 23;
$ build/bin/example/symbol_table /tmp/ramdisk/symbol_table.txt
bar = 23
foo = 42
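The define-before-assign rule can be sketched with a small class. The name `symbol_table`, the initial value of 0 and the rejection of duplicate definitions are assumptions of this sketch, not a description of the example's source. An ordered map is used so that iteration yields the names in sorted order, as in the output above.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Minimal symbol table: a symbol must be defined before it is assigned to.
// Hypothetical class for illustration only.
class symbol_table {
public:
    // Assumption: new symbols start at 0 and duplicates are an error.
    void define(const std::string& name)
    {
        if (!values_.emplace(name, 0L).second) {
            throw std::runtime_error("symbol already defined: " + name);
        }
    }

    void assign(const std::string& name, long value)
    {
        const auto it = values_.find(name);
        if (it == values_.end()) {
            throw std::runtime_error("unknown symbol: " + name);
        }
        it->second = value;
    }

    // Read-only access, iterated in alphabetical order of the names.
    const std::map<std::string, long>& values() const { return values_; }

private:
    std::map<std::string, long> values_;
};
```

In a parser, actions attached to the definition and assignment rules would call `define()` and `assign()` as the corresponding statements are matched.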
token_input_1.cpp
token_input_2.cpp
Show how to parse a sequence of tokens, rather than the usual sequence of char, where each token consists of an enum and a string.
unescape.cpp
Uses the building blocks from <tao/pegtl/contrib/unescape.hpp> to show how to actually unescape a string literal with various typical escape sequences.
Parses its command line arguments.
$ build/bin/example/unescape 'X\x22Y' 'X\u0022Y' 'X\"Y'
argv[ 1 ] = X"Y
argv[ 2 ] = X"Y
argv[ 3 ] = X"Y
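What unescaping involves can be sketched in plain C++. The sketch below does not use the building blocks from the PEGTL; the function names are hypothetical, only a few escape sequences are handled, and `\u` escapes are limited to code points outside the surrogate range, which is a simplification.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Returns the value of a hexadecimal digit, or -1 if `c` is not one.
int hex_digit(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Reads `digits` hex digits following position `i`; leaves `i` on the last one.
unsigned read_hex(const std::string& in, std::size_t& i, int digits)
{
    unsigned value = 0;
    for (int k = 0; k < digits; ++k) {
        if (++i >= in.size()) throw std::runtime_error("truncated escape");
        const int d = hex_digit(in[i]);
        if (d < 0) throw std::runtime_error("invalid hex digit");
        value = value * 16 + static_cast<unsigned>(d);
    }
    return value;
}

// Appends a code point up to U+FFFF as UTF-8; surrogates are not handled.
void append_utf8(std::string& out, unsigned cp)
{
    if (cp >= 0xD800 && cp <= 0xDFFF) throw std::runtime_error("surrogate not supported");
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    }
    else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    else {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Replaces the supported escape sequences in `in` by the characters they denote.
std::string unescape(const std::string& in)
{
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] != '\\') { out += in[i]; continue; }
        if (++i == in.size()) throw std::runtime_error("dangling backslash");
        switch (in[i]) {
            case 'n': out += '\n'; break;
            case 't': out += '\t'; break;
            case '"': case '\'': case '\\': out += in[i]; break;
            case 'x': out += static_cast<char>(read_hex(in, i, 2)); break;
            case 'u': append_utf8(out, read_hex(in, i, 4)); break;
            default: throw std::runtime_error("unknown escape sequence");
        }
    }
    return out;
}
```

Applied to the three arguments of the command line above, this sketch yields `X"Y` each time.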
uri_print_debug.cpp
Shows how to use print_debug() from include/tao/pegtl/debug/print.hpp to print all rules of the URI grammar.
uri_print_names.cpp
Shows how to use print_names() from include/tao/pegtl/debug/print.hpp to print all rules of the URI grammar.
uri_struct.cpp
Shows how to use the URI grammar to parse a URI into a data structure. Parses its command line arguments.
uri_trace.cpp
Shows how to use complete_trace from include/tao/pegtl/debug/trace.hpp to parse a URI with a complete trace.
Parses its command line arguments.
Index
- abnf_abnf.hpp (grammar)
- abnf_core.hpp (grammar)
- abnf2pegtl.cpp (program)
- abnf_record.cpp (program)
- analyze.cpp (program)
- behaviour.cpp (program)
- calculator.cpp (program)
- chomsky_hierarchy.cpp (program)
- csv_1.cpp (program)
- csv_2.cpp (program)
- dispatch.cpp (program)
- dynamic_match.cpp (program)
- escaped.hpp (grammar)
- expression.cpp (program)
- fp.hpp (grammar)
- hello_world.cpp (program)
- http.hpp (grammar)
- indent_aware.cpp (program)
- ipv4.hpp (grammar)
- ipv6.hpp (grammar)
- iri.hpp (grammar)
- iri_struct.cpp (program)
- json.hpp (grammar)
- json_analyze.cpp (program)
- json_ast.cpp (program)
- json_build.cpp (program)
- json_count.cpp (program)
- json_coverage.cpp (program)
- json_parse.cpp (program)
- json_pointer.hpp (grammar)
- json_print_debug.cpp (program)
- json_print_names.cpp (program)
- json_record.cpp (program)
- json_stream.cpp (program)
- json_tokens.cpp (program)
- json_trace.cpp (program)
- lua53.hpp (grammar)
- lua53_analyze.cpp (program)
- lua53_parse.cpp (program)
- modulus_match.cpp (program)
- parse_tree.cpp (program)
- proto3.hpp (grammar)
- proto3_analyze.cpp (program)
- proto3_parse.cpp (program)
- recover.cpp (program)
- s_expression.cpp (program)
- semver2.hpp (grammar)
- semver2_parse.cpp (program)
- sum.cpp (program)
- symbol_table.cpp (program)
- token_input_1.cpp (program)
- token_input_2.cpp (program)
- unescape.cpp (program)
- uri.hpp (grammar)
- uri_print_debug.cpp (program)
- uri_print_names.cpp (program)
- uri_struct.cpp (program)
- uri_trace.cpp (program)
This page is part of the PEGTL and its documentation.
Copyright (c) 2014-2026 Dr. Colin Hirsch and Daniel Frey
Distributed under the Boost Software License, Version 1.0
See accompanying file LICENSE_1_0.txt or copy at https://www.boost.org/LICENSE_1_0.txt