Markup & Data Format Support

March 23, 2026

Lambda parses a wide variety of text and binary formats into a single, uniform in-memory representation — the Lambda/Mark node tree — so that all downstream transformation, validation, layout and rendering code works on the same data model regardless of the original source format.

[Source file]  →  [parser]  →  [Lambda/Mark tree]  →  [transform / validate / render]

This document covers three groups of supported formats:

  1. Lightweight markup languages — human-authored prose that maps to the Mark Doc element schema
  2. Data-interchange formats — structured data that maps to Lambda maps, arrays, and elements
  3. Other formats — PDF, email, calendars, graphs, CSS, and more

1. Lightweight Markup Languages

All lightweight markup flavors (Markdown, reStructuredText, AsciiDoc, …) are parsed into the same Mark Doc schema: an element tree rooted at <doc>. Block structure (headings, paragraphs, lists, code blocks, tables) and inlines (emphasis, links, code spans) are mapped to HTML-compatible element names where an HTML equivalent exists, with a small set of custom Mark elements for features that have no direct HTML analogue (math, citations, footnotes, raw pass-through). See Doc Schema for the full schema reference.

1.1 Supported Flavors

| Format | Input type string | Notes |
| --- | --- | --- |
| CommonMark / GitHub Flavored Markdown | markdown | 100% CommonMark test-suite pass rate |
| reStructuredText | rst | |
| MediaWiki / DokuWiki markup | wiki | |
| AsciiDoc | asciidoc / adoc | |
| Emacs Org-mode | org | |
| Textile | textile | |
| troff/man pages | man | |
| MDX (Markdown + JSX) | mdx | inlines React/JSX components |
| LaTeX | latex | see also §3.1 |
| HTML5 | html / html5 | 100% html5lib test-suite pass rate |

In Lambda scripts all of these are loaded with input():

let md   = input("readme.md",    'markdown')
let rst  = input("design.rst",   'rst')
let wiki = input("page.wiki",    'wiki')
let adoc = input("guide.adoc",   'asciidoc')
let html = input("index.html",   'html')

1.2 Unified Mark Doc Schema

Every markup parser produces the same element-tree shape. Common block and inline element names are intentionally aligned with HTML so they pass directly to the Radiant layout engine without a translation step.

| Category | Mark element(s) | Notes |
| --- | --- | --- |
| Document root | <doc> | carries <meta> as first child |
| Document metadata | <meta title: … author: … date: …> | unified across all flavors |
| Headings | <h1> to <h6> | |
| Paragraph | <p> | |
| Block quote | <blockquote> | |
| Code block | <pre><code class:"language-X" …> | |
| Horizontal rule | <hr> | |
| Unordered list / item | <ul> / <li> | |
| Ordered list / item | <ol start:N> / <li> | |
| Definition list | <dl> / <dt> / <dd> | |
| Table | <table> / <thead> / <tbody> / <tr> / <th> / <td> | |
| Figure / caption | <figure> / <figcaption> | |
| Bold / strong | <strong> | |
| Italic / emphasis | <em> | |
| Inline code | <code> | |
| Hyperlink | <a href:…> | |
| Image | <img src:… alt:…> | |
| Line break | <br> | |
| Math (inline) | <math mode:'inline' …> | custom Mark element |
| Math (display) | <math mode:'display' …> | custom Mark element |
| Footnote | <footnote id:…> | custom Mark element |
| Citation | <cite keys:[…]> | custom Mark element |
| Raw pass-through | <raw format:'html' …> | custom Mark element |

1.3 Side-by-Side: Markdown → Mark Doc

Markdown source:

# Hello World

Some **bold** and *italic* text
with a [link](https://example.com).

- item one
- item two

| Name  | Age |
|-------|-----|
| Alice | 30  |
| Bob   | 25  |

Mark Doc (Lambda element tree):

<doc
  <meta>
  <h1 "Hello World">
  <p
    "Some "
    <strong "bold">
    " and "
    <em "italic">
    " text with a "
    <a href:"https://example.com" "link">
    "."
  >
  <ul
    <li "item one">
    <li "item two">
  >
  <table
    <thead
      <tr <th "Name"> <th "Age">>
    >
    <tbody
      <tr <td "Alice"> <td "30">>
      <tr <td "Bob">   <td "25">>
    >
  >
>

1.4 HTML → Mark Doc

HTML5 is a first-class input format. The tree structure is preserved verbatim; the <doc> wrapper is added for schema uniformity.

HTML5 source:

<!DOCTYPE html>
<html lang="en">
<head><title>Page</title></head>
<body>
  <h1 class="title">Hello</h1>
  <p>A <em>simple</em> page.</p>
</body>
</html>

Mark (Lambda element tree):

<doc
  <html lang:"en"
    <head <title "Page">>
    <body
      <h1 class:"title" "Hello">
      <p "A " <em "simple"> " page.">
    >
  >
>

2. Data-Interchange Formats

Structured data formats (JSON, XML, YAML, TOML, CSV, …) map to Lambda's map ({…}), array ([…]), and element (<tag …>) literals. Once parsed you have a live Lambda value that you can query, transform, and export to any other format with a single pipeline.

2.1 Supported Formats

| Format | Input type string | Lambda data model |
| --- | --- | --- |
| JSON | json | maps + arrays |
| XML | xml | element tree |
| YAML 1.2 | yaml | maps + arrays |
| TOML | toml | maps + arrays |
| CSV | csv | array of maps |
| INI | ini | nested maps |
| Java .properties | properties | flat map |
| Key-value pairs | kv | flat map |

2.2 JSON → Lambda Map / Array

JSON is the closest format to Lambda's native literal syntax. Object keys become unquoted map keys; value types are preserved.

JSON:

{
  "name": "Alice",
  "age": 30,
  "active": true,
  "score": 9.5,
  "tags": ["dev", "python"],
  "address": {
    "city": "New York",
    "zip": "10001"
  },
  "nickname": null
}

Lambda / Mark:

{
  name:    "Alice",
  age:     30,
  active:  true,
  score:   9.5,
  tags:    ["dev", "python"],
  address: {
    city: "New York",
    zip:  "10001"
  },
  nickname: null
}

Loading and querying in Lambda:

let data = input("users.json", 'json')
data.name               // "Alice"
data.tags[0]            // "dev"
data.address.city       // "New York"

2.3 XML → Lambda Element Tree

XML documents map to Lambda elements — named nodes that carry both attributes (as key-value pairs) and an ordered list of children (text nodes or nested elements). This mirrors Mark Notation exactly.

XML:

<?xml version="1.0"?>
<library>
  <book id="1" lang="en">
    <title>Clean Code</title>
    <author>Robert Martin</author>
    <year>2008</year>
  </book>
  <book id="2" lang="en">
    <title>The Pragmatic Programmer</title>
    <author>Hunt &amp; Thomas</author>
    <year>1999</year>
  </book>
</library>

Lambda / Mark:

<library
  <book id:1 lang:"en"
    <title "Clean Code">
    <author "Robert Martin">
    <year "2008">
  >
  <book id:2 lang:"en"
    <title "The Pragmatic Programmer">
    <author "Hunt & Thomas">
    <year "1999">
  >
>

Querying with the ? operator:

let lib = input("library.xml", 'xml')
lib?<book>              // all book elements
lib?<book lang:"en">    // books where lang == "en"
lib?<book> | ~.title    // ["Clean Code", "The Pragmatic Programmer"]
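
As a rough illustration of the element model described above (a tag name, an attribute map, and an ordered list of children), here is a minimal Python sketch built on the standard library's xml.etree.ElementTree. It is an analogue, not Lambda's parser: the (tag, attrs, children) triple and the find_all helper are hypothetical names, and this sketch keeps attribute values as strings.

```python
import xml.etree.ElementTree as ET

XML_TEXT = """<library>
  <book id="1" lang="en">
    <title>Clean Code</title>
    <author>Robert Martin</author>
  </book>
  <book id="2" lang="en">
    <title>The Pragmatic Programmer</title>
    <author>Hunt &amp; Thomas</author>
  </book>
</library>"""


def to_element(node):
    """Convert an ElementTree node into a (tag, attrs, children) triple."""
    children = []
    if node.text and node.text.strip():
        children.append(node.text.strip())
    for child in node:
        children.append(to_element(child))
        # Text that follows a child element is stored on the child's tail.
        if child.tail and child.tail.strip():
            children.append(child.tail.strip())
    return (node.tag, dict(node.attrib), children)


def find_all(element, tag, **attrs):
    """Collect descendants with the given tag and attribute values (hypothetical helper)."""
    found = []
    for child in element[2]:
        if isinstance(child, tuple):
            if child[0] == tag and all(child[1].get(k) == v for k, v in attrs.items()):
                found.append(child)
            found.extend(find_all(child, tag, **attrs))
    return found


library = to_element(ET.fromstring(XML_TEXT))
books = find_all(library, "book", lang="en")
titles = [find_all(book, "title")[0][2][0] for book in books]
print(titles)  # ['Clean Code', 'The Pragmatic Programmer']
```

The entity &amp; is decoded during parsing, so the second author's text child is "Hunt & Thomas", matching the tree shown above.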

2.4 YAML → Lambda Map / Array

YAML documents (including multi-document streams) map to the same map/array model as JSON. All YAML scalar types (strings, ints, floats, booleans, nulls, timestamps) are converted to their Lambda equivalents.

YAML:

server:
  host: localhost
  port: 8080
  tls: true

database:
  engine: postgres
  name: mydb
  pool: 5

features:
  - auth
  - logging
  - metrics

Lambda / Mark:

{
  server: {
    host: "localhost",
    port: 8080,
    tls:  true
  },
  database: {
    engine: "postgres",
    name:   "mydb",
    pool:   5
  },
  features: ["auth", "logging", "metrics"]
}

2.5 TOML → Lambda Map

TOML section headers become nested map keys, dotted keys are expanded into nested maps, and repeated [[array-of-tables]] headers become an array of maps.

TOML:

title = "My App"
version = "1.0.0"

[server]
host = "0.0.0.0"
port = 3000

[server.tls]
enabled = true
cert = "/etc/ssl/cert.pem"

[[plugins]]
name = "auth"
enabled = true

[[plugins]]
name = "cache"
enabled = false

Lambda / Mark:

{
  title:   "My App",
  version: "1.0.0",
  server: {
    host: "0.0.0.0",
    port: 3000,
    tls: {
      enabled: true,
      cert: "/etc/ssl/cert.pem"
    }
  },
  plugins: [
    {name: "auth",  enabled: true},
    {name: "cache", enabled: false}
  ]
}
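
The dotted-key expansion mentioned above can be sketched in a few lines of Python. This is only an illustration of the nesting rule, not a TOML parser and not Lambda's implementation; expand_dotted is a hypothetical helper name and the flat input map is made up for the example.

```python
def expand_dotted(flat):
    """Expand keys such as "server.tls.enabled" into nested dictionaries."""
    nested = {}
    for dotted_key, value in flat.items():
        *parents, leaf = dotted_key.split(".")
        node = nested
        for part in parents:
            # Create the intermediate map on first use, then descend into it.
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


flat = {
    "title": "My App",
    "server.host": "0.0.0.0",
    "server.port": 3000,
    "server.tls.enabled": True,
    "server.tls.cert": "/etc/ssl/cert.pem",
}

config = expand_dotted(flat)
print(config["server"]["tls"]["enabled"])  # True
```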

2.6 CSV → Lambda Array of Maps

The first row is treated as a header and becomes the map keys for every subsequent row. Values are kept as strings unless numeric coercion is explicitly requested.

CSV:

name,age,city,score
Alice,30,New York,9.5
Bob,25,Los Angeles,8.2
Carol,35,Chicago,9.8

Lambda / Mark:

[
  {name:"Alice", age:"30", city:"New York",     score:"9.5"},
  {name:"Bob",   age:"25", city:"Los Angeles",  score:"8.2"},
  {name:"Carol", age:"35", city:"Chicago",      score:"9.8"}
]

Typical processing pipeline:

let rows = input("data.csv", 'csv')

// filter and project
for (r in rows where num(r.age) >= 30)
  {name: r.name, city: r.city}
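
For readers more familiar with Python, the same header-row mapping and filter-and-project step can be approximated with the standard csv module. This is an analogue under that assumption, not Lambda code; as in the section above, values arrive as strings and must be coerced explicitly before a numeric comparison.

```python
import csv
import io

CSV_TEXT = """name,age,city,score
Alice,30,New York,9.5
Bob,25,Los Angeles,8.2
Carol,35,Chicago,9.8
"""

# The header row supplies the keys; every later row becomes one map.
rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))

# Filter on a coerced number, then project two fields.
selected = [
    {"name": row["name"], "city": row["city"]}
    for row in rows
    if int(row["age"]) >= 30
]
print(selected)
```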

2.7 INI / .properties / Key-Value

Flat or lightly nested configuration formats map to one or two levels of maps.

INI:

[database]
host = localhost
port = 5432
name = mydb

[cache]
host = localhost
port = 6379
ttl  = 300

Lambda / Mark:

{
  database: {
    host: "localhost",
    port: "5432",
    name: "mydb"
  },
  cache: {
    host: "localhost",
    port: "6379",
    ttl:  "300"
  }
}

.properties files (no sections) become a flat map:

// app.properties:  app.name=MyApp\napp.version=2.0
let cfg = input("app.properties", 'properties')
cfg."app.name"    // "MyApp"
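
The two shapes described in this section (sections as one level of nesting, .properties as a flat map with dotted names kept as literal keys) can be mimicked in Python with configparser and a plain split. This is a sketch of the resulting data shapes only, assuming simple key=value lines without escapes or continuations; it does not reflect how Lambda parses these files.

```python
import configparser

INI_TEXT = """[database]
host = localhost
port = 5432
name = mydb

[cache]
host = localhost
port = 6379
ttl  = 300
"""

parser = configparser.ConfigParser()
parser.read_string(INI_TEXT)

# One map per section; values stay strings, as in the table above.
config = {section: dict(parser[section]) for section in parser.sections()}
print(config["cache"]["ttl"])  # 300

PROPS_TEXT = "app.name=MyApp\napp.version=2.0\n"

# No sections: each line becomes one entry in a flat map.
props = dict(
    line.split("=", 1)
    for line in PROPS_TEXT.splitlines()
    if line and not line.startswith("#")
)
print(props["app.name"])  # MyApp
```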

2.8 Format Conversion

Any supported data format can be converted to any other, either with Lambda's format() function or with the lambda convert CLI command:

// In a Lambda script
let data = input("config.yaml", 'yaml')
format(data, 'json')          // → JSON string
format(data, 'toml')          // → TOML string

# CLI
lambda convert config.yaml -t json -o config.json
lambda convert data.csv  -t yaml -o data.yaml
lambda convert page.md   -t html -o page.html
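
The idea behind conversion, parsing into one shared in-memory value and then serializing that value to the target format, can be sketched in Python as a pair of lookup tables. The PARSERS and FORMATTERS registries and the convert function are hypothetical names for this illustration and cover only JSON and CSV; they say nothing about how Lambda's format() is implemented.

```python
import csv
import io
import json


def parse_csv(text):
    """Parse CSV text into a list of maps keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def format_csv(rows):
    """Serialize a non-empty list of maps back to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


PARSERS = {"json": json.loads, "csv": parse_csv}
FORMATTERS = {"json": lambda value: json.dumps(value, indent=2), "csv": format_csv}


def convert(text, source_type, target_type):
    """Parse into the shared model, then serialize to the target format."""
    value = PARSERS[source_type](text)
    return FORMATTERS[target_type](value)


csv_text = "name,age\nAlice,30\nBob,25\n"
as_json = convert(csv_text, "csv", "json")
back_to_csv = convert(as_json, "json", "csv")
print(as_json)
```

Because every parser targets the same model, supporting a new format in this sketch means adding one parser and one formatter rather than one converter per pair of formats.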

3. Other Formats

3.1 LaTeX

LaTeX source (.tex) is parsed by a Tree-sitter–based parser into an element tree closely following the Mark Doc schema, with custom elements for LaTeX-specific constructs (<math>, <cite>, <env name:…>, <cmd name:…>). The lambda view command renders .tex files by first converting them to HTML.

let doc = input("paper.tex", 'latex')
format(doc, 'html')           // convert to HTML

3.2 PDF

PDF documents are parsed into a Mark element tree with best-effort text flow reconstruction. Binary streams inside the PDF are decompressed before parsing.

let report = input("annual_report.pdf", 'pdf')
report?<p> | ~[0]             // first paragraph text

3.3 RTF

Rich Text Format documents are parsed into the same Mark Doc element schema, preserving text runs, paragraph styles, and basic table structure.

let doc = input("letter.rtf", 'rtf')

3.4 Email (EML / RFC 822)

Email files are parsed into a map with well-known header fields and a content body. MIME multipart messages yield the body as an array of parts.

EML source (excerpt):

From: alice@example.com
To: bob@example.com
Subject: Hello
Date: Mon, 1 Jan 2024 10:00:00 +0000
MIME-Version: 1.0
Content-Type: text/plain

Hi Bob, just checking in.

Lambda / Mark:

{
  from:    "alice@example.com",
  to:      "bob@example.com",
  subject: "Hello",
  date:    t'2024-01-01T10:00:00Z',
  body:    "Hi Bob, just checking in."
}
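
A comparable map can be built in Python with the standard email package, which may help when checking what the header-to-field mapping should look like. This is an analogue only; the lower-case key names follow the table above, and returning a list of part payloads for multipart messages is an assumption modelled on the description in this section.

```python
from email import message_from_string
from email.utils import parsedate_to_datetime

EML_TEXT = """From: alice@example.com
To: bob@example.com
Subject: Hello
Date: Mon, 1 Jan 2024 10:00:00 +0000
MIME-Version: 1.0
Content-Type: text/plain

Hi Bob, just checking in.
"""

msg = message_from_string(EML_TEXT)

# Multipart messages expose a list of parts; single-part messages a string.
if msg.is_multipart():
    body = [part.get_payload() for part in msg.get_payload()]
else:
    body = msg.get_payload().strip()

email_map = {
    "from": msg["From"],
    "to": msg["To"],
    "subject": msg["Subject"],
    "date": parsedate_to_datetime(msg["Date"]),  # timezone-aware datetime
    "body": body,
}
print(email_map["date"].isoformat())  # 2024-01-01T10:00:00+00:00
```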

3.5 vCard (VCF)

Contact cards are parsed into maps following the vCard 3.0 / 4.0 property names.

let contact = input("alice.vcf", 'vcf')
contact.fn        // "Alice Wonderland"
contact.email     // "alice@example.com"
contact.tel       // "+1-555-0100"

3.6 iCalendar (ICS)

Calendar files are parsed into a map with a vcalendar root that holds arrays of vevent, vtodo, and vjournal components.

let cal = input("events.ics", 'ics')
cal.vcalendar.vevent | ~.summary    // ["Team standup", "Sprint review", …]

3.7 Graph Formats

Diagram description languages are parsed into a graph element tree (<graph> containing <node> and <edge> elements). Three flavors are supported via the graph parser type:

| Flavor | Input type | File extension |
| --- | --- | --- |
| Graphviz DOT | graph / dot | .dot, .gv |
| D2 diagrams | graph / d2 | .d2 |
| Mermaid diagrams | graph / mermaid | .mmd |

let g = input("arch.dot", 'graph')
g?<node> | ~.id          // list all node IDs
g?<edge src:"a">         // edges leaving node "a"

3.8 CSS

CSS stylesheets are parsed into a rule list — an array of maps with selector and declarations fields.

CSS source:

body {
  font-family: sans-serif;
  margin: 0;
}

h1, h2 {
  color: #333;
  font-weight: bold;
}

Lambda / Mark:

[
  {
    selector: "body",
    declarations: {
      "font-family": "sans-serif",
      margin: "0"
    }
  },
  {
    selector: "h1, h2",
    declarations: {
      color: "#333",
      "font-weight": "bold"
    }
  }
]
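
To make the rule-list shape concrete, here is a deliberately small Python sketch that turns flat CSS rules into the same array of selector/declarations maps. It is a regex-based approximation, not Lambda's CSS parser: it assumes no comments, no at-rules, no nested blocks, and no semicolons inside values. parse_css is a hypothetical name.

```python
import re

# One rule: everything before "{" is the selector, everything inside is the block.
RULE_RE = re.compile(r"([^{}]+)\{([^{}]*)\}")


def parse_css(text):
    """Parse flat CSS rules into a list of selector/declarations maps."""
    rules = []
    for selector, block in RULE_RE.findall(text):
        declarations = {}
        for declaration in block.split(";"):
            if ":" in declaration:
                prop, value = declaration.split(":", 1)
                declarations[prop.strip()] = value.strip()
        rules.append({"selector": selector.strip(), "declarations": declarations})
    return rules


CSS_TEXT = """
body {
  font-family: sans-serif;
  margin: 0;
}

h1, h2 {
  color: #333;
  font-weight: bold;
}
"""

rules = parse_css(CSS_TEXT)
print(rules[1]["selector"])  # h1, h2
```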

3.9 JSX / MDX

JSX and MDX files interleave markup with code expressions. They are parsed into element trees where JSX component invocations become elements with the component name preserved as the tag, and {expression} slots are captured as <expr> children.

let page = input("Page.mdx", 'mdx')
page?<Card>               // all <Card> component usages

3.10 Math

Mathematical notation can be parsed standalone from LaTeX math or AsciiMath sources into a <math> element tree:

let expr = input("formula.tex", 'math')   // math-only LaTeX
let expr = input("formula.asc", 'math-ascii')

3.11 Directory Listing

A local directory path can be treated as an input, producing an array of file-info maps:

let files = input("./src", 'dir')
files that ~.ext == ".cpp" | ~.name     // list all .cpp filenames
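
A Python analogue of the directory input, using pathlib, shows the kind of file-info maps involved. The name and ext fields come from the example above; size and is_dir are assumptions added for illustration, and list_dir is a hypothetical helper. The temporary directory only exists so the sketch runs anywhere.

```python
import tempfile
from pathlib import Path


def list_dir(path):
    """Return one file-info map per directory entry, sorted by path."""
    return [
        {
            "name": entry.name,
            "ext": entry.suffix,
            "size": entry.stat().st_size,  # assumed field
            "is_dir": entry.is_dir(),      # assumed field
        }
        for entry in sorted(Path(path).iterdir())
    ]


with tempfile.TemporaryDirectory() as tmp:
    for filename in ("main.cpp", "util.cpp", "notes.txt"):
        (Path(tmp) / filename).write_text("// placeholder\n")
    files = list_dir(tmp)

cpp_names = [info["name"] for info in files if info["ext"] == ".cpp"]
print(cpp_names)  # ['main.cpp', 'util.cpp']
```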

4. Using the Input Function

All formats are accessed through the same input() built-in:

// Explicit type
let data = input("file.ext", 'format')

// Auto-detect from MIME / file extension
let data = input("data.json")

// From a URL (HTTP/HTTPS)
let data = input("https://api.example.com/data.json")

// From a string in memory
let data = input_str(raw_string, 'yaml')

CLI equivalent — lambda convert:

lambda convert input.yaml  -t json   -o output.json
lambda convert input.md    -t html   -o output.html
lambda convert input.csv   -t yaml   -o output.yaml

MIME auto-detection is built in: when no type string is given, Lambda inspects the file header bytes and extension to select the right parser automatically.
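
One plausible shape for this kind of detection is sketched below in Python: check a few well-known leading byte signatures first, then fall back to the file extension. The ordering, the two signatures, the extension table, and the detect_type name are all assumptions made for illustration; the actual rules Lambda applies are not specified here.

```python
from pathlib import Path

# Leading byte signatures (assumed subset): PDF files start with "%PDF-",
# RTF files start with "{\rtf".
MAGIC_PREFIXES = [
    (b"%PDF-", "pdf"),
    (b"{\\rtf", "rtf"),
]

# Extension fallback (assumed subset), using type strings from this document.
EXTENSION_TYPES = {
    ".json": "json",
    ".yaml": "yaml",
    ".yml": "yaml",
    ".toml": "toml",
    ".csv": "csv",
    ".xml": "xml",
    ".md": "markdown",
    ".html": "html",
}


def detect_type(filename, head):
    """Pick a parser type from the first bytes, then from the extension."""
    for prefix, type_name in MAGIC_PREFIXES:
        if head.startswith(prefix):
            return type_name
    return EXTENSION_TYPES.get(Path(filename).suffix.lower())


print(detect_type("report.bin", b"%PDF-1.7\n"))  # pdf
print(detect_type("config.YAML", b"server:\n"))  # yaml
print(detect_type("mystery", b""))               # None
```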


5. Format Summary

| Category | Formats |
| --- | --- |
| Lightweight markup | Markdown (GFM), HTML5, reStructuredText, AsciiDoc, Wiki, Org-mode, Textile, troff/man, MDX, LaTeX |
| Data interchange | JSON, XML, YAML 1.2, TOML, CSV, INI, Java .properties, key-value |
| Document / rich text | PDF, RTF, LaTeX (.tex) |
| Personal data | vCard (VCF), iCalendar (ICS), Email (EML / RFC 822) |
| Diagrams / graphs | Graphviz DOT, D2, Mermaid |
| Web / code | CSS, JSX, Math (LaTeX math, AsciiMath) |
| System | Directory listing, plain text |

All formats produce a Lambda/Mark node tree that can be uniformly queried with ?, transformed with pipes and for-expressions, validated against schemas, and exported to any other supported output format.


See also: Doc Schema · Data & Collections · System Functions · CLI Reference