flatdata Schema Language

May 26, 2026

Basic Types

Flatdata supports the following primitive types:

bool - boolean data type
i8 - signed 8-bit wide type
u8 - unsigned 8-bit wide type
i16 - signed 16-bit wide type
u16 - unsigned 16-bit wide type
i32 - signed 32-bit wide type
u32 - unsigned 32-bit wide type
i64 - signed 64-bit wide type
u64 - unsigned 64-bit wide type

Constants

Flatdata supports defining constants of basic types:

const <type> <name> = <value>;

Enumerations

Flatdata supports enumerations over basic types. Each enumeration value is assigned either automatically (the previous value + 1, starting at 0) or manually.

Each enumeration is defined as follows:

enum <name> : <type> [ : bits ] {
    <value name> [= value],
    ...
}

Flatdata auto-generates names for all missing values as UNKNOWN_VALUE_{X} or UNKNOWN_VALUE_MINUS_{X}. For example, if no name is specified for the value 5, and 5 can be represented in the specified number of bits, the generator emits UNKNOWN_VALUE_5 = 5. The main reason for this behaviour is that data read from files is inherently untrustworthy: even if a value is not mentioned in the schema, nothing prevents a malicious entity from writing it.
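The naming rule above can be sketched in a few lines of Python. This is only an illustration, not the generator's code: the function name fill_unknown_values and the sample enum are hypothetical, and the sketch assumes that signed enumerations cover the usual two's-complement range for the given number of bits.

```python
def fill_unknown_values(named, bits, signed=False):
    """Return a name -> value mapping covering every value representable in `bits` bits.

    `named` holds the values listed in the schema; all others get generated names.
    """
    if signed:
        # Assumption: two's-complement range for signed underlying types.
        lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    else:
        lo, hi = 0, (1 << bits) - 1

    taken = set(named.values())
    result = dict(named)
    for value in range(lo, hi + 1):
        if value in taken:
            continue
        if value < 0:
            result[f"UNKNOWN_VALUE_MINUS_{-value}"] = value
        else:
            result[f"UNKNOWN_VALUE_{value}"] = value
    return result


# Hypothetical enum over 3 bits with two named values.
names = fill_unknown_values({"RED": 0, "GREEN": 1}, bits=3)
```

With 3 bits, the mapping contains all eight representable values, six of them under generated names.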

The following restrictions for values are checked:

No duplicate values
Values must fit into the underlying type
Most possible values should be listed/named (>=50% +- 256), e.g. an enumeration over a full-width u16 should name about $2^{15}$ of its $2^{16}$ possible values

Structures

Flatdata structure definition syntax resembles known alternatives, albeit with notable differences:

Backward compatibility: the flatdata format has no built-in backward compatibility support. It is not meant to be used directly in communication protocols; well-known libraries exist that are well suited for that purpose. Flatdata is a create-once, read-extensively storage library.
Bit fields: unlike some other formats, flatdata supports bit fields natively and in a platform-independent fashion.

Each structure is defined as follows:

struct <Name> {
    <field> : <type> : <width>;
    ...
}

<type> can either be a basic type, or an enumeration.

Example:

struct Structure {
    field1 : u32 : 29;
    field2 : u8 : 2;
}

Every structure field specifies the target-language type used to represent the field and the field's size in bits. Flatdata takes care of packing and aligning the structures correctly as well as accessing them efficiently.
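To make the idea of bit-level packing concrete, here is a minimal Python sketch for the Structure example (a 29-bit field1 followed by a 2-bit field2). It assumes fields are laid out back to back, least significant bit first, in little-endian byte order, and rounded up to whole bytes. The real flatdata layout may differ, and the pack and unpack helpers are hypothetical.

```python
# (name, width in bits), in declaration order, mirroring the schema example.
FIELDS = [("field1", 29), ("field2", 2)]


def pack(values):
    """Pack field values back to back, least significant bit first."""
    word, offset = 0, 0
    for name, width in FIELDS:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit into {width} bits")
        word |= value << offset
        offset += width
    size = (offset + 7) // 8  # round the total bit count up to whole bytes
    return word.to_bytes(size, "little")


def unpack(data):
    """Extract every field again by shifting and masking."""
    word = int.from_bytes(data, "little")
    values, offset = {}, 0
    for name, width in FIELDS:
        values[name] = (word >> offset) & ((1 << width) - 1)
        offset += width
    return values


packed = pack({"field1": 123456789, "field2": 3})
```

The two fields take 31 bits in total, so this sketch stores them in 4 bytes instead of the 5 bytes that a plain u32 plus u8 would need.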

Resources

Archive resources can be one of the following types:

T - a single structure of given type
vector< T > - a vector of structures of a given type.
multivector< IndexSize, T1, T2, ... > - a heterogeneous associative container for storing multiple properties of a single entity. It allows efficient storage of data whose properties are sparsely assigned to each item. Think of it as a multimap of variants. IndexSize is the number of bits used for indexing the entities. Each index entry holds the offset at which a variant starts in the data.
raw_data - Uninterpreted raw data. Useful for storing arrays of non-numeric data like strings referenced from structures.
archive ArchiveName - an archive resource. Archive resources help structure large archives, while also acting as a namespace and as a way to make a group of resources optional together. The referenced archive type has to be defined.
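The "multimap of variants" idea behind multivector can be pictured with a small in-memory model. This is a conceptual sketch only: the MultiVector class, its methods, and the trailing end marker in the index are assumptions made for illustration and do not describe the on-disk format.

```python
class MultiVector:
    """Conceptual model: a flat list of variants plus an index of start offsets."""

    def __init__(self):
        self.index = [0]  # index[i] is where the variants of entity i start in `data`
        self.data = []    # flat list of (type_name, payload) variants

    def add_entity(self, variants):
        """Append one entity; an entity may carry zero or more variants."""
        self.data.extend(variants)
        self.index.append(len(self.data))  # assumed end marker for the last entity

    def get(self, i):
        """Return all variants stored for entity i."""
        return self.data[self.index[i]:self.index[i + 1]]


mv = MultiVector()
mv.add_entity([("Name", "Main St"), ("SpeedLimit", 50)])
mv.add_entity([])                       # an entity with no properties costs only an index entry
mv.add_entity([("SpeedLimit", 30)])
```

Entities without properties take no space in the data part, which is what makes sparse assignment cheap in this model.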

Comments

Flatdata schema supports C++-style comments. Comments located before structures/archives or their members will be available in generated code. Example:

/// A single secret. Might be important
struct Secret { importance : u64 : 64; }

/**
    * Very important archive
    */
archive TheBookOfSecrets {
    // More important secret
    secret1 : Secret;
    // Less important secret
    secret2 : Secret;
}

Decorations

Decorations declare additional properties of the entities they are applied to. The decorations currently supported are described below. Note that not all target languages provide full support for all decorations. For example, the dot generator uses decorations to group archive resources and create reference edges, while other generators mostly support only @optional.

Nonetheless, decorations are first-class citizens of the schema and are therefore also validated when an archive is opened.

Constant Referencing

@const( <name> ) can be added to fields of a structure to indicate in which locations a constant can appear, e.g.:

const u32 MY_CONST = 10;
struct MyStruct {
    @const( MY_CONST )
    my_value : u32 : 16;
}

Note: If a constant is not referenced anywhere, flatdata assumes that it is a global constant and includes it in the schema of every resource of every archive.

Optional

Resources

@optional can be applied to resources. If a resource is optional and missing, the archive can still be opened successfully. A resource of any type can be optional. Example:

archive Archive {
    @optional
    resource: vector< SomeStructure >;
}

Fields

@optional( <name> ) can be added to a field to mark a special constant value of the field as a sentinel value, making the whole field optional.

This special value is considered the none value of the field. Many language backends will use native optional data structures for such fields instead of the underlying integer type.
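How a backend might expose such a field can be sketched as follows. The constant INVALID_ID and the helper read_optional are hypothetical names chosen for this illustration; actual generated accessors will look different in each target language.

```python
from typing import Optional

# Hypothetical sentinel constant, standing in for the constant named in @optional( <name> ).
INVALID_ID = 0xFFFFFFFF


def read_optional(raw: int, sentinel: int = INVALID_ID) -> Optional[int]:
    """Map the stored sentinel to None and pass every other value through."""
    return None if raw == sentinel else raw
```

The stored representation stays a plain integer; only the accessor turns the sentinel into the language's native "no value".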

Explicit Reference

@explicit_reference declares an explicit reference of one resource's property to another resource. This is a very common type of referencing in flatdata and can be seen as a "Foreign Key", with the exception that consistency of the key is not enforced.

It is possible to define explicit reference with its target in a different archive, as long as it is defined.

Example:

struct Person {
    name : u64 : 64;
    first_child : u64 : 64;
}

archive Archive {
    @explicit_reference( Person.name, names )
    @explicit_reference( Person.first_child, children )
    people: vector< Person >;

    children: vector< Child >;

    names: raw_data;
}
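Because the consistency of such a reference is not enforced, a reader may want to check bounds before following it. The sketch below models the children and people resources as plain Python lists; the resolve helper and the sample data are hypothetical.

```python
def resolve(target, index):
    """Follow a reference; return None if it points outside the target resource."""
    if 0 <= index < len(target):
        return target[index]
    return None


# Hypothetical data standing in for the `children` and `people` resources.
children = ["Ann", "Bob"]
people = [
    {"first_child": 1},
    {"first_child": 7},  # dangling: nothing in the format prevents this
]

first = resolve(children, people[0]["first_child"])
dangling = resolve(children, people[1]["first_child"])
```

Here first is the second child, while dangling is None rather than an out-of-bounds access.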

Bound Implicitly

Sometimes it is useful to split a structure's fields across multiple resources (for example, to promote data locality when binary search is done extensively on a particular field). @bound_implicitly declares that such resources are grouped implicitly and therefore represent a single entity. The decoration also gives the entity a name:

@bound_implicitly( transactions: keys, transaction_data )
archive Archive {
    keys: vector< Key >;
    transaction_data : vector< Transaction >;
}
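The data-locality motivation can be illustrated with two parallel lists: the search touches only the compact keys list, and the matching position is then used to read the bulkier record. This is a sketch with made-up data; find_transaction is a hypothetical helper, and it assumes the keys are stored in sorted order.

```python
from bisect import bisect_left

# Two resources bound implicitly: element i of each list belongs to the same entity.
keys = [3, 8, 15, 42]                    # assumed sorted, small and cache-friendly
transaction_data = ["a", "b", "c", "d"]  # same length and order as `keys`


def find_transaction(key):
    """Binary-search the keys, then read the record at the same position."""
    i = bisect_left(keys, key)
    if i < len(keys) and keys[i] == key:
        return transaction_data[i]
    return None
```

Only after a key is found does the lookup touch transaction_data, so the search itself never loads the larger records.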

Entity Referencing

Resources and decorations can reference other entities declared in the schema. Types can be specified either with fully-qualified path or with local path, for example:

namespace N {
    struct T {
        ...
    }

    archive Archive {
        // Local path
        resource: vector< T >;
        // Fully-qualified path
        another_resource: vector< .N.T >;
    }
}

Local paths must be available in the current namespace.

Index Ranges

When flattening a data model into a flatdata schema one often encounters a pattern of storing index ranges as members of consecutive vector items:

struct Node {
    ...
    first_edge_ref : u32 : 32;
}

struct Edge {
    ...
}

archive Archive {
    // contains sentinel
    @explicit_reference( Node.first_edge_ref, edges )
    nodes : vector< Node >;
    edges : vector< Edge >;
}

In this case the edges of a node i are then retrieved as

edges.slice(nodes[i].first_edge_ref..nodes[i + 1].first_edge_ref)

Additionally, the last element of the nodes vector is usually a sentinel (only used to retrieve first_edge_ref). To simplify this, flatdata offers the @range(name_of_range_attribute) annotation:

struct Node {
    ...
    @range(edges_range)
    first_edge_ref : u32 : 32;
}

This will have two effects:

Adding an edges_range attribute that exposes the range (nodes[i].first_edge_ref, nodes[i + 1].first_edge_ref)
Hiding the sentinel in views (it still needs to be populated first, though)

Retrieving all edges of a node i is now as easy as this:

edges.slice(nodes[i].edges_range)
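The sentinel pattern and the derived range can be modelled with plain lists. This is a sketch with made-up data: first_edge_ref stands for the column of Node.first_edge_ref values, and edges_range, edges_of, and node_count are hypothetical helpers mirroring what a generated view might offer.

```python
# first_edge_ref values of three real nodes followed by one sentinel entry.
first_edge_ref = [0, 2, 2, 5]
edges = ["e0", "e1", "e2", "e3", "e4"]


def edges_range(i):
    """Range of edge indices of node i, derived from two consecutive entries."""
    return range(first_edge_ref[i], first_edge_ref[i + 1])


def edges_of(i):
    """All edges of node i."""
    r = edges_range(i)
    return edges[r.start:r.stop]


def node_count():
    """Number of visible nodes: the sentinel is stored but hidden."""
    return len(first_edge_ref) - 1
```

Node 1 has an empty range because its entry equals the next one, and the sentinel exists only so that the last real node has an end offset.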

Imports

Schemas can be split across multiple files using import statements. An import pulls in all definitions (structs, enums, constants, archives) from another file, making them available for use in the importing file.

import "path/to/types.flatdata";

Import statements must appear at the top of the file, before any namespace or type definitions.

Path Resolution

Import paths are resolved relative to the file containing the import statement:

import "types.flatdata";            // same directory
import "sub/geo_types.flatdata";    // subdirectory
import "../shared/common.flatdata"; // parent directory
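Resolution relative to the importing file can be sketched with Python's posixpath module. The helper resolve_import and the schema/main.flatdata location are assumptions made for illustration; the generator's own resolution logic is not shown here.

```python
import posixpath


def resolve_import(importing_file, import_path):
    """Resolve `import_path` relative to the directory of `importing_file`."""
    base = posixpath.dirname(importing_file)
    return posixpath.normpath(posixpath.join(base, import_path))


# Assuming the three imports above appear in schema/main.flatdata:
same_dir = resolve_import("schema/main.flatdata", "types.flatdata")
sub_dir = resolve_import("schema/main.flatdata", "sub/geo_types.flatdata")
parent = resolve_import("schema/main.flatdata", "../shared/common.flatdata")
```

Normalizing the joined path collapses the ".." segment, so the third import resolves to shared/common.flatdata.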

Diamond and Cyclic Imports

Diamond imports (the same file imported via multiple paths) are deduplicated automatically. Cyclic imports are also supported: a parent archive schema can import a child schema that imports the parent back.
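One common way to get both properties is to track the files already visited while walking the import graph. The sketch below shows that technique on a hypothetical graph; it is not the generator's implementation, and collect_files and the file names are made up.

```python
def collect_files(root, imports):
    """Return every file reachable from `root`, each exactly once."""
    seen, order, stack = set(), [], [root]
    while stack:
        path = stack.pop()
        if path in seen:
            continue  # diamond or cycle: this file was already handled
        seen.add(path)
        order.append(path)
        # Push in reverse so imports are visited in declaration order.
        stack.extend(reversed(imports.get(path, [])))
    return order


# Hypothetical import graph: a diamond on "common" and a cycle back to "main".
imports = {
    "main": ["a", "b"],
    "a": ["common"],
    "b": ["common", "main"],
}
files = collect_files("main", imports)
```

The visited set is what makes the walk terminate on the cycle and load common only once.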

Generated Code

For C++ and Rust, the generator uses separate compilation: only types from the root file are emitted, with include/import directives referencing the separately generated imported files. Each .flatdata file must be generated individually.

For Python, Dot, and Flatdata output, all types are emitted monolithically.

Example

schema/
├── types.flatdata
└── main.flatdata

// types.flatdata
namespace geo {
    struct Point {
        x : u32 : 32;
        y : u32 : 32;
    }
}

// main.flatdata
import "types.flatdata";
namespace app {
    archive Locations {
        points : vector< .geo.Point >;
    }
}

Generate each file separately:

flatdata-generator -s schema/types.flatdata -g cpp -O schema/types.h
flatdata-generator -s schema/main.flatdata -g cpp -O schema/main.h

The generated main.h will contain #include "types.h" and only define the app::Locations archive.

Rust Project Setup

Each generated Rust file must live in its own module, with all imported schemas as siblings under a common parent module:

my_crate/
├── build.rs
└── src/
    └── schema/
        ├── mod.rs          // pub mod types; pub mod main_schema;
        ├── types.rs        // include!(concat!(env!("OUT_DIR"), "/schema/types.rs"));
        └── main_schema.rs  // include!(concat!(env!("OUT_DIR"), "/schema/main.rs"));

The generated code uses pub use super::...::module::namespace::*; re-exports to wire imported types through the module hierarchy.