rdf-go

January 24, 2026 · View on GitHub

rdf-go is a small, fast RDF parsing/encoding library with streaming APIs and RDF-star support. It is designed for low allocations and for use in pipelines where RDF data should be processed incrementally.

About

rdf-go is developed by GeoKnoesis LLC, a company specializing in semantic web technologies and knowledge engineering.

Main Developer

Stephane Fellah - Principal Developer
Contact: stephanef@geoknoesis.com
Geosemantic-AI expert with 30 years of experience

Features

Streaming readers (pull style) and writers (push style).
Unified API: single Reader and Writer interfaces for all formats.
Statement type: represents either a triple (G is nil) or a quad (G is non-nil).
Convenience helper: Parse for streaming with handler functions.
RDF-star via TripleTerm values.
Multiple formats: Turtle, TriG, N-Triples, N-Quads, RDF/XML, JSON-LD.
Automatic format detection with FormatAuto.

What Makes This Library Unique

rdf-go stands out as a comprehensive, standards-compliant RDF library for Go with several distinguishing characteristics:

Standards Conformance

Full W3C Standards Support:

RDF 1.1 (W3C Recommendation, 2014): Complete conformance with RDF 1.1 Concepts, RDF 1.1 Turtle, RDF 1.1 N-Triples, RDF 1.1 TriG, and RDF 1.1 N-Quads specifications
RDF 1.2 (W3C Recommendation, 2024): Support for RDF 1.2 features including quoted triples (RDF-star) in Turtle, TriG, N-Triples, and N-Quads formats
JSON-LD 1.1 (W3C Recommendation, 2020): Full support for JSON-LD 1.1 Processing Algorithms, including remote context resolution, expansion, and RDF conversion
Turtle 1.1 & 1.2: Complete Turtle syntax support including collections, blank nodes, prefixes, base IRIs, and RDF-star quoted triples
RDF/XML 1.0: Full RDF/XML support with container membership expansion (rdf:Bag, rdf:Seq, rdf:Alt, rdf:List)

W3C Test Suite Compliance:

Passes official W3C RDF test suites for RDF 1.1 and RDF 1.2
Validated against W3C JSON-LD test suite (both 1.0 and 1.1)
Comprehensive compliance testing via TestW3CConformance() with support for manifest-based test execution

Unique Architecture

Unified API Design:

Single Reader and Writer interface for all 6 supported formats (Turtle, TriG, N-Triples, N-Quads, RDF/XML, JSON-LD)
No format-specific APIs to learn—same code works across all formats
Automatic format detection eliminates the need to specify format explicitly

Streaming-First Architecture:

True streaming parsers with O(1) memory usage (bounded only by security limits)
Pull-style readers for explicit control over parsing flow
Push-style writers for efficient encoding
Designed for processing large datasets that don't fit in memory

RDF-star (Quoted Triples) Support:

Native support for RDF-star quoted triples via TripleTerm type
Works seamlessly across Turtle, TriG, N-Triples, and N-Quads formats
Enables making statements about statements—a key feature for provenance, annotations, and reification

Performance & Efficiency

Optimized for Production:

Low-allocation design using strings.Builder and buffer reuse
Streaming architecture minimizes memory footprint
Typically processes 10K-100K+ triples/second depending on format
Comprehensive benchmark suite for performance regression testing

Security & Limits:

Built-in security limits for untrusted input (max depth, max line bytes, max statement bytes)
OptSafeLimits() for conservative defaults suitable for untrusted data
Structured error codes for programmatic error handling and recovery

Developer Experience

Simple, Intuitive API:

Optional context parameter (defaults to context.Background() when nil)
Convenient Parse() function for common use cases
Clear separation between triples and quads via IsTriple() and IsQuad() methods
Comprehensive error handling with structured error codes

Production-Ready:

Extensive test coverage (70%+)
Performance regression tests ensure consistent performance
Comprehensive error handling documentation
Well-documented with examples for all major use cases

Format Coverage

Complete Format Support:

Triple formats: Turtle, N-Triples, RDF/XML, JSON-LD
Quad formats: TriG, N-Quads, JSON-LD (with named graphs)
All formats support: Parsing ✅, Encoding ✅, Blank Nodes ✅
RDF-star support: Turtle ✅, TriG ✅, N-Triples ✅, N-Quads ✅

Advanced Features:

RDF/XML container membership expansion (rdf:li → rdf:_n)
JSON-LD remote context resolution via DocumentLoader
JSON-LD 1.1 features (processing modes, RDF direction, native types)
Deterministic output for Turtle, TriG, N-Triples, N-Quads, and RDF/XML

Why Choose rdf-go?

Standards Compliance: Full conformance with latest W3C RDF and JSON-LD specifications
Performance: Streaming architecture optimized for high-throughput scenarios
Simplicity: Unified API across all formats reduces learning curve
Completeness: Support for all major RDF formats in a single library
Modern Features: RDF-star support for next-generation RDF applications
Production Quality: Comprehensive testing, error handling, and documentation

Whether you're building semantic web applications, data pipelines, knowledge graphs, or RDF processing tools, rdf-go provides the standards compliance, performance, and developer experience you need.

Install

go get github.com/geoknoesis/rdf-go

Documentation

📚 Full documentation available at: https://geoknoesis.github.io/rdf-go/

The documentation includes:

Getting Started guide
Concepts and API reference
How-to guides and examples
Complete API documentation

Quick Start

Parse with Auto-Detection

The easiest way to parse RDF when you don't know the format. The library automatically detects the format from the input content:

import (
    "context"
    "io"
    "github.com/geoknoesis/rdf-go"
)

// Parse with auto-detection: FormatAuto tells the library to detect the format
// The handler function is called for each statement found in the input
//
// Note: You can pass nil for context to use context.Background() as default.
// Pass an explicit context when you need cancellation or timeouts.
err := rdf.Parse(nil, reader, rdf.FormatAuto, func(s rdf.Statement) error {
    // Each statement contains S (subject), P (predicate), O (object), and G (graph)
    // For triples, G will be nil. For quads (named graphs), G will be non-nil
    fmt.Printf("Subject: %s, Predicate: %s, Object: %s\n", 
        s.S.String(), s.P.String(), s.O.String())
    
    // Check if this is a quad (has a graph name)
    if s.IsQuad() {
        fmt.Printf("  Graph: %s\n", s.G.String())
    }
    
    // Return nil to continue processing, or an error to stop
    return nil
})

if err != nil {
    // Handle parsing errors (format detection failure, parse errors, etc.)
    log.Fatal(err)
}

Parse vs Reader: When to Use Which?

The library provides two ways to read RDF data: Parse (push model) and NewReader (pull model). Understanding the difference helps you choose the right approach for your use case.

Parse (Push Model) - Recommended for Most Cases

Parse uses a push model where the library calls your handler function for each statement as it's parsed. This is simpler and more convenient for most use cases.

Characteristics:

Push model: Library pushes statements to your handler function
Simpler API: Just provide a handler function
Automatic resource management: No need to manually close readers
Context support: Built-in context cancellation support
Best for: Most common use cases, simple processing, collecting statements

When to use Parse:

Processing all statements sequentially
Collecting statements into a slice or map
Simple filtering or transformation
When you want the simplest API

Example:

// Simplest: pass nil for context (uses context.Background() automatically)
err := rdf.Parse(nil, reader, rdf.FormatAuto, func(s rdf.Statement) error {
    // Process each statement as it arrives
    return nil
})

Reader (Pull Model) - More Control

NewReader uses a pull model where you explicitly request the next statement by calling Next(). This gives you more control over the parsing process.

Characteristics:

Pull model: You pull statements when ready by calling Next()
More control: You decide when to read the next statement
Manual resource management: Must call Close() when done
Better for complex scenarios: Conditional reading, early termination, custom buffering

When to use Reader:

Need to conditionally skip statements
Want to read a specific number of statements
Need to interleave reading with other I/O operations
Implementing custom buffering or batching
More complex control flow requirements

Example:

dec, err := rdf.NewReader(reader, rdf.FormatTurtle)
defer dec.Close()
for {
    stmt, err := dec.Next()
    if err == io.EOF {
        break
    }
    // Process statement when you're ready
}

Key Differences Summary

Feature	Parse	Reader
Model	Push (handler function)	Pull (Next() method)
API Complexity	Simpler	More explicit
Resource Management	Automatic	Manual (must Close)
Control Flow	Library-driven	Your code controls
Context Support	Built-in	Via options
Best For	Most use cases	Complex scenarios

Recommendation: Start with Parse for simplicity. Use Reader when you need more control over the parsing process.

Understanding Context in Parse

The Parse function requires a context.Context parameter. Here's why and when to use different contexts:

Why Context is Required:

Cancellation: Allows you to cancel parsing mid-stream (useful for user cancellation, shutdown signals)
Timeouts: Enables setting deadlines (useful for preventing long-running operations)
Integration: Works with Go's standard context ecosystem (HTTP requests, gRPC, etc.)

When to use nil (simplest):

Simple parsing without cancellation/timeout needs
Standalone scripts or utilities
When you want parsing to run to completion
Simplest option: Just pass nil instead of context.Background()

When to use context.WithTimeout():

Parsing with a time limit
Preventing parsing from running too long
Example: ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)

When to use context.WithCancel():

User-initiated cancellation
Graceful shutdown scenarios
Example: ctx, cancel := context.WithCancel(context.Background()) then call cancel() to stop

Example with timeout:

ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
defer cancel()

err := rdf.Parse(ctx, reader, rdf.FormatAuto, func(s rdf.Statement) error {
    // Process statement
    return nil
})
if err != nil {
    if err == context.DeadlineExceeded {
        // Parsing took too long
    }
}

Example with cancellation:

ctx, cancel := context.WithCancel(context.Background())
defer cancel()

// In another goroutine or signal handler, call cancel() to stop parsing
go func() {
    time.Sleep(5 * time.Second)
    cancel() // Stop parsing after 5 seconds
}()

err := rdf.Parse(ctx, reader, rdf.FormatAuto, func(s rdf.Statement) error {
    // Process statement
    return nil
})
if err != nil {
    if err == context.Canceled {
        // Parsing was cancelled
    }
}

Note: If you don't need cancellation or timeouts, you can simply pass nil for the context parameter. The library will automatically use context.Background() for you. This is the simplest option for most use cases.

Example with nil (simplest):

err := rdf.Parse(nil, reader, rdf.FormatAuto, func(s rdf.Statement) error {
    // Process statement
    return nil
})

Decode (Pull Style)

For more control over the parsing process, use NewReader with a pull-style API. This gives you explicit control over when to read the next statement:

import (
    "io"
    "github.com/geoknoesis/rdf-go"
)

// Create a reader for Turtle format
// You can also use rdf.FormatAuto to auto-detect the format
dec, err := rdf.NewReader(reader, rdf.FormatTurtle)
if err != nil {
    // Handle initialization errors (unsupported format, etc.)
    log.Fatal(err)
}
// Always close the reader to release resources
defer dec.Close()

// Pull-style: explicitly request the next statement
for {
    stmt, err := dec.Next()
    if err == io.EOF {
        // End of input reached - normal termination
        break
    }
    if err != nil {
        // Handle parsing errors
        log.Printf("Parse error: %v", err)
        return err
    }
    
    // Process the statement
    // Use stmt.IsTriple() to check if it's a triple (G is nil)
    // Use stmt.IsQuad() to check if it's a quad (G is non-nil)
    if stmt.IsTriple() {
        fmt.Printf("Triple: %s %s %s\n", 
            stmt.S.String(), stmt.P.String(), stmt.O.String())
    } else {
        fmt.Printf("Quad: %s %s %s (graph: %s)\n",
            stmt.S.String(), stmt.P.String(), stmt.O.String(), stmt.G.String())
    }
}

Read All Statements

For small datasets, you can collect all statements into a slice using Parse:

import (
    "context"
    "github.com/geoknoesis/rdf-go"
)

// Collect all statements into a slice
var stmts []rdf.Statement
err := rdf.Parse(nil, reader, rdf.FormatAuto, func(s rdf.Statement) error {
    stmts = append(stmts, s)
    return nil
})
if err != nil {
    // Handle errors (format detection failure, parse errors, etc.)
    log.Fatal(err)
}

// Now you can process all statements
fmt.Printf("Loaded %d statements\n", len(stmts))
for i, stmt := range stmts {
    fmt.Printf("Statement %d: %s %s %s\n", 
        i+1, stmt.S.String(), stmt.P.String(), stmt.O.String())
}

Encode (Push Style)

To write RDF data, use NewWriter with a push-style API. You explicitly write each statement:

import (
    "bytes"
    "github.com/geoknoesis/rdf-go"
)

// Create a buffer to hold the encoded output
buf := &bytes.Buffer{}

// Create an writer for Turtle format
enc, err := rdf.NewWriter(buf, rdf.FormatTurtle)
if err != nil {
    // Handle initialization errors
    log.Fatal(err)
}
// Always close the writer to flush any remaining data
defer enc.Close()

// Create a statement (triple) - there are two ways:

// Option 1: Omit G field (defaults to nil for triples) - more readable!
stmt := rdf.Statement{
    S: rdf.IRI{Value: "http://example.org/s"},  // Subject: the resource
    P: rdf.IRI{Value: "http://example.org/p"},  // Predicate: the property
    O: rdf.IRI{Value: "http://example.org/o"}, // Object: the value
    // G is omitted and defaults to nil (this is a triple, not a quad)
}

// Option 2: Use the convenience function
stmt := rdf.NewTriple(
    rdf.IRI{Value: "http://example.org/s"},
    rdf.IRI{Value: "http://example.org/p"},
    rdf.IRI{Value: "http://example.org/o"},
)

// Write the statement to the writer
if err := enc.Write(stmt); err != nil {
    // Handle write errors
    log.Fatal(err)
}

// Flush any buffered data (important for some formats)
if err := enc.Flush(); err != nil {
    log.Fatal(err)
}

// The encoded RDF is now in buf
fmt.Print(buf.String())
// Output: <http://example.org/s> <http://example.org/p> <http://example.org/o> .

Write Multiple Statements

For writing multiple statements, use NewWriter with a loop:

import (
    "os"
    "github.com/geoknoesis/rdf-go"
)

// Prepare a slice of statements to write
stmts := []rdf.Statement{
    // Create triples - G can be omitted (defaults to nil)
    rdf.Statement{
        S: rdf.IRI{Value: "http://example.org/s1"},
        P: rdf.IRI{Value: "http://example.org/p1"},
        O: rdf.IRI{Value: "http://example.org/o1"},
    },
    // Or use the convenience function
    rdf.NewTriple(
        rdf.IRI{Value: "http://example.org/s2"},
        rdf.IRI{Value: "http://example.org/p2"},
        rdf.IRI{Value: "http://example.org/o2"},
    ),
}

// Write all statements to a file
file, _ := os.Create("output.ttl")
defer file.Close()

writer, err := rdf.NewWriter(file, rdf.FormatTurtle)
if err != nil {
    log.Fatal(err)
}
defer writer.Close()

for _, stmt := range stmts {
    if err := writer.Write(stmt); err != nil {
        log.Fatal(err)
    }
}
if err := writer.Flush(); err != nil {
    log.Fatal(err)
}

RDF-star

RDF-star allows you to make statements about statements using quoted triples. The library represents quoted triples using TripleTerm:

// Create a quoted triple - this represents a statement that can be used as a subject or object
quoted := rdf.TripleTerm{
    S: rdf.IRI{Value: "http://example.org/alice"},  // Subject of the quoted triple
    P: rdf.IRI{Value: "http://example.org/said"},  // Predicate of the quoted triple
    O: rdf.Literal{Lexical: "Hello"},                // Object of the quoted triple
}

// Use the quoted triple as a subject in a new statement
// This says: "The statement 'Alice said Hello' is asserted to be true"
stmt := rdf.Statement{
    S: quoted,  // Subject is the quoted triple (RDF-star feature)
    P: rdf.IRI{Value: "http://example.org/asserted"}, // Predicate: "is asserted"
    O: rdf.Literal{Lexical: "true"},                 // Object: true
    // G omitted - defaults to nil (this is a triple, not a quad)
}

// You can encode this to Turtle format, which supports RDF-star
enc, _ := rdf.NewWriter(&buf, rdf.FormatTurtle)
enc.Write(stmt)
// Output: <<http://example.org/alice http://example.org/said "Hello">> 
//          <http://example.org/asserted> "true" .

IRI Validation

The library provides optional strict IRI validation according to RFC 3987:

import "github.com/geoknoesis/rdf-go"

// Enable strict IRI validation
dec, err := rdf.NewReader(reader, rdf.FormatTurtle, rdf.OptStrictIRIValidation())
if err != nil {
    return err
}
defer dec.Close()

// Or validate IRIs programmatically
iri := "http://example.org/resource"
if err := rdf.ValidateIRI(iri); err != nil {
    // Handle invalid IRI
    return fmt.Errorf("invalid IRI: %w", err)
}

Note: By default, IRI validation is lenient (no validation) for backward compatibility. Format-specific behavior:

N-Triples: Always validates that IRIs have a scheme (absolute IRIs required per spec)
Turtle/TriG: Allows relative IRIs with base resolution; no validation by default
RDF/XML: Allows relative IRIs with base resolution; no validation by default
JSON-LD: No validation by default

Enable OptStrictIRIValidation() for additional RFC 3987 validation across all formats.

Error Handling

The library follows Go's standard error handling patterns. Always check for io.EOF to detect end of input:

import (
    "errors"
    "io"
    "github.com/geoknoesis/rdf-go"
)

dec, err := rdf.NewReader(reader, rdf.FormatTurtle)
if err != nil {
    // Handle initialization errors (unsupported format, etc.)
    return err
}
defer dec.Close()

for {
    stmt, err := dec.Next()
    if err == io.EOF {
        // End of input reached - this is normal, not an error
        break
    }
    if err != nil {
        // Check if it's a parse error with position information
        var parseErr *rdf.ParseError
        if errors.As(err, &parseErr) {
            // ParseError includes detailed position information
            fmt.Printf("Parse error at line %d, column %d: %v\n", 
                parseErr.Line, parseErr.Column, parseErr.Err)
            // The error message also includes input excerpts with caret indicators
        } else {
            // Other errors (I/O errors, context cancellation, etc.)
            fmt.Printf("Error: %v\n", err)
        }
        return err
    }
    
    // Successfully read a statement - process it
    // Use stmt.IsTriple() or stmt.IsQuad() to check the statement type
    processStatement(stmt)
}

Error messages automatically include:

Position information (line:column or offset)
Input excerpts showing context around the error
Caret indicators pointing to the error position

Format Selection

The library uses a unified Format type for all RDF serialization formats. You can either use format constants or parse format strings:

// Option 1: Parse format from a string (useful for user input or file extensions)
format, ok := rdf.ParseFormat("ttl")  // Returns FormatTurtle
if !ok {
    // Format string not recognized
    return fmt.Errorf("unknown format: %s", "ttl")
}

// Option 2: Use format constants directly (recommended for known formats)
dec, err := rdf.NewReader(reader, rdf.FormatTurtle)

// Option 3: Use auto-detection (library detects format from input)
dec, err := rdf.NewReader(reader, rdf.FormatAuto)

Supported Formats

Triple formats:

rdf.FormatTurtle - Turtle (.ttl)
rdf.FormatNTriples - N-Triples (.nt)
rdf.FormatRDFXML - RDF/XML (.rdf, .xml)
rdf.FormatJSONLD - JSON-LD (.jsonld)

Quad formats:

rdf.FormatTriG - TriG (.trig)
rdf.FormatNQuads - N-Quads (.nq)

Auto-detection:

rdf.FormatAuto - Automatically detect format from input

Options

Configure reader/writer behavior using functional options. Options are applied in order and can be combined:

import (
    "context"
    "time"
    "github.com/geoknoesis/rdf-go"
)

// Create a context with timeout for cancellation
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()

// Configure reader with multiple options
dec, err := rdf.NewReader(reader, rdf.FormatTurtle,
    // Security: Apply safe limits for untrusted input
    // This sets reasonable defaults to prevent resource exhaustion attacks
    rdf.OptSafeLimits(),
    
    // Limit nesting depth (for collections, blank node lists, etc.)
    rdf.OptMaxDepth(50),
    
    // Set context for cancellation and timeouts
    rdf.OptContext(ctx),
    
    // Limit maximum line size (64KB)
    rdf.OptMaxLineBytes(64<<10),
    
    // Limit maximum statement size (256KB)
    rdf.OptMaxStatementBytes(256<<10),
    
    // Limit total number of statements to process
    rdf.OptMaxTriples(1_000_000),
)
if err != nil {
    return err
}

Available options:

OptContext(ctx) - Set context for cancellation and timeouts
OptMaxLineBytes(n) - Set maximum line size limit
OptMaxStatementBytes(n) - Set maximum statement size limit
OptMaxDepth(n) - Set maximum nesting depth limit
OptMaxTriples(n) - Set maximum number of triples/quads to process
OptSafeLimits() - Apply safe limits suitable for untrusted input
OptStrictIRIValidation() - Enable strict IRI validation according to RFC 3987
OptExpandRDFXMLContainers() - Enable RDF/XML container membership expansion (default: enabled)
OptDisableRDFXMLContainerExpansion() - Disable RDF/XML container membership expansion

Versioning & Compatibility

This library follows Semantic Versioning:

v1.x.x: Backward compatible changes only (new features, bug fixes)
v2.x.x: Breaking changes (if needed in the future)

API Stability

The following APIs are considered stable and will maintain backward compatibility:

Reader and Writer interfaces
Statement, Triple, Quad types
Term interface and implementations (IRI, BlankNode, Literal, TripleTerm)
NewReader(), NewWriter(), Parse() functions
Format constants (FormatTurtle, FormatNTriples, etc.)
Option functions (OptMaxDepth, OptSafeLimits, etc.)

Deprecation Policy

Deprecated APIs will be marked with // Deprecated: comments
Deprecated APIs will be removed in the next major version
At least one minor version will include deprecation warnings before removal

Go Version Support

Minimum Go version: 1.25.5
The library uses pure Go (no CGO dependencies)
Compatible with all Go versions that support the minimum version

Notes

The API is intentionally small and favors streaming. For large inputs, use NewReader or Parse for efficient processing.
All formats work with the unified Reader and Writer interfaces.
The Statement type represents either a triple (G is nil) or a quad (G is non-nil).
Use stmt.IsTriple() or stmt.IsQuad() to check the statement type.
For any unsupported format, NewReader/NewWriter returns rdf.ErrUnsupportedFormat.
RDF/XML container elements (rdf:Bag, rdf:Seq, rdf:Alt, rdf:List) support container membership expansion. By default, rdf:li elements are automatically converted to rdf:_1, rdf:_2, etc. Use OptDisableRDFXMLContainerExpansion() to disable this behavior.

Security and Limits

For Untrusted Input

Always set explicit security limits when processing untrusted input to prevent resource exhaustion attacks.

// Use OptSafeLimits for untrusted input
dec, err := rdf.NewReader(r, rdf.FormatTurtle, rdf.OptSafeLimits())

Or set custom limits:

dec, err := rdf.NewReader(r, rdf.FormatTurtle,
    rdf.OptMaxLineBytes(64<<10),      // 64KB per line
    rdf.OptMaxStatementBytes(256<<10), // 256KB per statement
    rdf.OptMaxDepth(50),              // 50 levels of nesting
    rdf.OptMaxTriples(1_000_000),      // 1M triples max
    rdf.OptContext(ctx),               // For cancellation/timeouts
)

Security Limits

The following limits are available via options:

MaxLineBytes: Maximum size of a single line (default: 1MB)
MaxStatementBytes: Maximum size of a complete statement (default: 4MB)
MaxDepth: Maximum nesting depth for collections, blank node lists, etc. (default: 100)
MaxTriples: Maximum number of triples/quads to process (default: 10M)
Context: Context for cancellation and timeouts

Default limits are suitable for trusted input only. For untrusted input, use SafeDecodeOptions() or set stricter limits.

Error Diagnostics

Errors include line and column information, along with input excerpts for better debugging:

stmt, err := dec.Next()
if err != nil {
    var parseErr *rdf.ParseError
    if errors.As(err, &parseErr) {
        fmt.Printf("Error at line %d, column %d: %v\n", 
            parseErr.Line, parseErr.Column, parseErr.Err)
        // Error messages automatically include input excerpts with caret indicators
        // Example output:
        // turtle:3:15: unexpected token
        //   ex:s ex:p ex:o .
        //            ^
    }
}

Error messages automatically include:

Position information (line:column or offset)
Input excerpts showing context around the error
Caret indicators pointing to the error position

Error Codes

For programmatic error handling, use the Code() function to get error codes:

import "github.com/geoknoesis/rdf-go"

stmt, err := dec.Next()
if err != nil {
    code := rdf.Code(err)
    switch code {
    case rdf.ErrCodeUnsupportedFormat:
        // Handle unsupported format
    case rdf.ErrCodeLineTooLong:
        // Handle line too long
    case rdf.ErrCodeStatementTooLong:
        // Handle statement too long
    case rdf.ErrCodeDepthExceeded:
        // Handle depth exceeded
    case rdf.ErrCodeTripleLimitExceeded:
        // Handle triple limit exceeded
    case rdf.ErrCodeContextCanceled:
        // Handle context cancellation
    case rdf.ErrCodeParseError:
        // Handle general parse error
    default:
        // Handle unknown error
    }
}

Available Error Codes:

ErrCodeUnsupportedFormat - Unsupported RDF format
ErrCodeLineTooLong - Line exceeded configured limit
ErrCodeStatementTooLong - Statement exceeded configured limit
ErrCodeDepthExceeded - Nesting depth exceeded configured limit
ErrCodeTripleLimitExceeded - Maximum number of triples/quads exceeded
ErrCodeParseError - General parse error
ErrCodeContextCanceled - Context was canceled
ErrCodeInvalidIRI - Invalid IRI encountered
ErrCodeInvalidLiteral - Invalid literal encountered

Note: Code() returns an empty string for nil errors and io.EOF (which is not an error condition).

JSON-LD Options

JSON-LD decoding supports additional semantic limits via JSONLDOptions:

jsonldOpts := rdf.JSONLDOptions{
    Context:       ctx,
    MaxInputBytes: 1 << 20,
    MaxNodes:      10000,
    MaxGraphItems: 10000,
    MaxQuads:      20000,
}
// JSON-LD options are used internally when FormatJSONLD is specified
dec, err := rdf.NewReader(r, rdf.FormatJSONLD)

Note: JSON-LD supports streaming parsing using json.Reader token-by-token processing. The reader processes nodes incrementally and emits triples/quads as they are parsed. However, there are some limitations:

When @graph appears before @context in the JSON structure, the graph items must be buffered until the context is available (this is necessary for correct term expansion)
Nested objects and arrays are fully decoded into memory (this is required for JSON-LD context processing and term expansion)
Remote context resolution: The streaming reader supports remote context URLs when a DocumentLoader is provided in JSONLDOptions. If @context is a string URL, it will be loaded via the DocumentLoader before processing.
For very large documents with @graph before @context, consider reordering the JSON structure to place @context first, or use other RDF formats (Turtle, N-Triples, TriG, N-Quads) which have more efficient streaming characteristics

Output Determinism

The library provides deterministic output for most formats, with some format-specific considerations:

Deterministic Formats

Turtle/TriG:

Prefix declarations are sorted alphabetically (deterministic order)
Statement order matches input order
Prefix selection uses longest matching namespace (deterministic algorithm)
Blank node labels are preserved from input

N-Triples/N-Quads:

Output order matches input order exactly
No prefix abbreviations (fully expanded IRIs)
Fully deterministic output

RDF/XML:

XML structure is deterministic
Element order matches input order

Non-Deterministic Formats

JSON-LD:

⚠️ Key ordering is non-deterministic due to Go's map iteration order
JSON object keys may appear in different orders across runs
This is a limitation of Go's encoding/json package
For deterministic JSON-LD output, use the CanonicalizeJSONLD() function:

import "github.com/geoknoesis/rdf-go"

// Encode to JSON-LD
var buf bytes.Buffer
enc, _ := rdf.NewWriter(&buf, rdf.FormatJSONLD)
enc.Write(stmt)
enc.Close()

// Canonicalize for deterministic output
canonical, err := rdf.CanonicalizeJSONLD(buf.Bytes())
if err != nil {
    return err
}
// canonical now contains deterministic JSON-LD

Best Practices

For reproducible builds, use Turtle, TriG, N-Triples, or N-Quads
If JSON-LD determinism is required, post-process with a JSON canonicalizer
Round-trip tests verify semantic equivalence (isomorphic graphs) rather than byte-for-byte equality

Supported Features Matrix

Feature	Turtle	TriG	N-Triples	N-Quads	RDF/XML	JSON-LD
Parsing	✅	✅	✅	✅	✅	✅
Encoding	✅	✅	✅	✅	✅	✅
Named Graphs	❌	✅	❌	✅	❌	✅
Prefixes	✅	✅	❌	❌	✅	✅
Base IRI	✅	✅	❌	❌	✅	✅
Blank Nodes	✅	✅	✅	✅	✅	✅
Collections	✅	✅	❌	❌	✅	✅
RDF-star	✅	✅	✅	✅	❌	❌
Streaming	✅	✅	✅	✅	✅	⚠️*
Deterministic Output	✅	✅	✅	✅	✅	❌

* JSON-LD streaming has limitations: buffers @graph when it appears before @context

Format-Specific Notes

RDF/XML: Container membership expansion is implemented and enabled by default. Use OptDisableRDFXMLContainerExpansion() to disable automatic expansion of rdf:li to rdf:_n.
JSON-LD: Compaction and framing are not supported (I/O library only)
JSON-LD: Remote context resolution supported when DocumentLoader is provided

Performance

The library is optimized for performance with:

Streaming architecture to minimize memory usage
Low allocation patterns using strings.Builder and buffer reuse
Efficient string operations and parsing
Comprehensive benchmarks available in rdf/benchmarks_test.go

Running Benchmarks

Run benchmarks:

go test ./rdf -bench=. -benchmem -run=^$

Benchmark Results

Benchmark results vary by system and input size. Key benchmarks include:

BenchmarkTurtleDecodeLarge - Large Turtle file decoding
BenchmarkNTriplesDecodeLarge - Large N-Triples file decoding
BenchmarkTriGDecode - TriG format decoding
BenchmarkJSONLDDecode - JSON-LD format decoding
BenchmarkTurtleEncode - Turtle encoding
BenchmarkNTriplesEncodeLarge - N-Triples encoding
BenchmarkUnescapeString - String unescaping performance
BenchmarkResolveIRI - IRI resolution performance
BenchmarkFormatDetection - Format detection performance

Performance Characteristics:

Streaming: All formats support streaming with constant memory usage (except JSON-LD edge cases)
Throughput: Typically 10K-100K+ triples/second depending on format and input complexity
Memory: O(1) memory usage for streaming parsers (bounded by security limits)
Allocations: Optimized to minimize allocations using buffer reuse and strings.Builder

For detailed performance analysis, run benchmarks on your target system and input data.

💝 Support & Sponsorship

rdf-go is open-source and free to use, but maintaining and evolving it requires ongoing effort.

If rdf-go is valuable to you or your organization, your financial support helps ensure:

✅ Continued maintenance - Bug fixes, security updates, and compatibility with new Go versions and RDF standards
✅ Feature development - New capabilities and improvements based on community needs
✅ Custom adaptations - Priority consideration for features that align with your specific requirements
✅ Long-term sustainability - Keeping the project active and well-maintained for the community

Ways to support:

💰 GitHub Sponsors - Monthly or one-time sponsorship
☕ Ko-fi - One-time donations
🏢 Enterprise Support - For organizations needing priority support, custom features, or commercial licensing: stephanef@geoknoesis.com
🌟 Star the repository - Help others discover rdf-go on GitHub