rdf-go
January 24, 2026
rdf-go is a small, fast RDF parsing/encoding library with streaming
APIs and RDF-star support. It is designed for low allocations and for use
in pipelines where RDF data should be processed incrementally.
About
rdf-go is developed by GeoKnoesis LLC, a company specializing in semantic web technologies and knowledge engineering.
Main Developer
- Stephane Fellah - Principal Developer
- Contact: stephanef@geoknoesis.com
- Geosemantic-AI expert with 30 years of experience
Features
- Streaming readers (pull style) and writers (push style).
- Unified API: single Reader and Writer interfaces for all formats.
- Statement type: represents either a triple (G is nil) or a quad (G is non-nil).
- Convenience helper: Parse for streaming with handler functions.
- RDF-star via TripleTerm values.
- Multiple formats: Turtle, TriG, N-Triples, N-Quads, RDF/XML, JSON-LD.
- Automatic format detection with FormatAuto.
What Makes This Library Unique
rdf-go stands out as a comprehensive, standards-compliant RDF library for Go with several distinguishing characteristics:
Standards Conformance
Full W3C Standards Support:
- RDF 1.1 (W3C Recommendation, 2014): Complete conformance with RDF 1.1 Concepts, RDF 1.1 Turtle, RDF 1.1 N-Triples, RDF 1.1 TriG, and RDF 1.1 N-Quads specifications
- RDF 1.2 (W3C Working Draft as of 2024; not yet a Recommendation at that time): Support for RDF 1.2 features including quoted triples (RDF-star) in Turtle, TriG, N-Triples, and N-Quads formats
- JSON-LD 1.1 (W3C Recommendation, 2020): Full support for JSON-LD 1.1 Processing Algorithms, including remote context resolution, expansion, and RDF conversion
- Turtle 1.1 & 1.2: Complete Turtle syntax support including collections, blank nodes, prefixes, base IRIs, and RDF-star quoted triples
- RDF/XML 1.0: Full RDF/XML support with container membership expansion (rdf:Bag, rdf:Seq, rdf:Alt, rdf:List)
W3C Test Suite Compliance:
- Passes official W3C RDF test suites for RDF 1.1 and RDF 1.2
- Validated against W3C JSON-LD test suite (both 1.0 and 1.1)
- Comprehensive compliance testing via TestW3CConformance() with support for manifest-based test execution
Unique Architecture
Unified API Design:
- Single Reader and Writer interface for all 6 supported formats (Turtle, TriG, N-Triples, N-Quads, RDF/XML, JSON-LD)
- No format-specific APIs to learn; the same code works across all formats
- Automatic format detection eliminates the need to specify format explicitly
Streaming-First Architecture:
- True streaming parsers with O(1) memory usage (bounded only by security limits)
- Pull-style readers for explicit control over parsing flow
- Push-style writers for efficient encoding
- Designed for processing large datasets that don't fit in memory
RDF-star (Quoted Triples) Support:
- Native support for RDF-star quoted triples via the TripleTerm type
- Works seamlessly across Turtle, TriG, N-Triples, and N-Quads formats
- Enables making statements about statements, a key feature for provenance, annotations, and reification
Performance & Efficiency
Optimized for Production:
- Low-allocation design using strings.Builder and buffer reuse
- Streaming architecture minimizes memory footprint
- Typically processes 10K-100K+ triples/second depending on format
- Comprehensive benchmark suite for performance regression testing
Security & Limits:
- Built-in security limits for untrusted input (max depth, max line bytes, max statement bytes)
- OptSafeLimits() for conservative defaults suitable for untrusted data
- Structured error codes for programmatic error handling and recovery
Developer Experience
Simple, Intuitive API:
- Optional context parameter (defaults to context.Background() when nil)
- Convenient Parse() function for common use cases
- Clear separation between triples and quads via IsTriple() and IsQuad() methods
- Comprehensive error handling with structured error codes
Production-Ready:
- Extensive test coverage (70%+)
- Performance regression tests ensure consistent performance
- Comprehensive error handling documentation
- Well-documented with examples for all major use cases
Format Coverage
Complete Format Support:
- Triple formats: Turtle, N-Triples, RDF/XML, JSON-LD
- Quad formats: TriG, N-Quads, JSON-LD (with named graphs)
- All formats support: Parsing ✅, Encoding ✅, Blank Nodes ✅
- RDF-star support: Turtle ✅, TriG ✅, N-Triples ✅, N-Quads ✅
Advanced Features:
- RDF/XML container membership expansion (rdf:li → rdf:_n)
- JSON-LD remote context resolution via DocumentLoader
- JSON-LD 1.1 features (processing modes, RDF direction, native types)
- Deterministic output for Turtle, TriG, N-Triples, N-Quads, and RDF/XML
Why Choose rdf-go?
- Standards Compliance: Full conformance with latest W3C RDF and JSON-LD specifications
- Performance: Streaming architecture optimized for high-throughput scenarios
- Simplicity: Unified API across all formats reduces learning curve
- Completeness: Support for all major RDF formats in a single library
- Modern Features: RDF-star support for next-generation RDF applications
- Production Quality: Comprehensive testing, error handling, and documentation
Whether you're building semantic web applications, data pipelines, knowledge graphs, or RDF processing tools, rdf-go provides the standards compliance, performance, and developer experience you need.
Install
go get github.com/geoknoesis/rdf-go
Documentation
📚 Full documentation available at: https://geoknoesis.github.io/rdf-go/
The documentation includes:
- Getting Started guide
- Concepts and API reference
- How-to guides and examples
- Complete API documentation
Quick Start
Parse with Auto-Detection
The easiest way to parse RDF when you don't know the format. The library automatically detects the format from the input content:
import (
	"fmt"
	"log"

	"github.com/geoknoesis/rdf-go"
)

// Parse with auto-detection: FormatAuto tells the library to detect the format.
// The handler function is called for each statement found in the input.
//
// Note: You can pass nil for context to use context.Background() as default.
// Pass an explicit context when you need cancellation or timeouts.
err := rdf.Parse(nil, reader, rdf.FormatAuto, func(s rdf.Statement) error {
	// Each statement contains S (subject), P (predicate), O (object), and G (graph).
	// For triples, G will be nil. For quads (named graphs), G will be non-nil.
	fmt.Printf("Subject: %s, Predicate: %s, Object: %s\n",
		s.S.String(), s.P.String(), s.O.String())
	// Check if this is a quad (has a graph name)
	if s.IsQuad() {
		fmt.Printf("  Graph: %s\n", s.G.String())
	}
	// Return nil to continue processing, or an error to stop
	return nil
})
if err != nil {
	// Handle parsing errors (format detection failure, parse errors, etc.)
	log.Fatal(err)
}
Parse vs Reader: When to Use Which?
The library provides two ways to read RDF data: Parse (push model) and NewReader (pull model). Understanding the difference helps you choose the right approach for your use case.
Parse (Push Model) - Recommended for Most Cases
Parse uses a push model where the library calls your handler function for each statement as it's parsed. This is simpler and more convenient for most use cases.
Characteristics:
- Push model: Library pushes statements to your handler function
- Simpler API: Just provide a handler function
- Automatic resource management: No need to manually close readers
- Context support: Built-in context cancellation support
- Best for: Most common use cases, simple processing, collecting statements
When to use Parse:
- Processing all statements sequentially
- Collecting statements into a slice or map
- Simple filtering or transformation
- When you want the simplest API
Example:
// Simplest: pass nil for context (uses context.Background() automatically)
err := rdf.Parse(nil, reader, rdf.FormatAuto, func(s rdf.Statement) error {
// Process each statement as it arrives
return nil
})
Reader (Pull Model) - More Control
NewReader uses a pull model where you explicitly request the next statement by calling Next(). This gives you more control over the parsing process.
Characteristics:
- Pull model: You pull statements when ready by calling Next()
- More control: You decide when to read the next statement
- Manual resource management: Must call Close() when done
- Better for complex scenarios: Conditional reading, early termination, custom buffering
When to use Reader:
- Need to conditionally skip statements
- Want to read a specific number of statements
- Need to interleave reading with other I/O operations
- Implementing custom buffering or batching
- More complex control flow requirements
Example:
dec, err := rdf.NewReader(reader, rdf.FormatTurtle)
if err != nil {
	return err
}
defer dec.Close()
for {
	stmt, err := dec.Next()
	if err == io.EOF {
		break // End of input
	}
	if err != nil {
		return err
	}
	// Process statement when you're ready
	_ = stmt
}
Key Differences Summary
| Feature | Parse | Reader |
|---|---|---|
| Model | Push (handler function) | Pull (Next() method) |
| API Complexity | Simpler | More explicit |
| Resource Management | Automatic | Manual (must Close) |
| Control Flow | Library-driven | Your code controls |
| Context Support | Built-in | Via options |
| Best For | Most use cases | Complex scenarios |
Recommendation: Start with Parse for simplicity. Use Reader when you need more control over the parsing process.
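The relationship between the two models can be sketched with the standard library alone: a push-style function is essentially a loop around a pull-style source. The names below (stmtSource, sliceSource, parseAll, collect) are hypothetical stand-ins rather than rdf-go types, and statements are plain strings to keep the sketch self-contained.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// stmtSource is a hypothetical pull-style source: each call to Next returns
// one statement, or io.EOF when the input is exhausted.
type stmtSource interface {
	Next() (string, error)
}

// sliceSource serves statements from an in-memory slice.
type sliceSource struct {
	stmts []string
	pos   int
}

func (s *sliceSource) Next() (string, error) {
	if s.pos >= len(s.stmts) {
		return "", io.EOF
	}
	stmt := s.stmts[s.pos]
	s.pos++
	return stmt, nil
}

// parseAll is a push-style wrapper: it pulls until io.EOF and hands every
// statement to the handler. A handler error stops the loop early.
func parseAll(src stmtSource, handler func(string) error) error {
	for {
		stmt, err := src.Next()
		if errors.Is(err, io.EOF) {
			return nil // normal end of input
		}
		if err != nil {
			return err
		}
		if err := handler(stmt); err != nil {
			return err
		}
	}
}

// collect gathers every statement from a fresh source using the push wrapper.
func collect(stmts []string) ([]string, error) {
	var out []string
	err := parseAll(&sliceSource{stmts: stmts}, func(s string) error {
		out = append(out, s)
		return nil
	})
	return out, err
}

func main() {
	got, err := collect([]string{"<s1> <p> <o> .", "<s2> <p> <o> ."})
	fmt.Println(len(got), err)
}
```

Seen this way, choosing the pull model mostly means writing the loop yourself, which is where conditional skipping, batching, or early exit can be added.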
Understanding Context in Parse
The Parse function takes a context.Context as its first parameter. Passing nil is allowed and is treated as context.Background(). Here's why the parameter exists and when to use different contexts:
Why Parse Takes a Context:
- Cancellation: Allows you to cancel parsing mid-stream (useful for user cancellation, shutdown signals)
- Timeouts: Enables setting deadlines (useful for preventing long-running operations)
- Integration: Works with Go's standard context ecosystem (HTTP requests, gRPC, etc.)
When to use nil (simplest):
- Simple parsing without cancellation/timeout needs
- Standalone scripts or utilities
- When you want parsing to run to completion
- Simplest option: Just pass nil instead of context.Background()
When to use context.WithTimeout():
- Parsing with a time limit
- Preventing parsing from running too long
- Example: ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
When to use context.WithCancel():
- User-initiated cancellation
- Graceful shutdown scenarios
- Example: ctx, cancel := context.WithCancel(context.Background()), then call cancel() to stop
Example with timeout:
ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
defer cancel()
err := rdf.Parse(ctx, reader, rdf.FormatAuto, func(s rdf.Statement) error {
	// Process statement
	return nil
})
if err != nil {
	// errors.Is also matches if the context error has been wrapped
	if errors.Is(err, context.DeadlineExceeded) {
		// Parsing took too long
	}
}
Example with cancellation:
ctx, cancel := context.WithCancel(context.Background())
defer cancel()
// In another goroutine or signal handler, call cancel() to stop parsing
go func() {
	time.Sleep(5 * time.Second)
	cancel() // Stop parsing after 5 seconds
}()
err := rdf.Parse(ctx, reader, rdf.FormatAuto, func(s rdf.Statement) error {
	// Process statement
	return nil
})
if err != nil {
	// errors.Is also matches if the context error has been wrapped
	if errors.Is(err, context.Canceled) {
		// Parsing was cancelled
	}
}
Note: If you don't need cancellation or timeouts, you can simply pass nil for the context parameter. The library will automatically use context.Background() for you. This is the simplest option for most use cases.
Example with nil (simplest):
err := rdf.Parse(nil, reader, rdf.FormatAuto, func(s rdf.Statement) error {
// Process statement
return nil
})
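The nil-context convention and mid-stream cancellation described in this section can be illustrated without the library. The following is a minimal sketch, assuming a loop that checks the context before each item; processAll and runWithCancelAfter are hypothetical names, and rdf-go's internals may differ.

```go
package main

import (
	"context"
	"fmt"
)

// processAll is a hypothetical stand-in for a context-aware parse loop.
// A nil ctx is replaced with context.Background(), and ctx.Err() is checked
// before each item so that cancellation stops the loop promptly.
func processAll(ctx context.Context, items []string, handler func(string) error) (int, error) {
	if ctx == nil {
		ctx = context.Background()
	}
	done := 0
	for _, item := range items {
		if err := ctx.Err(); err != nil {
			return done, err // context.Canceled or context.DeadlineExceeded
		}
		if err := handler(item); err != nil {
			return done, err
		}
		done++
	}
	return done, nil
}

// runWithCancelAfter cancels the context once n items have been handled and
// reports how many items were processed before the loop noticed.
func runWithCancelAfter(n int, items []string) (int, error) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	seen := 0
	return processAll(ctx, items, func(string) error {
		seen++
		if seen == n {
			cancel()
		}
		return nil
	})
}

func main() {
	items := []string{"a", "b", "c", "d"}
	n, err := processAll(nil, items, func(string) error { return nil })
	fmt.Println(n, err)
	n, err = runWithCancelAfter(2, items)
	fmt.Println(n, err)
}
```

Passing a nil context is unusual in Go code generally, so linters may flag it; context.Background() expresses the same intent explicitly.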
Decode (Pull Style)
For more control over the parsing process, use NewReader with a pull-style API. This gives you explicit control over when to read the next statement:
import (
	"fmt"
	"io"
	"log"

	"github.com/geoknoesis/rdf-go"
)
// Create a reader for Turtle format
// You can also use rdf.FormatAuto to auto-detect the format
dec, err := rdf.NewReader(reader, rdf.FormatTurtle)
if err != nil {
// Handle initialization errors (unsupported format, etc.)
log.Fatal(err)
}
// Always close the reader to release resources
defer dec.Close()
// Pull-style: explicitly request the next statement
for {
stmt, err := dec.Next()
if err == io.EOF {
// End of input reached - normal termination
break
}
if err != nil {
// Handle parsing errors
log.Printf("Parse error: %v", err)
return err
}
// Process the statement
// Use stmt.IsTriple() to check if it's a triple (G is nil)
// Use stmt.IsQuad() to check if it's a quad (G is non-nil)
if stmt.IsTriple() {
fmt.Printf("Triple: %s %s %s\n",
stmt.S.String(), stmt.P.String(), stmt.O.String())
} else {
fmt.Printf("Quad: %s %s %s (graph: %s)\n",
stmt.S.String(), stmt.P.String(), stmt.O.String(), stmt.G.String())
}
}
Read All Statements
For small datasets, you can collect all statements into a slice using Parse:
import (
	"fmt"
	"log"

	"github.com/geoknoesis/rdf-go"
)
// Collect all statements into a slice
var stmts []rdf.Statement
err := rdf.Parse(nil, reader, rdf.FormatAuto, func(s rdf.Statement) error {
stmts = append(stmts, s)
return nil
})
if err != nil {
// Handle errors (format detection failure, parse errors, etc.)
log.Fatal(err)
}
// Now you can process all statements
fmt.Printf("Loaded %d statements\n", len(stmts))
for i, stmt := range stmts {
fmt.Printf("Statement %d: %s %s %s\n",
i+1, stmt.S.String(), stmt.P.String(), stmt.O.String())
}
Encode (Push Style)
To write RDF data, use NewWriter with a push-style API. You explicitly write each statement:
import (
	"bytes"
	"fmt"
	"log"

	"github.com/geoknoesis/rdf-go"
)
// Create a buffer to hold the encoded output
buf := &bytes.Buffer{}
// Create a writer for Turtle format
enc, err := rdf.NewWriter(buf, rdf.FormatTurtle)
if err != nil {
// Handle initialization errors
log.Fatal(err)
}
// Always close the writer to flush any remaining data
defer enc.Close()
// Create a statement (triple) - there are two ways:
// Option 1: Omit G field (defaults to nil for triples) - more readable!
stmt := rdf.Statement{
S: rdf.IRI{Value: "http://example.org/s"}, // Subject: the resource
P: rdf.IRI{Value: "http://example.org/p"}, // Predicate: the property
O: rdf.IRI{Value: "http://example.org/o"}, // Object: the value
// G is omitted and defaults to nil (this is a triple, not a quad)
}
// Option 2: Use the convenience function (assigns to the variable declared above)
stmt = rdf.NewTriple(
rdf.IRI{Value: "http://example.org/s"},
rdf.IRI{Value: "http://example.org/p"},
rdf.IRI{Value: "http://example.org/o"},
)
// Write the statement to the writer
if err := enc.Write(stmt); err != nil {
// Handle write errors
log.Fatal(err)
}
// Flush any buffered data (important for some formats)
if err := enc.Flush(); err != nil {
log.Fatal(err)
}
// The encoded RDF is now in buf
fmt.Print(buf.String())
// Output: <http://example.org/s> <http://example.org/p> <http://example.org/o> .
Write Multiple Statements
For writing multiple statements, use NewWriter with a loop:
import (
	"log"
	"os"

	"github.com/geoknoesis/rdf-go"
)
// Prepare a slice of statements to write
stmts := []rdf.Statement{
// Create triples - G can be omitted (defaults to nil)
rdf.Statement{
S: rdf.IRI{Value: "http://example.org/s1"},
P: rdf.IRI{Value: "http://example.org/p1"},
O: rdf.IRI{Value: "http://example.org/o1"},
},
// Or use the convenience function
rdf.NewTriple(
rdf.IRI{Value: "http://example.org/s2"},
rdf.IRI{Value: "http://example.org/p2"},
rdf.IRI{Value: "http://example.org/o2"},
),
}
// Write all statements to a file
file, err := os.Create("output.ttl")
if err != nil {
	log.Fatal(err)
}
defer file.Close()
writer, err := rdf.NewWriter(file, rdf.FormatTurtle)
if err != nil {
log.Fatal(err)
}
defer writer.Close()
for _, stmt := range stmts {
if err := writer.Write(stmt); err != nil {
log.Fatal(err)
}
}
if err := writer.Flush(); err != nil {
log.Fatal(err)
}
RDF-star
RDF-star allows you to make statements about statements using quoted triples. The library represents quoted triples using TripleTerm:
// Create a quoted triple - this represents a statement that can be used as a subject or object
quoted := rdf.TripleTerm{
S: rdf.IRI{Value: "http://example.org/alice"}, // Subject of the quoted triple
P: rdf.IRI{Value: "http://example.org/said"}, // Predicate of the quoted triple
O: rdf.Literal{Lexical: "Hello"}, // Object of the quoted triple
}
// Use the quoted triple as a subject in a new statement
// This says: "The statement 'Alice said Hello' is asserted to be true"
stmt := rdf.Statement{
S: quoted, // Subject is the quoted triple (RDF-star feature)
P: rdf.IRI{Value: "http://example.org/asserted"}, // Predicate: "is asserted"
O: rdf.Literal{Lexical: "true"}, // Object: true
// G omitted - defaults to nil (this is a triple, not a quad)
}
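// You can encode this to Turtle format, which supports RDF-star
var buf bytes.Buffer
enc, err := rdf.NewWriter(&buf, rdf.FormatTurtle)
if err != nil {
	log.Fatal(err)
}
defer enc.Close()
if err := enc.Write(stmt); err != nil {
	log.Fatal(err)
}
// Output (exact whitespace may differ):
// << <http://example.org/alice> <http://example.org/said> "Hello" >>
//     <http://example.org/asserted> "true" .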
IRI Validation
The library provides optional strict IRI validation according to RFC 3987:
import (
	"fmt"

	"github.com/geoknoesis/rdf-go"
)
// Enable strict IRI validation
dec, err := rdf.NewReader(reader, rdf.FormatTurtle, rdf.OptStrictIRIValidation())
if err != nil {
return err
}
defer dec.Close()
// Or validate IRIs programmatically
iri := "http://example.org/resource"
if err := rdf.ValidateIRI(iri); err != nil {
// Handle invalid IRI
return fmt.Errorf("invalid IRI: %w", err)
}
Note: By default, IRI validation is lenient (no validation) for backward compatibility. Format-specific behavior:
- N-Triples: Always validates that IRIs have a scheme (absolute IRIs required per spec)
- Turtle/TriG: Allows relative IRIs with base resolution; no validation by default
- RDF/XML: Allows relative IRIs with base resolution; no validation by default
- JSON-LD: No validation by default
Enable OptStrictIRIValidation() for additional RFC 3987 validation across all formats.
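The "has a scheme" check mentioned for N-Triples can be approximated with the standard library. This is only a sketch: net/url follows URI syntax (RFC 3986) rather than the full IRI grammar of RFC 3987, and hasScheme is a hypothetical helper, not part of rdf-go.

```go
package main

import (
	"fmt"
	"net/url"
)

// hasScheme reports whether s parses as a reference with a scheme,
// i.e. whether it is absolute rather than relative.
func hasScheme(s string) bool {
	u, err := url.Parse(s)
	if err != nil {
		return false
	}
	return u.IsAbs()
}

func main() {
	for _, s := range []string{
		"http://example.org/resource",
		"urn:isbn:0451450523",
		"resource",
		"#fragment",
	} {
		fmt.Printf("%-30s absolute=%v\n", s, hasScheme(s))
	}
}
```

Relative references such as "resource" or "#fragment" are exactly the ones that Turtle, TriG, and RDF/XML resolve against a base IRI, which is why those formats cannot apply the same check before resolution.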
Error Handling
The library follows Go's standard error handling patterns. Always check for io.EOF to detect end of input:
import (
	"errors"
	"fmt"
	"io"

	"github.com/geoknoesis/rdf-go"
)
dec, err := rdf.NewReader(reader, rdf.FormatTurtle)
if err != nil {
// Handle initialization errors (unsupported format, etc.)
return err
}
defer dec.Close()
for {
stmt, err := dec.Next()
if err == io.EOF {
// End of input reached - this is normal, not an error
break
}
if err != nil {
// Check if it's a parse error with position information
var parseErr *rdf.ParseError
if errors.As(err, &parseErr) {
// ParseError includes detailed position information
fmt.Printf("Parse error at line %d, column %d: %v\n",
parseErr.Line, parseErr.Column, parseErr.Err)
// The error message also includes input excerpts with caret indicators
} else {
// Other errors (I/O errors, context cancellation, etc.)
fmt.Printf("Error: %v\n", err)
}
return err
}
// Successfully read a statement - process it
// Use stmt.IsTriple() or stmt.IsQuad() to check the statement type
processStatement(stmt)
}
Error messages automatically include:
- Position information (line:column or offset)
- Input excerpts showing context around the error
- Caret indicators pointing to the error position
Format Selection
The library uses a unified Format type for all RDF serialization formats. You can either use format constants or parse format strings:
// Option 1: Parse format from a string (useful for user input or file extensions)
format, ok := rdf.ParseFormat("ttl") // Returns FormatTurtle
if !ok {
// Format string not recognized
return fmt.Errorf("unknown format: %s", "ttl")
}
// Option 2: Use format constants directly (recommended for known formats)
dec, err := rdf.NewReader(reader, rdf.FormatTurtle)
// Option 3: Use auto-detection instead (library detects format from input)
dec, err = rdf.NewReader(reader, rdf.FormatAuto)
Supported Formats
Triple formats:
- rdf.FormatTurtle - Turtle (.ttl)
- rdf.FormatNTriples - N-Triples (.nt)
- rdf.FormatRDFXML - RDF/XML (.rdf, .xml)
- rdf.FormatJSONLD - JSON-LD (.jsonld)
Quad formats:
- rdf.FormatTriG - TriG (.trig)
- rdf.FormatNQuads - N-Quads (.nq)
Auto-detection:
- rdf.FormatAuto - Automatically detect format from input
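When the input is a file on disk, the extensions listed above can be used to pick a format before opening a reader. The sketch below uses only the standard library; formatNameForPath and the labels it returns are illustrative assumptions, not rdf-go identifiers, and which strings ParseFormat accepts beyond "ttl" is not stated here.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// formatNameForPath maps a file name to a format label using the extensions
// listed above. The labels are illustrative strings, not rdf-go constants.
func formatNameForPath(path string) (string, bool) {
	switch strings.ToLower(filepath.Ext(path)) {
	case ".ttl":
		return "turtle", true
	case ".nt":
		return "ntriples", true
	case ".rdf", ".xml":
		return "rdfxml", true
	case ".jsonld":
		return "jsonld", true
	case ".trig":
		return "trig", true
	case ".nq":
		return "nquads", true
	}
	return "", false
}

func main() {
	for _, p := range []string{"data/people.ttl", "dump.NQ", "notes.txt"} {
		name, ok := formatNameForPath(p)
		fmt.Println(p, name, ok)
	}
}
```

An unrecognized extension is a natural point to fall back to content-based detection with FormatAuto.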
Options
Configure reader/writer behavior using functional options. Options are applied in order and can be combined:
import (
"context"
"time"
"github.com/geoknoesis/rdf-go"
)
// Create a context with timeout for cancellation
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()
// Configure reader with multiple options
dec, err := rdf.NewReader(reader, rdf.FormatTurtle,
// Security: Apply safe limits for untrusted input
// This sets reasonable defaults to prevent resource exhaustion attacks
rdf.OptSafeLimits(),
// Limit nesting depth (for collections, blank node lists, etc.)
rdf.OptMaxDepth(50),
// Set context for cancellation and timeouts
rdf.OptContext(ctx),
// Limit maximum line size (64KB)
rdf.OptMaxLineBytes(64<<10),
// Limit maximum statement size (256KB)
rdf.OptMaxStatementBytes(256<<10),
// Limit total number of statements to process
rdf.OptMaxTriples(1_000_000),
)
if err != nil {
return err
}
Available options:
- OptContext(ctx) - Set context for cancellation and timeouts
- OptMaxLineBytes(n) - Set maximum line size limit
- OptMaxStatementBytes(n) - Set maximum statement size limit
- OptMaxDepth(n) - Set maximum nesting depth limit
- OptMaxTriples(n) - Set maximum number of triples/quads to process
- OptSafeLimits() - Apply safe limits suitable for untrusted input
- OptStrictIRIValidation() - Enable strict IRI validation according to RFC 3987
- OptExpandRDFXMLContainers() - Enable RDF/XML container membership expansion (default: enabled)
- OptDisableRDFXMLContainerExpansion() - Disable RDF/XML container membership expansion
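Because options are applied in order, a later option can override a value set by an earlier one. The following is a minimal sketch of the functional-options pattern under that assumption; the limits struct, the option names, and the values used by safeLimits are hypothetical, while the two defaults mirror the documented 100-level depth and 1MB line limits.

```go
package main

import "fmt"

// limits is a hypothetical configuration struct; the field names and the
// numbers used by safeLimits are illustrative, not rdf-go's actual values.
type limits struct {
	maxDepth     int
	maxLineBytes int
}

// option mutates a limits value; options are applied in argument order.
type option func(*limits)

func maxDepth(n int) option     { return func(l *limits) { l.maxDepth = n } }
func maxLineBytes(n int) option { return func(l *limits) { l.maxLineBytes = n } }

// safeLimits sets several fields at once, like a preset.
func safeLimits() option {
	return func(l *limits) {
		l.maxDepth = 20
		l.maxLineBytes = 16 << 10
	}
}

// newLimits starts from defaults and applies each option in order, so a
// later option overrides whatever an earlier one set for the same field.
func newLimits(opts ...option) limits {
	l := limits{maxDepth: 100, maxLineBytes: 1 << 20}
	for _, opt := range opts {
		opt(&l)
	}
	return l
}

func main() {
	l := newLimits(safeLimits(), maxDepth(50))
	fmt.Printf("%+v\n", l)
}
```

Under this pattern, placing a preset first and individual overrides after it gives the overrides the final say; reversing the order would let the preset win.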
Versioning & Compatibility
This library follows Semantic Versioning:
- v1.x.x: Backward compatible changes only (new features, bug fixes)
- v2.x.x: Breaking changes (if needed in the future)
API Stability
The following APIs are considered stable and will maintain backward compatibility:
- Reader and Writer interfaces
- Statement, Triple, Quad types
- Term interface and implementations (IRI, BlankNode, Literal, TripleTerm)
- NewReader(), NewWriter(), Parse() functions
- Format constants (FormatTurtle, FormatNTriples, etc.)
- Option functions (OptMaxDepth, OptSafeLimits, etc.)
Deprecation Policy
- Deprecated APIs will be marked with // Deprecated: comments
- Deprecated APIs will be removed in the next major version
- At least one minor version will include deprecation warnings before removal
Go Version Support
- Minimum Go version: 1.25.5
- The library uses pure Go (no CGO dependencies)
- Compatible with Go 1.25.5 and all later releases
Notes
- The API is intentionally small and favors streaming. For large inputs, use NewReader or Parse for efficient processing.
- All formats work with the unified Reader and Writer interfaces.
- The Statement type represents either a triple (G is nil) or a quad (G is non-nil).
- Use stmt.IsTriple() or stmt.IsQuad() to check the statement type.
- For any unsupported format, NewReader/NewWriter returns rdf.ErrUnsupportedFormat.
- RDF/XML container elements (rdf:Bag, rdf:Seq, rdf:Alt, rdf:List) support container membership expansion. By default, rdf:li elements are automatically converted to rdf:_1, rdf:_2, etc. Use OptDisableRDFXMLContainerExpansion() to disable this behavior.
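The rdf:li to rdf:_n conversion in the last note amounts to numbering the rdf:li properties of a container in document order. Here is a minimal sketch of that numbering, assuming the predicates of a single container are given as a slice of full IRIs; expandMembership is a hypothetical helper and does not reflect how rdf-go implements the expansion.

```go
package main

import "fmt"

const rdfNS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

// expandMembership rewrites each rdf:li predicate to rdf:_1, rdf:_2, ... in
// document order. The slice holds the predicates of ONE container element;
// a real parser would keep a separate counter per container.
func expandMembership(predicates []string) []string {
	out := make([]string, len(predicates))
	next := 1
	for i, p := range predicates {
		if p == rdfNS+"li" {
			out[i] = fmt.Sprintf("%s_%d", rdfNS, next)
			next++
		} else {
			out[i] = p
		}
	}
	return out
}

func main() {
	preds := []string{rdfNS + "type", rdfNS + "li", rdfNS + "li"}
	for _, p := range expandMembership(preds) {
		fmt.Println(p)
	}
}
```

With expansion disabled, the rdf:li predicates would presumably be passed through unchanged, as in the non-matching branch above.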
Security and Limits
For Untrusted Input
Always set explicit security limits when processing untrusted input to prevent resource exhaustion attacks.
// Use OptSafeLimits for untrusted input
dec, err := rdf.NewReader(r, rdf.FormatTurtle, rdf.OptSafeLimits())
Or set custom limits:
dec, err := rdf.NewReader(r, rdf.FormatTurtle,
rdf.OptMaxLineBytes(64<<10), // 64KB per line
rdf.OptMaxStatementBytes(256<<10), // 256KB per statement
rdf.OptMaxDepth(50), // 50 levels of nesting
rdf.OptMaxTriples(1_000_000), // 1M triples max
rdf.OptContext(ctx), // For cancellation/timeouts
)
Security Limits
The following limits are available via options:
- MaxLineBytes: Maximum size of a single line (default: 1MB)
- MaxStatementBytes: Maximum size of a complete statement (default: 4MB)
- MaxDepth: Maximum nesting depth for collections, blank node lists, etc. (default: 100)
- MaxTriples: Maximum number of triples/quads to process (default: 10M)
- Context: Context for cancellation and timeouts
Default limits are suitable for trusted input only. For untrusted input, use OptSafeLimits() or set stricter limits.
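To show why a per-line limit matters, the sketch below enforces one with the standard library's bufio.Scanner, which reports bufio.ErrTooLong instead of growing its buffer without bound. The helper names readLines and tooLong are hypothetical, and this is not a description of how rdf-go enforces MaxLineBytes.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"strings"
)

// readLines splits input into lines and fails once a line does not fit in a
// buffer of maxLineBytes bytes. The function name is hypothetical.
func readLines(input string, maxLineBytes int) ([]string, error) {
	sc := bufio.NewScanner(strings.NewReader(input))
	// Cap the scanner's buffer so an oversized line is rejected instead of
	// growing memory without bound.
	sc.Buffer(make([]byte, 0, maxLineBytes), maxLineBytes)
	var lines []string
	for sc.Scan() {
		lines = append(lines, sc.Text())
	}
	if err := sc.Err(); err != nil {
		return lines, fmt.Errorf("line limit of %d bytes: %w", maxLineBytes, err)
	}
	return lines, nil
}

// tooLong reports whether err was caused by an oversized line.
func tooLong(err error) bool {
	return errors.Is(err, bufio.ErrTooLong)
}

func main() {
	ok := "<s> <p> <o> .\n<s> <p> <o2> .\n"
	lines, err := readLines(ok, 64)
	fmt.Println(len(lines), err)

	huge := "<s> <p> \"" + strings.Repeat("x", 200) + "\" .\n"
	_, err = readLines(huge, 64)
	fmt.Println(tooLong(err))
}
```

The same idea extends to statement size, nesting depth, and statement count: each limit turns an unbounded resource into an explicit, reportable error.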
Error Diagnostics
Errors include line and column information, along with input excerpts for better debugging:
stmt, err := dec.Next()
if err != nil {
var parseErr *rdf.ParseError
if errors.As(err, &parseErr) {
fmt.Printf("Error at line %d, column %d: %v\n",
parseErr.Line, parseErr.Column, parseErr.Err)
// Error messages automatically include input excerpts with caret indicators
// Example output:
// turtle:3:15: unexpected token
// ex:s ex:p ex:o .
// ^
}
}
Error Codes
For programmatic error handling, use the Code() function to get error codes:
import "github.com/geoknoesis/rdf-go"
stmt, err := dec.Next()
if err != nil {
code := rdf.Code(err)
switch code {
case rdf.ErrCodeUnsupportedFormat:
// Handle unsupported format
case rdf.ErrCodeLineTooLong:
// Handle line too long
case rdf.ErrCodeStatementTooLong:
// Handle statement too long
case rdf.ErrCodeDepthExceeded:
// Handle depth exceeded
case rdf.ErrCodeTripleLimitExceeded:
// Handle triple limit exceeded
case rdf.ErrCodeContextCanceled:
// Handle context cancellation
case rdf.ErrCodeParseError:
// Handle general parse error
default:
// Handle unknown error
}
}
Available Error Codes:
- ErrCodeUnsupportedFormat - Unsupported RDF format
- ErrCodeLineTooLong - Line exceeded configured limit
- ErrCodeStatementTooLong - Statement exceeded configured limit
- ErrCodeDepthExceeded - Nesting depth exceeded configured limit
- ErrCodeTripleLimitExceeded - Maximum number of triples/quads exceeded
- ErrCodeParseError - General parse error
- ErrCodeContextCanceled - Context was canceled
- ErrCodeInvalidIRI - Invalid IRI encountered
- ErrCodeInvalidLiteral - Invalid literal encountered
Note: Code() returns an empty string for nil errors and io.EOF (which is not an error condition).
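A code-extraction function with the behavior described in the note can be built on errors.As. The sketch below is one possible shape, assuming a hypothetical codedError type; it is not rdf-go's implementation, and the code string used in main is made up for the example.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// codedError is a hypothetical error type that carries a machine-readable
// code next to the human-readable message.
type codedError struct {
	Code string
	Msg  string
}

func (e *codedError) Error() string { return e.Code + ": " + e.Msg }

// codeOf returns the code of the first codedError in err's chain. It returns
// "" for nil, for io.EOF, and for errors that carry no code.
func codeOf(err error) string {
	if err == nil || errors.Is(err, io.EOF) {
		return ""
	}
	var ce *codedError
	if errors.As(err, &ce) {
		return ce.Code
	}
	return ""
}

// annotate wraps err with extra context while keeping it discoverable
// through errors.As and errors.Is.
func annotate(msg string, err error) error {
	return fmt.Errorf("%s: %w", msg, err)
}

func main() {
	base := &codedError{Code: "LINE_TOO_LONG", Msg: "line exceeds limit"}
	fmt.Println(codeOf(annotate("reading input.ttl", base)))
	fmt.Println(codeOf(io.EOF) == "")
}
```

Because the lookup walks the wrap chain, callers can add context with %w at any layer without hiding the code from a switch statement like the one above.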
JSON-LD Options
JSON-LD decoding supports additional semantic limits via JSONLDOptions:
jsonldOpts := rdf.JSONLDOptions{
Context: ctx,
MaxInputBytes: 1 << 20,
MaxNodes: 10000,
MaxGraphItems: 10000,
MaxQuads: 20000,
}
// JSON-LD options are used internally when FormatJSONLD is specified
dec, err := rdf.NewReader(r, rdf.FormatJSONLD)
Note: JSON-LD supports streaming parsing using token-by-token json.Decoder processing. The reader processes nodes incrementally and emits triples/quads as they are parsed. However, there are some limitations:
- When @graph appears before @context in the JSON structure, the graph items must be buffered until the context is available (this is necessary for correct term expansion)
- Nested objects and arrays are fully decoded into memory (this is required for JSON-LD context processing and term expansion)
- Remote context resolution: The streaming reader supports remote context URLs when a DocumentLoader is provided in JSONLDOptions. If @context is a string URL, it will be loaded via the DocumentLoader before processing.
- For very large documents with @graph before @context, consider reordering the JSON structure to place @context first, or use other RDF formats (Turtle, N-Triples, TriG, N-Quads) which have more efficient streaming characteristics
Output Determinism
The library provides deterministic output for most formats, with some format-specific considerations:
Deterministic Formats
Turtle/TriG:
- Prefix declarations are sorted alphabetically (deterministic order)
- Statement order matches input order
- Prefix selection uses longest matching namespace (deterministic algorithm)
- Blank node labels are preserved from input
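The longest-matching-namespace rule can be sketched in a few lines. This is an illustration under assumptions, not the library's writer: compact is a hypothetical function, ties are avoided by visiting prefix names in sorted order, and the local part is not checked against Turtle's local-name grammar.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// compact abbreviates iri with the prefix whose namespace is the longest
// match. Prefix names are visited in sorted order so the result does not
// depend on Go's randomized map iteration. Sketch only: it does not check
// that the remaining local part is a valid Turtle local name.
func compact(iri string, prefixes map[string]string) string {
	names := make([]string, 0, len(prefixes))
	for name := range prefixes {
		names = append(names, name)
	}
	sort.Strings(names)

	bestName, bestNS := "", ""
	for _, name := range names {
		ns := prefixes[name]
		if strings.HasPrefix(iri, ns) && len(ns) > len(bestNS) {
			bestName, bestNS = name, ns
		}
	}
	if bestNS == "" {
		return "<" + iri + ">" // no namespace matched: keep the full IRI
	}
	return bestName + ":" + iri[len(bestNS):]
}

func main() {
	prefixes := map[string]string{
		"ex":  "http://example.org/",
		"exv": "http://example.org/vocab/",
	}
	fmt.Println(compact("http://example.org/vocab/name", prefixes))
	fmt.Println(compact("http://example.org/alice", prefixes))
	fmt.Println(compact("http://other.example/x", prefixes))
}
```

Sorting the prefix names before the scan is what makes the choice repeatable when two prefixes map to the same namespace.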
N-Triples/N-Quads:
- Output order matches input order exactly
- No prefix abbreviations (fully expanded IRIs)
- Fully deterministic output
RDF/XML:
- XML structure is deterministic
- Element order matches input order
Non-Deterministic Formats
JSON-LD:
- ⚠️ Key ordering is non-deterministic due to Go's map iteration order
- JSON object keys may appear in different orders across runs
- This follows from Go's randomized map iteration (note that json.Marshal itself writes map keys in sorted order, so output produced by marshaling a whole map is stable)
- For deterministic JSON-LD output, use the CanonicalizeJSONLD() function:
import (
	"bytes"

	"github.com/geoknoesis/rdf-go"
)
// Encode to JSON-LD
var buf bytes.Buffer
enc, _ := rdf.NewWriter(&buf, rdf.FormatJSONLD)
enc.Write(stmt)
enc.Close()
// Canonicalize for deterministic output
canonical, err := rdf.CanonicalizeJSONLD(buf.Bytes())
if err != nil {
return err
}
// canonical now contains deterministic JSON-LD
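For readers who want to see what key-order normalization involves, here is a standard-library sketch that decodes a JSON document and re-encodes it with sorted object keys. It is not a description of what CanonicalizeJSONLD() does, and it is not RDF dataset canonicalization; normalizeJSON is a hypothetical helper that only stabilizes key order.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// normalizeJSON re-encodes a JSON document so that object keys appear in
// sorted order at every nesting level. Array order is left untouched.
func normalizeJSON(input string) (string, error) {
	dec := json.NewDecoder(strings.NewReader(input))
	dec.UseNumber() // keep numeric literals as written instead of float64
	var v any
	if err := dec.Decode(&v); err != nil {
		return "", err
	}
	// json.Marshal writes map keys in sorted order.
	out, err := json.Marshal(v)
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	a, _ := normalizeJSON(`{"@type": "ex:Person", "@id": "ex:alice", "ex:age": 30}`)
	b, _ := normalizeJSON(`{"ex:age": 30, "@id": "ex:alice", "@type": "ex:Person"}`)
	fmt.Println(a)
	fmt.Println(a == b)
}
```

Two caveats apply to this approach: json.Marshal escapes the characters <, >, and & by default, and array order (which can also vary between runs) is not normalized.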
Best Practices
- For reproducible builds, use Turtle, TriG, N-Triples, or N-Quads
- If JSON-LD determinism is required, post-process the output with CanonicalizeJSONLD() or another JSON canonicalizer
- Round-trip tests verify semantic equivalence (isomorphic graphs) rather than byte-for-byte equality
Supported Features Matrix
| Feature | Turtle | TriG | N-Triples | N-Quads | RDF/XML | JSON-LD |
|---|---|---|---|---|---|---|
| Parsing | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Encoding | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Named Graphs | ❌ | ✅ | ❌ | ✅ | ❌ | ✅ |
| Prefixes | ✅ | ✅ | ❌ | ❌ | ✅ | ✅ |
| Base IRI | ✅ | ✅ | ❌ | ❌ | ✅ | ✅ |
| Blank Nodes | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Collections | ✅ | ✅ | ❌ | ❌ | ✅ | ✅ |
| RDF-star | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ |
| Streaming | ✅ | ✅ | ✅ | ✅ | ✅ | ⚠️* |
| Deterministic Output | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ |
* JSON-LD streaming has limitations: buffers @graph when it appears before @context
Format-Specific Notes
- RDF/XML: Container membership expansion is implemented and enabled by default. Use OptDisableRDFXMLContainerExpansion() to disable automatic expansion of rdf:li to rdf:_n.
- JSON-LD: Compaction and framing are not supported (I/O library only)
- JSON-LD: Remote context resolution supported when DocumentLoader is provided
Performance
The library is optimized for performance with:
- Streaming architecture to minimize memory usage
- Low allocation patterns using strings.Builder and buffer reuse
- Efficient string operations and parsing
- Comprehensive benchmarks available in rdf/benchmarks_test.go
Running Benchmarks
Run benchmarks:
go test ./rdf -bench=. -benchmem -run=^$
Benchmark Results
Benchmark results vary by system and input size. Key benchmarks include:
- BenchmarkTurtleDecodeLarge - Large Turtle file decoding
- BenchmarkNTriplesDecodeLarge - Large N-Triples file decoding
- BenchmarkTriGDecode - TriG format decoding
- BenchmarkJSONLDDecode - JSON-LD format decoding
- BenchmarkTurtleEncode - Turtle encoding
- BenchmarkNTriplesEncodeLarge - N-Triples encoding
- BenchmarkUnescapeString - String unescaping performance
- BenchmarkResolveIRI - IRI resolution performance
- BenchmarkFormatDetection - Format detection performance
Performance Characteristics:
- Streaming: All formats support streaming with constant memory usage (except JSON-LD edge cases)
- Throughput: Typically 10K-100K+ triples/second depending on format and input complexity
- Memory: O(1) memory usage for streaming parsers (bounded by security limits)
- Allocations: Optimized to minimize allocations using buffer reuse and strings.Builder
For detailed performance analysis, run benchmarks on your target system and input data.
💝 Support & Sponsorship
rdf-go is open-source and free to use, but maintaining and evolving it requires ongoing effort.
If rdf-go is valuable to you or your organization, your financial support helps ensure:
- ✅ Continued maintenance - Bug fixes, security updates, and compatibility with new Go versions and RDF standards
- ✅ Feature development - New capabilities and improvements based on community needs
- ✅ Custom adaptations - Priority consideration for features that align with your specific requirements
- ✅ Long-term sustainability - Keeping the project active and well-maintained for the community
Ways to support:
- 💰 GitHub Sponsors - Monthly or one-time sponsorship
- ☕ Ko-fi - One-time donations
- 🏢 Enterprise Support - For organizations needing priority support, custom features, or commercial licensing: stephanef@geoknoesis.com
- 🌟 Star the repository - Help others discover rdf-go on GitHub