gcf-rust

June 23, 2026

Rust implementation of GCF -- the most token-efficient wire format for LLMs. A drop-in alternative to JSON and TOON for any structured data.

100% comprehension on every frontier model tested. 29% fewer tokens than TOON, 56% fewer than JSON across 16 datasets. 91.2% on structurally complex code graphs (vs TOON 68.2%, JSON 53.4%). 2,400+ LLM evaluations. Zero training.

Docs: gcformat.com | Playground | GCF vs TOON

Install

[dependencies]
gcf = "0.1"

Zero-copy where possible. Minimal dependencies (serde, serde_json). Don't want to change code? Use the MCP proxy for zero-code adoption.

Quick Start

use gcf::encode_generic;
use serde_json::json;

let data = json!({
    "employees": [
        {"id": 1, "name": "Alice", "department": "Engineering", "salary": 95000},
        {"id": 2, "name": "Bob", "department": "Sales", "salary": 72000},
    ],
});
let output = encode_generic(&data);

Output:

## employees [2]{department,id,name,salary}
Engineering|1|Alice|95000
Sales|2|Bob|72000

Works on any serde_json::Value. One header declares field names, rows are positional values.
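To make the header-plus-positional-rows layout concrete, here is a minimal std-only sketch that reproduces the output above for records whose values are already strings. `encode_table` is a hypothetical helper written for this illustration, not part of the gcf crate, and it assumes values never contain `|` or newlines (the real encoder's escaping rules are not covered here).

```rust
use std::collections::BTreeMap;

/// Hypothetical helper (not the gcf crate's API): renders uniform records
/// as one header line plus one positional row per record.
fn encode_table(name: &str, rows: &[BTreeMap<&str, String>]) -> String {
    // Field names come from the first record; BTreeMap iterates in sorted key order,
    // which matches the alphabetical field order in the sample output.
    let fields: Vec<&str> = rows
        .first()
        .map(|r| r.keys().copied().collect())
        .unwrap_or_default();

    let mut out = format!("## {} [{}]{{{}}}", name, rows.len(), fields.join(","));
    for row in rows {
        // Emit values in header order; a missing field becomes an empty cell.
        let cells: Vec<&str> = fields
            .iter()
            .map(|f| row.get(f).map(String::as_str).unwrap_or(""))
            .collect();
        out.push('\n');
        out.push_str(&cells.join("|"));
    }
    out
}
```

Field names are written once in the header, so each additional row costs only its values and delimiters.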

Graph Profile

For code graph data with symbols, edges, and distance groups:

use gcf::{Payload, Symbol, Edge, encode};

let p = Payload {
    tool: "context_for_task".into(), token_budget: 5000, tokens_used: 1847,
    symbols: vec![
        Symbol { qualified_name: "pkg.Auth".into(), kind: "function".into(), score: 0.78, provenance: "lsp".into(), distance: 0, ..Default::default() },
        Symbol { qualified_name: "pkg.Server".into(), kind: "function".into(), score: 0.54, provenance: "lsp".into(), distance: 1, ..Default::default() },
    ],
    edges: vec![Edge { source: "pkg.Server".into(), target: "pkg.Auth".into(), edge_type: "calls".into(), ..Default::default() }],
    ..Default::default()
};
let output = encode(&p);

Output:

GCF tool=context_for_task budget=5000 tokens=1847 symbols=2 edges=1
## targets
@0 fn pkg.Auth 0.78 lsp
## related
@1 fn pkg.Server 0.54 lsp
## edges [1]
@0<@1 calls
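Judging from the sample, the first line of a graph payload is a `GCF` tag followed by space-separated key=value fields. The sketch below splits such a line using only the standard library. `parse_header` is a hypothetical name for illustration; the crate's own decode() is the supported way to read GCF, and the actual grammar is defined by the specification rather than by this example.

```rust
use std::collections::HashMap;

/// Hypothetical helper (not the gcf crate's API): splits a header such as
/// `GCF tool=context_for_task budget=5000 tokens=1847 symbols=2 edges=1`
/// into its key=value fields. Returns None if the line lacks the `GCF` tag.
fn parse_header(line: &str) -> Option<HashMap<&str, &str>> {
    let mut parts = line.split_whitespace();
    if parts.next()? != "GCF" {
        return None;
    }
    // Each remaining token is expected to look like key=value; others are skipped.
    Some(parts.filter_map(|tok| tok.split_once('=')).collect())
}
```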

Decode

use gcf::decode;

// `input` is a &str containing GCF text, such as the Graph Profile output above.
let p = decode(input).expect("valid GCF");
println!("{} {} symbols {} edges", p.tool, p.symbols.len(), p.edges.len());

Session Deduplication

Track transmitted symbols across multiple tool responses. Previously-sent symbols become bare references instead of full declarations:

use gcf::{Session, encode_with_session};

let sess = Session::new();

let out1 = encode_with_session(&payload1, &sess); // full declarations
let out2 = encode_with_session(&payload2, &sess); // reused symbols as "@N  # previously transmitted"

By the 5th call in a session: 92.7% token savings vs JSON.
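The idea behind session deduplication can be sketched without the crate: remember which symbols have been sent and replace repeats with a bare reference. `SeenSymbols` below is a hypothetical stand-in for illustration, not the crate's `Session` type, and it assumes ids are assigned in first-seen order, which the source does not state.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Hypothetical tracker (not the gcf crate's `Session`).
#[derive(Default)]
struct SeenSymbols {
    ids: Mutex<HashMap<String, usize>>,
}

impl SeenSymbols {
    /// Returns the full declaration the first time a name is seen and a bare
    /// `@N` reference on every later call.
    fn render(&self, qualified_name: &str, declaration: &str) -> String {
        let mut ids = self.ids.lock().expect("session mutex poisoned");
        if let Some(id) = ids.get(qualified_name) {
            return format!("@{}  # previously transmitted", id);
        }
        // Assumption for this sketch: ids are handed out in first-seen order.
        let id = ids.len();
        ids.insert(qualified_name.to_string(), id);
        format!("@{} {}", id, declaration)
    }
}
```

Because the map sits behind a Mutex, `render` takes `&self`, so one tracker can be shared across calls without a mutable binding.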

Streaming Encode

Write GCF output incrementally as symbols and edges arrive. Zero buffering, O(1) memory per row:

use gcf::{StreamEncoder, StreamOptions, Symbol, Edge};

let enc = StreamEncoder::new(writer, "context_for_task", StreamOptions {
    token_budget: 5000,
    ..Default::default()
});

enc.write_symbol(&Symbol { qualified_name: "pkg.Auth".into(), kind: "function".into(), score: 0.95, provenance: "lsp".into(), distance: 0, ..Default::default() });
enc.write_edge(&Edge { source: "pkg.Server".into(), target: "pkg.Auth".into(), edge_type: "calls".into(), ..Default::default() });
enc.close();

Output uses [?] deferred counts and a ## _summary trailer. The standard decode() handles streaming output with no changes. Thread-safe via Mutex.
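A deferred count works because the writer cannot know how many rows will arrive when it emits the section header. The std-only sketch below shows the pattern: write `[?]` up front, pass each row straight through, and report the total in a trailer. `RowStream` is hypothetical and is not the crate's `StreamEncoder`; in particular the `rows=N` trailer layout is invented for this illustration.

```rust
use std::io::{self, Write};

/// Hypothetical row streamer (not the gcf crate's `StreamEncoder`).
struct RowStream<W: Write> {
    out: W,
    rows: usize,
}

impl<W: Write> RowStream<W> {
    /// Writes the section header immediately, with `[?]` in place of the count.
    fn start(mut out: W, section: &str) -> io::Result<Self> {
        writeln!(out, "## {} [?]", section)?;
        Ok(Self { out, rows: 0 })
    }

    /// Writes one row straight through; nothing is retained but the counter.
    fn write_row(&mut self, row: &str) -> io::Result<()> {
        self.rows += 1;
        writeln!(self.out, "{}", row)
    }

    /// Emits the trailer with the now-known count and hands back the writer.
    /// The `rows=N` layout is an assumption made for this sketch.
    fn close(mut self) -> io::Result<W> {
        writeln!(self.out, "## _summary")?;
        writeln!(self.out, "rows={}", self.rows)?;
        Ok(self.out)
    }
}
```

Memory use stays constant per row because the only state kept between writes is the counter.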

Delta Encoding

When the consumer already has a prior context pack, send only what changed:

use gcf::{DeltaPayload, Symbol, encode_delta};

let delta = DeltaPayload {
    tool: "context_for_task".to_string(),
    base_root: "aaa111".to_string(),
    new_root: "bbb222".to_string(),
    removed: vec![Symbol {
        qualified_name: "pkg.OldFunc".to_string(),
        kind: "function".to_string(),
        score: 0.0,
        provenance: String::new(),
        distance: 0,
        signature: String::new(),
        components: Default::default(),
    }],
    added: vec![Symbol {
        qualified_name: "pkg.NewFunc".to_string(),
        kind: "function".to_string(),
        score: 0.85,
        provenance: "rwr".to_string(),
        distance: 0,
        signature: String::new(),
        components: Default::default(),
    }],
    removed_edges: vec![],
    added_edges: vec![],
    delta_tokens: 30,
    full_tokens: 200,
};

let output = encode_delta(&delta);

81.2% savings on re-queries where the pack changed slightly.
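One way to work out what belongs in `added` and `removed` is a set difference over qualified names. The sketch below does that with the standard library only. `diff_names` is a hypothetical helper, not part of the gcf crate, and it compares names alone, so a symbol whose score or signature changed would not show up.

```rust
use std::collections::BTreeSet;

/// Hypothetical diff helper (not the gcf crate's API): returns
/// (added, removed) symbol names between a previously sent pack and the
/// current one, each in sorted order.
fn diff_names<'a>(base: &[&'a str], current: &[&'a str]) -> (Vec<&'a str>, Vec<&'a str>) {
    let old: BTreeSet<&str> = base.iter().copied().collect();
    let new: BTreeSet<&str> = current.iter().copied().collect();
    // Present now but not before: added. Present before but not now: removed.
    let added = new.difference(&old).copied().collect();
    let removed = old.difference(&new).copied().collect();
    (added, removed)
}
```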

Generic Encoding

Encode any serde_json::Value (not just graph payloads) into GCF tabular format:

use gcf::encode_generic;
use serde_json::json;

let data = json!({
    "employees": [
        {"id": 1, "name": "Alice", "department": "Engineering", "salary": 95000},
        {"id": 2, "name": "Bob", "department": "Sales", "salary": 72000},
    ],
});
let output = encode_generic(&data);

Output:

## employees [2]{department,id,name,salary}
Engineering|1|Alice|95000
Sales|2|Bob|72000

Works on objects, arrays, and primitives. Arrays of uniform objects get tabular rows. Nested objects use ## key section headers.
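Reading the tabular form back is the mirror image: take the field list from the header, then zip each row's cells with it. The sketch below is std-only and hypothetical (`parse_table` is not a gcf crate function). It assumes a single section, no escaping, and it does not check the `[N]` count against the number of rows.

```rust
use std::collections::BTreeMap;

/// Hypothetical reader (not the gcf crate's API): parses a
/// `## name [N]{a,b,c}` header followed by pipe-delimited rows
/// into one map per row.
fn parse_table(text: &str) -> Option<Vec<BTreeMap<String, String>>> {
    let mut lines = text.lines();
    let header = lines.next()?;
    // The field list sits between the braces at the end of the header.
    let open = header.find('{')?;
    let close = header.rfind('}')?;
    let fields: Vec<&str> = header.get(open + 1..close)?.split(',').collect();

    let mut rows: Vec<BTreeMap<String, String>> = Vec::new();
    for line in lines.filter(|l| !l.is_empty()) {
        let cells: Vec<&str> = line.split('|').collect();
        if cells.len() != fields.len() {
            return None; // row width must match the header
        }
        rows.push(
            fields
                .iter()
                .zip(cells)
                .map(|(f, c)| (f.to_string(), c.to_string()))
                .collect(),
        );
    }
    Some(rows)
}
```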

API

| Function | Description |
|---|---|
| encode(p: &Payload) -> String | Encode a graph payload to GCF text |
| encode_generic(data: &Value) -> String | Encode any JSON value to GCF tabular format |
| decode(input: &str) -> Result<Payload, DecodeError> | Parse GCF text back to a Payload |
| encode_with_session(p: &Payload, s: &Session) -> String | Encode with session deduplication |
| encode_delta(d: &DeltaPayload) -> String | Encode a delta (added/removed only) |
| Session::new() -> Session | Create a new session tracker (thread-safe via Mutex) |

Types

| Type | Purpose |
|---|---|
| Payload | Full GCF payload: tool, budget, symbols, edges, pack root |
| Symbol | Graph node: qualified name, kind, score, provenance, distance |
| Edge | Directed relationship: source, target, edge type |
| DeltaPayload | Diff between two packs: added/removed symbols and edges |
| Components | Score breakdown: blast_radius, confidence, recency, distance |
| Session | Thread-safe tracker for multi-call deduplication |
| DecodeError | Enum of decode failure modes |

Benchmarks

2,400+ LLM evaluations across 10 models, 3 providers, and 51 independent test runs.

| Metric | GCF | TOON | JSON |
|---|---|---|---|
| Comprehension (23 runs, 10 models) | 91.2% | 68.2% | 53.4% |
| Generation (28 runs, 9 models) | 5/5 | 1.0/5 | 5.0/5 |
| Input tokens (500 symbols) | 11,090 | 16,378 | 53,341 |
| Output tokens (100 symbols) | 5,976 | 8,937 | 16,121 |

GCF wins 15/16 datasets on the expanded token efficiency benchmark. Full results: gcformat.com/guide/benchmarks
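The token counts above convert to percentage savings with one line of arithmetic. For the 500-symbol input row, GCF's 11,090 tokens work out to roughly 32.3% fewer than TOON's 16,378 and 79.2% fewer than JSON's 53,341. These figures come from that single row, so they differ from the 16-dataset averages quoted at the top. `savings_pct` is a hypothetical helper for this illustration only.

```rust
/// Percentage of tokens saved by `ours` relative to `baseline`.
/// Hypothetical helper for illustration; not part of the gcf crate.
fn savings_pct(ours: u32, baseline: u32) -> f64 {
    (1.0 - f64::from(ours) / f64::from(baseline)) * 100.0
}
```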

Implementations

| Language | Package | Repository |
|---|---|---|
| Go | go get github.com/blackwell-systems/gcf-go | gcf-go |
| TypeScript | npm install @blackwell-systems/gcf | gcf-typescript |
| Python | pip install gcf-python | gcf-python |
| Rust | cargo add gcf | gcf-rust |
| Swift | Swift Package Manager | gcf-swift |
| Kotlin | JitPack | gcf-kotlin |
| MCP Proxy | pip install gcf-proxy | gcf-proxy (bidirectional, session dedup, HTTP frontend) |
| Claude Code Plugin | /plugin install | gcf-claude-plugin (one-command install, session stats hook) |
| Codex Plugin | codex plugin add | gcf-codex-plugin (one-command install, session stats hook) |
| VS Code | ext install blackwell-systems.gcf-vscode | gcf-vscode (syntax highlighting) |
| n8n | npm install n8n-nodes-gcf | gcf-n8n-nodes (workflow encode/decode) |
| Tree-sitter | npm install tree-sitter-gcf | tree-sitter-gcf |

Minimal dependencies. Permanently. The Rust implementation depends only on serde and serde_json for JSON interop. The five other implementations (Go, TypeScript, Python, Swift, Kotlin) have zero runtime dependencies. No unnecessary transitive dependencies. No supply chain risk. This is a permanent commitment: GCF will never take on external runtime dependencies beyond what the language ecosystem requires for JSON handling.

MIT licensed. All implementations support both the generic profile (encode_generic) and the graph profile (encode). A CLI is included in all 6 languages.

Specification: SPEC v3.2 Stable with 174 conformance fixtures, 43,000,000,000+ lossless round-trips verified across 5 formats and 6 languages. All implementations at v2.2.1+ (Go v1.3.1). Cross-language 6x6 matrix verified.

License

MIT - Dayna Blackwell