crdt-merge Documentation
April 8, 2026
Production-grade CRDT merging for data, ML models, and distributed systems.
| Item | Value |
|---|---|
| Version | 0.9.5 |
| Architecture | 7-layer + Accelerators + CLI |
| Codebase | 44,304 LOC · 104 modules · 212 classes · 1,586 functions |
| Guide tests | 309 tests across 7 guide test suites — all passing |
| License | BUSL-1.1 → Apache 2.0 (2028-03-29) |
Guides
25 in-depth guides covering every major use case. Every code example is verified by an automated test suite.
Data & Records
| Guide | What it covers |
|---|---|
| CRDT Fundamentals | CRDT theory, OR-Set, LWW, G-Counter |
| CRDT Primitives Reference | Working examples for every primitive type |
| CRDT Verification Toolkit | verify_crdt, verify_commutative, property testing |
| Merge Strategies | LWW, MaxWins, MinWins, UnionSet, Priority, Custom, and more |
| Schema Evolution | Backwards-compatible schema changes |
| MergeQL — Distributed Knowledge | SQL-like merge interface |
| Probabilistic CRDT Analytics | HyperLogLog, MinHash, CMS |
| Performance Tuning | parallel_merge, chunking, DuckDB, profiling |
| Troubleshooting | Common errors and fixes |
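The verification guide listed above is about checking that a merge function is commutative, associative, and idempotent. Here is a minimal, library-free sketch of that idea using a hand-rolled G-Counter. The names `gcounter_merge`, `gcounter_value`, and `check_merge_laws` are hypothetical and only illustrate the concept; they are not the signatures of `verify_crdt` or `verify_commutative`.

```python
from itertools import product


def gcounter_merge(a: dict, b: dict) -> dict:
    """Join two G-Counter states by taking the per-replica maximum."""
    return {node: max(a.get(node, 0), b.get(node, 0)) for node in a.keys() | b.keys()}


def gcounter_value(state: dict) -> int:
    """The counter's value is the sum of every replica's local count."""
    return sum(state.values())


def check_merge_laws(merge, samples) -> dict:
    """Exhaustively check the three CRDT merge laws over a small sample set."""
    commutative = all(merge(x, y) == merge(y, x) for x, y in product(samples, repeat=2))
    associative = all(
        merge(merge(x, y), z) == merge(x, merge(y, z))
        for x, y, z in product(samples, repeat=3)
    )
    idempotent = all(merge(x, x) == x for x in samples)
    return {"commutative": commutative, "associative": associative, "idempotent": idempotent}


# Hypothetical replica states: replica name -> local increment count
samples = [{"a": 1}, {"a": 3, "b": 2}, {"b": 5, "c": 1}, {}]
laws = check_merge_laws(gcounter_merge, samples)
merged = gcounter_merge(samples[1], samples[2])
```

Exhaustive checking over a handful of states is only a smoke test. A property-testing approach would generate many random states instead of a fixed list.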
Transport & Sync
| Guide | What it covers |
|---|---|
| Wire Protocol | Binary format, serialize/deserialize, peek_type |
| Gossip & Serverless Sync | GossipState, peer-to-peer propagation |
| Delta Sync & Merkle Verification | Bandwidth-efficient sync, content integrity |
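Merkle verification, as a general technique, lets two peers compare one root hash and skip the transfer when the roots match. The sketch below uses only `hashlib`. The function `merkle_root`, the `0x00`/`0x01` domain-separation prefixes, and the duplicate-last-node rule for odd levels are assumptions made for illustration, not the format used by `crdt_merge.merkle`.

```python
import hashlib


def _sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(chunks: list) -> str:
    """Return the hex Merkle root of a list of byte chunks."""
    if not chunks:
        return _sha256(b"").hex()
    # Prefix leaves and interior nodes differently so a leaf can't pose as a node.
    level = [_sha256(b"\x00" + chunk) for chunk in chunks]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # odd count: pair the last node with itself
        level = [
            _sha256(b"\x01" + level[i] + level[i + 1])
            for i in range(0, len(level), 2)
        ]
    return level[0].hex()


local = [b"row-1", b"row-2", b"row-3"]
remote_same = list(local)
remote_changed = [b"row-1", b"row-2", b"row-3-edited"]

in_sync = merkle_root(local) == merkle_root(remote_same)
needs_sync = merkle_root(local) != merkle_root(remote_changed)
```

When the roots differ, peers can compare child hashes level by level to find the changed chunks, which is what keeps the sync bandwidth-efficient.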
AI & ML Models
| Guide | What it covers |
|---|---|
| Federated Model Merging | CRDTMergeState, 26 strategies, no parameter server |
| Model Merge Strategies | SLERP, TIES, DARE, DARE-TIES, Fisher, and more |
| Model CRDT Matrix | Strategy × CRDT-compliance comparison table |
| LoRA Adapter Merging | LoRAMerge, LoRAMergeSchema, per-layer strategies |
| Continual Learning Without Forgetting | ContinualMerge, replay, EWC integration |
Agentic & Context
| Guide | What it covers |
|---|---|
| Convergent Multi-Agent AI | AgentState, ContextMerge, ContextManifest |
| Agentic Memory at Scale | ContextBloom, MemorySidecar, budget-bounded merge |
Privacy, Provenance & Compliance
| Guide | What it covers |
|---|---|
| Provenance — Complete AI | AuditLog, AuditedMerge, tamper-evident chain |
| Right to Forget in AI | CRDT remove(), GDPR erasure, model unmerge |
| Privacy-Preserving Merge | EncryptedMerge, field-level encryption, RBAC |
| Security Hardening | Threat model, key rotation, audit integration |
| Security Guide | Encryption backends, StaticKeyProvider, RBAC policies |
| Compliance Guide | GDPR Art.5, HIPAA PHI, SOX, EU AI Act |
E4 Trust-Delta Architecture
The E4 subsystem adds recursive trust as a native CRDT dimension. Every merge carries proof, every delta carries trust.
- Overview -- value proposition, quick start, benchmark summary
- Architecture -- full technical specification (product lattice, trust algebra, wire format)
- Developer Guide -- building with E4 trust scoring
- Integration Guide -- wiring E4 into existing systems
- API Reference -- complete module and class reference
- Security Model -- threat model, Byzantine defence, resilience
- Peer Review -- expert evaluation (9.0/10)
- Computational Evidence -- H100 validation results
- Changelog -- E4-specific release history
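The Architecture entry above mentions a product lattice. In general, a product lattice joins each component independently, so the pair inherits commutativity, associativity, and idempotence from its parts. The sketch below pairs a G-Counter-style data component with a grow-only set of attestations. Both component choices and every name in it are assumptions for illustration; this is not E4's trust algebra or wire format.

```python
def join_counts(a: dict, b: dict) -> dict:
    """Data component: per-replica maximum (G-Counter join)."""
    return {k: max(a.get(k, 0), b.get(k, 0)) for k in a.keys() | b.keys()}


def join_attestations(a: frozenset, b: frozenset) -> frozenset:
    """Trust component (hypothetical): grow-only set of attestations, joined by union."""
    return a | b


def join(x: tuple, y: tuple) -> tuple:
    """Product-lattice join: join each component on its own."""
    return (join_counts(x[0], y[0]), join_attestations(x[1], y[1]))


# Hypothetical states: (counts per replica, attestations seen so far)
p = ({"n1": 2}, frozenset({"n1:sig"}))
q = ({"n1": 1, "n2": 4}, frozenset({"n2:sig"}))
r = ({"n3": 7}, frozenset({"n1:sig", "n3:sig"}))

pq = join(p, q)
```

Because each component is a join-semilattice, the data and its accompanying evidence converge together regardless of delivery order.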
E4 API Reference (detailed)
| Document | Scope |
|---|---|
| E4 Core Modules | TypedTrustScore, PCO, ProjectionDelta, TrustBoundMerkle, CausalTrustClock, and more |
| E4 Integration Bridges | GossipBridge, StreamBridge, AgentBridge, E4Config, bootstrap |
| E4 Resilience Subsystem | 18 hardening modules: Sybil defence, post-quantum sigs, epoch rotation, partition reconciliation |
Cookbooks
Runnable tutorials covering common E4 workflows end-to-end.
| Cookbook | What it covers |
|---|---|
| E4 Trust Quickstart | Trust scoring, PCO, projection deltas, adaptive verification, disabling E4 |
| Federated Trust with Byzantine Tolerance | Multi-peer lattice, gossip bridge, Byzantine detection, circuit breaker, 5-node demo |
| Trust-Weighted Agent Memory Synchronisation | Agent bridge, trust-weighted conflict resolution, multi-agent convergence, stream validation |
30-Second Demo
import pandas as pd
from crdt_merge import merge, MergeSchema, LWW, MaxWins
df_a = pd.DataFrame({"id": [1, 2], "name": ["Alice", "Charlie"], "score": [80, 70], "_ts": [1000.0, 1000.0]})
df_b = pd.DataFrame({"id": [1, 3], "name": ["Bob", "Diana" ], "score": [90, 85], "_ts": [2000.0, 1000.0]})
schema = MergeSchema(name=LWW(), score=MaxWins())
result = merge(df_a, df_b, key="id", schema=schema, timestamp_col="_ts")
#    id     name  score     _ts
# 0   1      Bob     90  2000.0   ← Bob (newer), 90 (higher)
# 1   2  Charlie     70  1000.0   ← Only in df_a
# 2   3    Diana     85  1000.0   ← Only in df_b
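To show what the per-column strategies in the demo do, here is a library-free sketch that applies the same rules to plain dictionaries. The helpers `lww`, `max_wins`, and `merge_rows` are hypothetical stand-ins, not crdt-merge internals. Breaking equal timestamps by comparing values is also an assumption, chosen so the result does not depend on argument order.

```python
def lww(a, b, ts_a, ts_b):
    """Last-writer-wins: keep the value with the newer timestamp (ties broken by value)."""
    return max((ts_a, a), (ts_b, b))[1]


def max_wins(a, b, ts_a, ts_b):
    """Keep the larger value; timestamps are ignored."""
    return max(a, b)


def merge_rows(rows_a, rows_b, key, ts_col, schema):
    """Merge two row lists by key, resolving each column with its strategy."""
    merged = {row[key]: dict(row) for row in rows_a}
    for row in rows_b:
        current = merged.get(row[key])
        if current is None:
            merged[row[key]] = dict(row)  # key only present on one side
            continue
        resolved = {key: row[key], ts_col: max(current[ts_col], row[ts_col])}
        for column, strategy in schema.items():
            resolved[column] = strategy(
                current[column], row[column], current[ts_col], row[ts_col]
            )
        merged[row[key]] = resolved
    return [merged[k] for k in sorted(merged)]


rows_a = [
    {"id": 1, "name": "Alice", "score": 80, "_ts": 1000.0},
    {"id": 2, "name": "Charlie", "score": 70, "_ts": 1000.0},
]
rows_b = [
    {"id": 1, "name": "Bob", "score": 90, "_ts": 2000.0},
    {"id": 3, "name": "Diana", "score": 85, "_ts": 1000.0},
]
schema = {"name": lww, "score": max_wins}

result = merge_rows(rows_a, rows_b, key="id", ts_col="_ts", schema=schema)
swapped = merge_rows(rows_b, rows_a, key="id", ts_col="_ts", schema=schema)
```

Swapping the two inputs yields the same rows, which is the convergence property the strategies are meant to guarantee.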
Model merging:
import numpy as np
from crdt_merge.model import CRDTMergeState
# math_tensors, code_tensors, reasoning_tensors: each team's fine-tuned weights, loaded elsewhere
# Three teams fine-tuning the same base: merge in any order, get identical result
team_a = CRDTMergeState("weight_average")
team_b = CRDTMergeState("weight_average")
team_c = CRDTMergeState("weight_average")
team_a.add(math_tensors, model_id="llama-math-v2", weight=0.4)
team_b.add(code_tensors, model_id="llama-code-v4", weight=0.35)
team_c.add(reasoning_tensors, model_id="llama-reason-v1", weight=0.25)
team_a.merge(team_b).merge(team_c) # in-place, returns self
team_c.merge(team_b).merge(team_a) # reverse order, also in-place
assert team_a.state_hash == team_c.state_hash # identical regardless of order
merged = team_a.resolve()
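One way to see why merge order need not matter: keep each contribution in a set keyed by `model_id`, and compute the average only at resolve time, in a fixed order. The class `ToyMergeState` below is a hypothetical sketch of that idea using plain lists. It is not how `CRDTMergeState` is implemented, and it assumes a given `model_id` always maps to the same weight and tensor.

```python
import hashlib
import json


class ToyMergeState:
    """Grow-only map of model_id -> (weight, tensor); merge is a dict union."""

    def __init__(self):
        self.contributions = {}

    def add(self, tensor, model_id, weight):
        self.contributions[model_id] = (weight, list(tensor))

    def merge(self, other):
        self.contributions.update(other.contributions)  # union, in place
        return self

    @property
    def state_hash(self):
        # Hash a canonical (sorted) form so insertion order cannot leak in.
        canonical = json.dumps(sorted(self.contributions.items()))
        return hashlib.sha256(canonical.encode()).hexdigest()

    def resolve(self):
        # Iterate in sorted model_id order so float rounding is reproducible.
        ordered = [self.contributions[k] for k in sorted(self.contributions)]
        total = sum(w for w, _ in ordered)
        size = len(ordered[0][1])
        return [sum(w * t[i] for w, t in ordered) / total for i in range(size)]


def make_states():
    a, b, c = ToyMergeState(), ToyMergeState(), ToyMergeState()
    a.add([1.0, 2.0], model_id="math", weight=0.4)
    b.add([3.0, 4.0], model_id="code", weight=0.35)
    c.add([5.0, 6.0], model_id="reason", weight=0.25)
    return a, b, c


a, b, c = make_states()
forward = a.merge(b).merge(c)
a2, b2, c2 = make_states()
backward = c2.merge(b2).merge(a2)
```

Deferring the arithmetic to `resolve()` is the key design choice here: the merge itself is a set union, which is commutative, associative, and idempotent even though floating-point averaging is not.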
Learning Path
New to crdt-merge?
- CRDT Fundamentals — understand OR-Sets and convergence (15 min)
- CRDT Primitives Reference — hands-on with every type (20 min)
- Merge Strategies — pick the right strategy for your data (10 min)
- Choose your domain:
  - Data/DataFrames → MergeQL, Performance Tuning
  - ML Models → Federated Model Merging, LoRA
  - Distributed agents → Convergent Multi-Agent AI
  - Compliance → Provenance, Compliance Guide
Repository Layout
docs/
├── guides/ ← 25 in-depth guides (all code verified by tests)
├── api-reference/ ← Complete API reference (layers 1–6, accelerators, CLI)
├── architecture/ ← System overview, layer map, data flow, design decisions
├── getting-started/ ← Installation, quickstart, core concepts
├── cookbook/ ← Practical recipes and patterns
├── cookbooks/ ← E4 trust-delta runnable tutorials
├── e4.md ← E4 core modules API reference
├── e4_integration.md ← E4 integration bridges API reference
├── e4_resilience.md ← E4 resilience subsystem API reference
├── CRDT_ARCHITECTURE.md ← Full mathematical proof of CRDT compliance
├── ARCHITECTURE_MAP.md ← Annotated codebase map
└── benchmarks/ ← A100 GPU performance, stress test reports
Architecture
crdt-merge uses a strict 7-layer architecture — each layer is independently testable and composable:
| Layer | Module | Responsibility |
|---|---|---|
| 1 | crdt_merge.core | OR-Set, G-Counter, LWW-Register, VectorClock |
| 2 | crdt_merge | DataFrame/JSON merge, strategies, MergeQL |
| 3 | crdt_merge.wire / .gossip / .merkle | Transport, serialisation, content integrity |
| 4 | crdt_merge.model | ML model merging, CRDTMergeState, 26 strategies |
| 5 | crdt_merge.encryption / .rbac / .metrics | Security, access control, observability |
| 6 | crdt_merge.compliance | GDPR, HIPAA, SOX, EU AI Act |
| 7 | crdt_merge.e4 | Trust-delta protocol, PCO, Byzantine resilience, adaptive verification |
| + | crdt_merge.context / .agentic | Agent memory, ContextBloom, ContextManifest |
| + | Accelerators | DuckDB, dbt, Polars, Airbyte, Spark |
Full proof: CRDT_ARCHITECTURE.md
By Role
| Role | Start here |
|---|---|
| Developer | Quickstart → Primitives Reference |
| ML Engineer | Federated Model Merging → LoRA |
| Data Engineer | Merge Strategies → MergeQL |
| Architect | ARCHITECTURE_MAP.md → api-reference/ |
| Compliance | Compliance Guide → Right to Forget |
| Security | Security Guide → Privacy-Preserving Merge |
crdt-merge v0.9.5 · April 2026