🔀 crdt-merge

March 27, 2026

Conflict-free merge, dedup & diff for any dataset, powered by CRDTs

Maven Central • Java 17+ • License • Tests: 79/79

Merge any two datasets in one function call. No conflicts. No coordination. No data loss.

Quick Start • API Reference • Why CRDTs • All Languages


🌍 Available in Every Language

| Language | Package | Install | Repo |
| --- | --- | --- | --- |
| Python 🐍 | crdt-merge | pip install crdt-merge | crdt-merge |
| TypeScript | crdt-merge | npm install crdt-merge | crdt-merge-ts |
| Rust 🦀 | crdt-merge | cargo add crdt-merge | crdt-merge-rs |
| Java ☕ | io.optitransfer:crdt-merge | Maven / Gradle | You are here |
| CLI 🖥️ | included in Rust | cargo install crdt-merge | crdt-merge-rs |

🤗 Try it in the browser →


🎯 The Problem

You have two versions of a dataset. Maybe two Spark jobs ran in parallel. Maybe two microservices updated the same records. Maybe you're merging data from multiple sources.

Today: Write custom merge scripts, lose data, or block on a coordinator.

With crdt-merge: One method call. Zero conflicts. Mathematically guaranteed.

List<Map<String, Object>> merged = CrdtMerge.merge(datasetA, datasetB, "id"); // done.

⚡ Quick Start

Maven

<dependency>
  <groupId>io.optitransfer</groupId>
  <artifactId>crdt-merge</artifactId>
  <version>0.1.0</version>
</dependency>

Gradle

implementation 'io.optitransfer:crdt-merge:0.1.0'

From Source

git clone https://github.com/mgillr/crdt-merge-java.git
cd crdt-merge-java
mvn package

📖 API Reference

Merge Two Datasets

import io.optitransfer.crdtmerge.CrdtMerge;
import java.util.List;
import java.util.Map;

List<Map<String, Object>> teamA = List.of(
    Map.of("id", 1, "name", "Alice", "role", "engineer"),
    Map.of("id", 2, "name", "Bob", "role", "designer")
);

List<Map<String, Object>> teamB = List.of(
    Map.of("id", 2, "name", "Robert", "role", "designer"),
    Map.of("id", 3, "name", "Charlie", "role", "pm")
);

List<Map<String, Object>> merged = CrdtMerge.merge(teamA, teamB, "id");
// id=1: Alice (only in A, preserved)
// id=2: Robert (B wins: latest)
// id=3: Charlie (only in B, preserved)
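
To make the keyed merge above concrete, here is a minimal, self-contained sketch of a merge by primary key. It is not the library's implementation: `KeyedMergeSketch` and `mergeByKey` are hypothetical names, and the rule "the second argument wins" is an assumption standing in for whatever recency information `CrdtMerge.merge` actually uses.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; not part of crdt-merge.
final class KeyedMergeSketch {

    // Merges two row lists by key. Assumption: rows in b are treated as newer.
    static List<Map<String, Object>> mergeByKey(
            List<Map<String, Object>> a, List<Map<String, Object>> b, String key) {
        Map<Object, Map<String, Object>> byKey = new LinkedHashMap<>();
        for (Map<String, Object> row : a) {
            byKey.put(row.get(key), new LinkedHashMap<>(row));
        }
        for (Map<String, Object> row : b) {
            // Rows only in b are added; rows in both get b's fields layered on top.
            byKey.computeIfAbsent(row.get(key), k -> new LinkedHashMap<>()).putAll(row);
        }
        return new ArrayList<>(byKey.values());
    }

    public static void main(String[] args) {
        List<Map<String, Object>> teamA = List.of(
            Map.of("id", 1, "name", "Alice"),
            Map.of("id", 2, "name", "Bob"));
        List<Map<String, Object>> teamB = List.of(
            Map.of("id", 2, "name", "Robert"),
            Map.of("id", 3, "name", "Charlie"));
        // Prints one line per merged row: 1: Alice, 2: Robert, 3: Charlie
        for (Map<String, Object> row : mergeByKey(teamA, teamB, "id")) {
            System.out.println(row.get("id") + ": " + row.get("name"));
        }
    }
}
```

Note that a purely positional rule like this one is not commutative: swapping the arguments would keep "Bob". Attaching a timestamp to each value, as a last-writer-wins register does, is what lets the result be independent of argument order.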

Deduplicate

import io.optitransfer.crdtmerge.CrdtMerge;
import io.optitransfer.crdtmerge.DedupEngine;
import java.util.List;
import java.util.Map;

List<Map<String, Object>> data = List.of(
    Map.of("name", "Alice"),
    Map.of("name", "Alicia"),
    Map.of("name", "Bob")
);

DedupEngine.DedupResult result = CrdtMerge.dedup(data, "name", 0.7);
System.out.println("Unique: " + result.unique.size());
System.out.println("Duplicates: " + result.duplicates.size());
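
The Features list describes fuzzy dedup as Jaccard similarity on character bigrams. The sketch below shows how such a score could be computed so the threshold argument is easier to reason about. It is an illustration only: `BigramJaccardSketch` is a hypothetical class, and the lower-casing and the handling of empty strings are assumptions that may differ from what `DedupEngine` does.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch; not the library's DedupEngine.
final class BigramJaccardSketch {

    // Collects the set of adjacent character pairs in a lower-cased string.
    static Set<String> bigrams(String s) {
        String t = s.toLowerCase(Locale.ROOT);
        Set<String> out = new HashSet<>();
        for (int i = 0; i + 1 < t.length(); i++) {
            out.add(t.substring(i, i + 2));
        }
        return out;
    }

    // Jaccard similarity: |A ∩ B| / |A ∪ B|. Two empty sets count as identical.
    static double similarity(String a, String b) {
        Set<String> x = bigrams(a);
        Set<String> y = bigrams(b);
        if (x.isEmpty() && y.isEmpty()) {
            return 1.0;
        }
        Set<String> intersection = new HashSet<>(x);
        intersection.retainAll(y);
        Set<String> union = new HashSet<>(x);
        union.addAll(y);
        return (double) intersection.size() / union.size();
    }

    public static void main(String[] args) {
        System.out.println(similarity("Alice", "Alicia")); // 0.5 (3 shared of 6 distinct bigrams)
        System.out.println(similarity("Alice", "alice"));  // 1.0
        System.out.println(similarity("Alice", "Bob"));    // 0.0
    }
}
```

Under these assumptions "Alice" and "Alicia" score 0.5, which is below the 0.7 threshold used in the example above, so this sketch would keep them as separate records. The library's tokenisation may score them differently.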

Structural Diff

import io.optitransfer.crdtmerge.CrdtMerge;
import io.optitransfer.crdtmerge.DiffEngine;

DiffEngine.DiffResult diff = CrdtMerge.diff(oldData, newData, "id");
System.out.println(diff.summary);
// "+5 added, -2 removed, ~3 modified, =990 unchanged"

Deep JSON Merge

import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import io.optitransfer.crdtmerge.CrdtMerge;

JsonObject configA = JsonParser.parseString(
    "{\"model\": {\"name\": \"bert\", \"layers\": 12}, \"tags\": [\"nlp\"]}"
).getAsJsonObject();

JsonObject configB = JsonParser.parseString(
    "{\"model\": {\"name\": \"bert-large\", \"dropout\": 0.1}, \"tags\": [\"qa\"]}"
).getAsJsonObject();

JsonObject merged = CrdtMerge.mergeJson(configA, configB);
// {"model": {"name": "bert-large", "layers": 12, "dropout": 0.1}, "tags": ["nlp", "qa"]}

Core CRDT Types

import io.optitransfer.crdtmerge.crdt.*;

// Distributed counter
GCounter counterA = new GCounter();
counterA.increment("server-1", 100);

GCounter counterB = new GCounter();
counterB.increment("server-2", 200);

GCounter merged = counterA.merge(counterB);
System.out.println(merged.value()); // 300

// Last-writer-wins register
LWWRegister<String> regA = new LWWRegister<>("Alice", 1000L);
LWWRegister<String> regB = new LWWRegister<>("Alicia", 2000L);
System.out.println(regA.merge(regB).value()); // "Alicia" (later wins)

// Observed-remove set
ORSet<String> setA = new ORSet<>();
setA.add("item1");
ORSet<String> setB = new ORSet<>();
setB.add("item2");
ORSet<String> mergedSet = setA.merge(setB);
System.out.println(mergedSet.contains("item1")); // true
System.out.println(mergedSet.contains("item2")); // true

🧠 Why CRDTs

CRDT = Conflict-free Replicated Data Type. A data structure with one mathematical superpower:

Any two copies can merge, in any order and at any time, and the result is always identical and always correct.

Three mathematical guarantees (proven, not hoped):

| Property | What it means |
| --- | --- |
| Commutative | merge(A, B) == merge(B, A): order doesn't matter |
| Associative | merge(merge(A, B), C) == merge(A, merge(B, C)): grouping doesn't matter |
| Idempotent | merge(A, A) == A: re-merging is safe |

This means: zero coordination, zero locks, zero conflicts.
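
The three properties can be checked directly on a small example. The sketch below models a grow-only counter as a plain map from node id to count and merges by taking the per-node maximum, which is the standard G-counter rule. `MergeLawsSketch` is a hypothetical name and the code does not use the library's `GCounter` class.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; independent of the library's GCounter.
final class MergeLawsSketch {

    // G-counter state: node id -> count. Merge keeps the larger count per node.
    static Map<String, Long> merge(Map<String, Long> a, Map<String, Long> b) {
        Map<String, Long> out = new HashMap<>(a);
        b.forEach((node, count) -> out.merge(node, count, Long::max));
        return out;
    }

    // The counter's value is the sum of all per-node counts.
    static long value(Map<String, Long> state) {
        return state.values().stream().mapToLong(Long::longValue).sum();
    }

    public static void main(String[] args) {
        Map<String, Long> a = Map.of("server-1", 100L);
        Map<String, Long> b = Map.of("server-2", 200L);
        Map<String, Long> c = Map.of("server-1", 150L, "server-3", 5L);

        System.out.println(merge(a, b).equals(merge(b, a)));                     // true: commutative
        System.out.println(merge(merge(a, b), c).equals(merge(a, merge(b, c)))); // true: associative
        System.out.println(merge(a, a).equals(a));                               // true: idempotent
        System.out.println(value(merge(a, b)));                                  // 300
    }
}
```

Taking the maximum rather than the sum is what makes re-merging safe: merging the same state twice leaves every per-node count unchanged.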

Built-in CRDT Types

| Type | Use Case | Example |
| --- | --- | --- |
| GCounter | Grow-only counters | Download counts, page views |
| PNCounter | Increment + decrement | Stock levels, balances |
| LWWRegister<T> | Single value (latest wins) | Name, email, status fields |
| ORSet<T> | Add/remove set | Tags, memberships, dedup sets |
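
PNCounter is the one type in this table without an example in the API Reference, so here is a sketch of the usual construction: two grow-only tallies, one for increments and one for decrements, each merged by per-node maximum. `PNCounterSketch` and its method names are hypothetical; the library's `PNCounter` may expose a different API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; not the library's PNCounter class.
final class PNCounterSketch {
    private final Map<String, Long> increments = new HashMap<>();
    private final Map<String, Long> decrements = new HashMap<>();

    void increment(String node, long amount) { add(increments, node, amount); }
    void decrement(String node, long amount) { add(decrements, node, amount); }

    // Each per-node tally only ever grows, which is what makes max-merge safe.
    private static void add(Map<String, Long> tally, String node, long amount) {
        if (amount < 0) {
            throw new IllegalArgumentException("amount must be >= 0");
        }
        tally.merge(node, amount, Long::sum);
    }

    long value() { return total(increments) - total(decrements); }

    private static long total(Map<String, Long> tally) {
        return tally.values().stream().mapToLong(Long::longValue).sum();
    }

    // Merges both halves independently by taking the per-node maximum.
    PNCounterSketch merge(PNCounterSketch other) {
        PNCounterSketch out = new PNCounterSketch();
        for (PNCounterSketch src : new PNCounterSketch[] {this, other}) {
            src.increments.forEach((n, c) -> out.increments.merge(n, c, Long::max));
            src.decrements.forEach((n, c) -> out.decrements.merge(n, c, Long::max));
        }
        return out;
    }

    public static void main(String[] args) {
        PNCounterSketch a = new PNCounterSketch();
        a.increment("warehouse-a", 50);
        a.decrement("warehouse-a", 8);

        PNCounterSketch b = new PNCounterSketch();
        b.increment("warehouse-b", 20);
        b.decrement("warehouse-b", 5);

        System.out.println(a.merge(b).value()); // 57 = (50 + 20) - (8 + 5)
        System.out.println(b.merge(a).value()); // 57, same result in either order
    }
}
```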

Features

  • Tabular Merge: Merge two lists of maps by primary key using CRDT LWW semantics
  • Deduplication: Exact and fuzzy dedup using Jaccard similarity on character bigrams
  • Structural Diff: See added, removed, and modified rows between two datasets
  • JSON Merge: Deep merge of nested JSON objects with conflict-free resolution
  • Core CRDTs: Production-ready GCounter, PNCounter, LWWRegister, ORSet
  • Zero config: One dependency (Gson), works with any Map/List data

🏗️ Use Cases

  • Spark pipelines: Merge partitioned outputs without a coordinator
  • Microservices: Each service maintains local state, merge on demand
  • Event sourcing: Merge event streams from multiple sources
  • Data lakes: Combine datasets from different teams/regions
  • Cache reconciliation: Merge divergent cache states after network partition

Requirements

  • Java 17+
  • Gson 2.10.1+ (included via Maven)

Building

mvn compile   # Compile
mvn test      # Run tests (79/79 passing)
mvn package   # Create JAR

License

Licensed under the Apache License, Version 2.0.

Contributing? By opening a pull request, you agree to our Contributor License Agreement.

Copyright 2026 Ryan Gillespie / Optitransfer. See NOTICE for attribution requirements.

For commercial licensing inquiries: rgillespie83@icloud.com, data@optitransfer.ch


Built with math, not hope. 🧬

⭐ Star on GitHub • 🤗 Try on HuggingFace • 📦 Maven Central