March 29, 2026

UnisonDB

Store, stream, and sync instantly — UnisonDB is a log-native, real-time database that replicates like a message bus for AI and Edge Computing.

What UnisonDB Is

UnisonDB is an open-source database designed specifically for Edge AI and Edge Computing.

It is a reactive, log-native and multi-model database built for real-time and edge-scale applications. UnisonDB combines a B+Tree storage engine with WAL-based (Write-Ahead Logging) replication over gRPC or S3-compatible blob/object storage, enabling near-instant fan-out replication across hundreds of nodes while preserving strong consistency and durability.

Replication Model

Writes are committed by a Raft quorum on the write servers (if enabled); read-only edge replicas and relayers can consume WAL through either a live gRPC stream or blob-backed replication using S3-compatible object storage.

Blob-backed replication changes the fan-out model:

The writer publishes WAL durably into object storage
Any number of readers can poll and catch up directly from the blob store
Teams already running S3 or MinIO do not need to maintain an always-on gRPC replication path for every replica

See cmd/examples/blobstore-minio for a local MinIO example.

Key Features

High Availability Writes: Raft consensus on write servers (quorum acks); relayer/replica use in-sync replica (ISR) replication
Streaming Replication: WAL replication over gRPC or S3-compatible blob/object storage
Blob Fan-Out: Publish WAL once into object storage and let N readers poll directly from S3/MinIO-backed replication stores
Multi-Modal Storage: Key-Value, Wide-Column, and Large Objects (LOB)
Real-Time Notifications: ZeroMQ-based (sidecar) change notifications with sub-millisecond latency
Durable: B+Tree storage with Write-Ahead Logging
Edge-First Design: Optimized for edge computing and local-first architectures
Namespace Isolation: Multi-tenancy support with namespace-based isolation

storage architecture

Use Cases

UnisonDB is built for distributed, edge-first systems where data and computation must live close together, reducing network hops, minimizing latency, and enabling real-time responsiveness at scale.

By co-locating data with the services that use it, UnisonDB removes the traditional boundary between the database and the application layer. Applications can react to local changes instantly, while UnisonDB’s WAL-based replication ensures eventual consistency across all replicas globally.

Fan-Out Scaling

UnisonDB can fan out updates to 100+ edge nodes in just a few milliseconds from a single upstream, and because it supports multi-hop relaying, that reach compounds naturally. Each hop carries the network + application latency of its link.

In a simple 2-hop topology:

Hop 1: Primary → 100 hubs (≈250–500ms)
Hop 2: Each hub → 100 downstream edge nodes (similar latency)
Total reach: 100 + 10,000 = 10,100 nodes
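
The arithmetic above generalizes to deeper topologies. A minimal sketch, assuming every node relays to the same number of children at every hop (`totalReach` is a name made up for this example, not part of UnisonDB):

```go
package main

import "fmt"

// totalReach returns how many nodes receive an update when each node
// relays to fanOut children for the given number of hops:
// fanOut + fanOut^2 + ... + fanOut^hops.
func totalReach(fanOut, hops int) int {
	total, level := 0, 1
	for i := 0; i < hops; i++ {
		level *= fanOut // nodes added at this hop
		total += level
	}
	return total
}

func main() {
	// 100 hubs at hop 1 plus 10,000 edge nodes at hop 2.
	fmt.Println(totalReach(100, 2))
}
```

With a fan-out of 100 and 2 hops this reproduces the 10,100 nodes from the example topology.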

Even at 60k–80k SET ops/sec with 1 KB values, UnisonDB can propagate those updates across 10,000+ nodes within seconds—without Kafka, Pub/Sub, CDC pipelines, or heavyweight brokers. (See the Relayer vs Latency benchmarks below for measured numbers.)

Quick Start

# Clone the repository
git clone https://github.com/ankur-anand/unisondb
cd unisondb

# Build
go build -o unisondb ./cmd/unisondb

# Run in server mode (primary)
./unisondb server --config config.toml

# Use the HTTP API
curl -X PUT http://localhost:4000/api/v1/default/kv/mykey \
  -H "Content-Type: application/json" \
  -d '{"value":"bXl2YWx1ZQ=="}'
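
The value in the curl example is the base64 encoding of `myvalue`. The same request can be built from Go. This is a sketch that only mirrors the curl call above: the `/api/v1/<namespace>/kv/<key>` layout is inferred from that single URL, and `putBody`, `encodePutBody`, and `newPutRequest` are hypothetical helpers, not part of any UnisonDB client.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/url"
)

// putBody mirrors the JSON payload from the curl example. encoding/json
// encodes a []byte field as a standard base64 string.
type putBody struct {
	Value []byte `json:"value"`
}

// encodePutBody returns the JSON document for a raw value.
func encodePutBody(value []byte) ([]byte, error) {
	return json.Marshal(putBody{Value: value})
}

// newPutRequest builds, but does not send, the PUT request.
// The path layout is an assumption inferred from the curl example.
func newPutRequest(baseURL, namespace, key string, value []byte) (*http.Request, error) {
	body, err := encodePutBody(value)
	if err != nil {
		return nil, err
	}
	endpoint := fmt.Sprintf("%s/api/v1/%s/kv/%s",
		baseURL, url.PathEscape(namespace), url.PathEscape(key))
	req, err := http.NewRequest(http.MethodPut, endpoint, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	body, err := encodePutBody([]byte("myvalue"))
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))

	req, err := newPutRequest("http://localhost:4000", "default", "mykey", []byte("myvalue"))
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.String())
}
```

Passing the request to `http.DefaultClient.Do` would send it to a running server.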

Documentation

UnisonDB implements a pluggable storage backend architecture supporting two B+Tree implementations:

BoltDB: Single-file, ACID-compliant B+Tree.
LMDB: Memory-mapped, ACID-compliant B+Tree with copy-on-write semantics.

Redis-Compatible Benchmark: UnisonDB vs BadgerDB vs BoltDB vs LMDB

This benchmark compares the write and read performance of four databases (UnisonDB, BadgerDB, LMDB, and BoltDB) using a Redis-compatible interface and the official redis-benchmark tool.

What We Measured

Throughput: Requests per second for SET (write) and GET (read) operations
Latency: p50 latency in milliseconds
Workload: 50 iterations of mixed SET and GET operations (200k ops per run)
Concurrency: 10 parallel clients, 10 pipelined requests, 4 threads
Payload Size: 1KB

Test Environment

Chip: Apple M2 Pro
Total Number of Cores: 10 (6 performance and 4 efficiency)
Memory: 16 GB
UnisonDB B-Tree backend: LMDB

All four databases were tested under identical conditions to highlight differences in write path efficiency, read performance, and I/O characteristics. The Redis-compatible server implementation can be found in internal/benchtests/cmd/redis-server/.

Results

UnisonDB, BadgerDB, BoltDB, LMDB Comparison

Performance Testing: Local Replication

Test Setup

We validated the WAL-based replication architecture using the pkg/replicator component. The test uses the same Redis-compatible benchmark tool, but this time the server is started with n = [100, 200, 500, 750, 1000] goroutines, each acting as an independent WAL reader, and captures the following performance metrics:

Physical Latency Tracking: Measures p50, p90, p99, and max latencies vs. number of relayers.
SET, GET latency vs. number of relayers
SET, GET throughput vs. number of relayers

Results

Test Replication Flow

Why UnisonDB

Traditional databases persist. Stream systems propagate. UnisonDB does both — turning every write into a durable, queryable stream that replicates seamlessly across the edge.

The Problem: Storage and Streaming Live in Different Worlds

Modern systems are reactive — every change needs to propagate instantly to dashboards, APIs, caches, and edge devices.
Yet, databases were built for persistence, not propagation.

You write to a database, then stream through Kafka.
You replicate via CDC.
You patch syncs between cache and storage.

This split between state and stream creates friction:

Two systems to maintain and monitor
Eventual consistency between write path and read path
Network latency on every read or update
Complex fan-out when scaling to hundreds of edges

The Gap

LMDB and BoltDB excel at local speed — but stop at one node.
etcd and Consul replicate state — but are consensus-bound and small-cluster only.
Kafka and NATS stream messages — but aren’t queryable databases.

System	Strength	Limitation
LMDB / BoltDB	Fast local storage	No replication
etcd / Consul	Cluster consistency	No local queries, low fan-out
Kafka / NATS	Scalable streams	No storage or query model

The Solution: Log-Native by Design

UnisonDB fuses database semantics with streaming mechanics — the log is the database.
Every write is durable, ordered, and instantly available as a replication stream.

No CDC, no brokers, no external pipelines.
Just one unified engine that:

Stores data in B+Trees for predictable reads
Streams data via WAL replication to thousands of nodes
Reacts instantly with sub-second fan-out
Keeps local replicas fully queryable, even offline

UnisonDB eliminates the divide between “database” and “message bus,”
enabling reactive, distributed, and local-first systems — without the operational sprawl.

UnisonDB collapses two worlds — storage and streaming — into one unified log-native core.
The result: a single system that stores, replicates, and reacts — instantly.

Core Architecture

UnisonDB is built on three foundational layers:

WALFS - Write-Ahead Log File System (mmap-based, optimized for reading at scale).
Engine - Hybrid storage combining WAL, MemTable, and B-Tree
Replication - WAL-based streaming with offset tracking

The Layered View

UnisonDB stacks a multi-model engine on top of WALFS — a log-native core that unifies storage, replication, and streaming into one continuous data flow.

+-----------------------------------------------------------+
|                Multi-Model API Layer                      |
|       (KV, Wide-Column, LOB, Txn Engine, Query)           |
+-----------------------------------------------------------+
|                   Engine Layer                            |
|   WALFS-backed MemTable + B-Tree Store                    |
|   (writes → WALFS, reads → B-Tree + MemTable)             |
+-----------------------------------------------------------+
|          WALFS (Core Log)          |  Replication Layer   |
|  Append-only, mmap-based           |  WAL-based streaming |
|  segmented log                     |  (followers tail WAL)|
|  Commit-ordered, replication-safe  |  Offset tracking,    |
|                                    |  catch-up, tailing   |
+-----------------------------------------------------------+
|                       Disk                                |
+-----------------------------------------------------------+

1. WALFS (Write-Ahead Log)

Overview

WALFS is a memory-mapped, segmented write-ahead log implementation designed for both writing and reading at scale. Unlike traditional WALs that optimize only for sequential writes, WALFS provides efficient random access for replication and real-time tailing.

Segment Structure

Each WALFS segment consists of two regions:

+----------------------+-----------------------------+-------------+
|   Segment Header     |        Record 1             |  Record 2   |
|     (64 bytes)       |  Header + Data + Trailer    |     ...     |
+----------------------+-----------------------------+-------------+

Segment Header (64 bytes)

Offset	Size	Field	Description
0	4	Magic	Magic number (`0x5557414C`)
4	4	Version	Metadata format version
8	8	CreatedAt	Creation timestamp (nanoseconds)
16	8	LastModifiedAt	Last modification timestamp (nanoseconds)
24	8	WriteOffset	Offset where next chunk will be written
32	8	EntryCount	Total number of chunks written
40	4	Flags	Segment state flags (e.g. Active, Sealed)
44	12	Reserved	Reserved for future use
56	4	CRC	CRC32 checksum of first 56 bytes
60	4	Padding	Ensures 64-byte alignment
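
The layout above maps directly onto a fixed 64-byte buffer. The sketch below encodes and validates such a header. The table does not state the byte order or the CRC32 polynomial, so little-endian and the IEEE polynomial are assumptions here, and `segmentHeader`, `encodeHeader`, and `decodeHeader` are illustrative names rather than the WALFS API.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"hash/crc32"
)

const (
	headerSize  = 64
	headerMagic = 0x5557414C // bytes 'U' 'W' 'A' 'L' read as hex
)

// segmentHeader holds the variable fields of the 64-byte header.
type segmentHeader struct {
	Version        uint32
	CreatedAt      int64
	LastModifiedAt int64
	WriteOffset    int64
	EntryCount     int64
	Flags          uint32
}

// encodeHeader writes the fields at the offsets from the table.
// Byte order (little-endian) and CRC polynomial (IEEE) are assumed.
func encodeHeader(h segmentHeader) []byte {
	buf := make([]byte, headerSize)
	le := binary.LittleEndian
	le.PutUint32(buf[0:], headerMagic)
	le.PutUint32(buf[4:], h.Version)
	le.PutUint64(buf[8:], uint64(h.CreatedAt))
	le.PutUint64(buf[16:], uint64(h.LastModifiedAt))
	le.PutUint64(buf[24:], uint64(h.WriteOffset))
	le.PutUint64(buf[32:], uint64(h.EntryCount))
	le.PutUint32(buf[40:], h.Flags)
	// Bytes 44..55 are reserved and stay zero.
	le.PutUint32(buf[56:], crc32.ChecksumIEEE(buf[:56]))
	// Bytes 60..63 are padding and stay zero.
	return buf
}

// decodeHeader validates magic and checksum before trusting any field.
func decodeHeader(buf []byte) (segmentHeader, error) {
	var h segmentHeader
	if len(buf) < headerSize {
		return h, errors.New("header too short")
	}
	le := binary.LittleEndian
	if le.Uint32(buf[0:]) != headerMagic {
		return h, errors.New("bad magic number")
	}
	if le.Uint32(buf[56:]) != crc32.ChecksumIEEE(buf[:56]) {
		return h, errors.New("header checksum mismatch")
	}
	h.Version = le.Uint32(buf[4:])
	h.CreatedAt = int64(le.Uint64(buf[8:]))
	h.LastModifiedAt = int64(le.Uint64(buf[16:]))
	h.WriteOffset = int64(le.Uint64(buf[24:]))
	h.EntryCount = int64(le.Uint64(buf[32:]))
	h.Flags = le.Uint32(buf[40:])
	return h, nil
}

func main() {
	raw := encodeHeader(segmentHeader{Version: 1, WriteOffset: 64, Flags: 1})
	h, err := decodeHeader(raw)
	fmt.Println(len(raw), h.WriteOffset, err)
}
```

Checking the CRC before reading `WriteOffset` lets a reader reject a torn or corrupted header instead of seeking to a bogus position.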

Record Format (8-byte aligned)

Each record is written in its own aligned frame:

Offset	Size	Field	Description
0	4 bytes	CRC	CRC32 of `[Length | Data]`
4	4 bytes	Length	Size of the data payload in bytes
8	N bytes	Data	User payload (FlatBuffer-encoded LogRecord)
8 + N	8 bytes	Trailer	Canary marker (`0xDEADBEEFFEEDFACE`)
...	≥0 bytes	Padding	Zero padding to align to 8-byte boundary
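
A frame with this layout can be built and checked in a few lines. This is a sketch under stated assumptions: little-endian byte order and the IEEE CRC32 polynomial are not specified in the table, the trailer is taken to be the 16-hex-digit value `0xDEADBEEFFEEDFACE` (the most an 8-byte field can hold), and `frameRecord` and `verifyRecord` are hypothetical helpers, not WALFS functions.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/crc32"
)

const (
	recordHeaderSize = 8 // CRC (4 bytes) + Length (4 bytes)
	trailerSize      = 8
	trailerMarker    = uint64(0xDEADBEEFFEEDFACE) // assumed 8-byte canary
)

// frameRecord lays out CRC | Length | Data | Trailer and zero-pads the
// result to the next 8-byte boundary.
func frameRecord(data []byte) []byte {
	raw := recordHeaderSize + len(data) + trailerSize
	padded := (raw + 7) &^ 7 // round up to a multiple of 8
	buf := make([]byte, padded)
	le := binary.LittleEndian
	le.PutUint32(buf[4:], uint32(len(data)))
	copy(buf[8:], data)
	// The CRC covers the Length field followed by the payload.
	le.PutUint32(buf[0:], crc32.ChecksumIEEE(buf[4:8+len(data)]))
	le.PutUint64(buf[8+len(data):], trailerMarker)
	return buf
}

// verifyRecord checks length, CRC, and trailer, then returns the payload
// as a sub-slice of the frame (no copy).
func verifyRecord(frame []byte) ([]byte, bool) {
	if len(frame) < recordHeaderSize+trailerSize {
		return nil, false
	}
	le := binary.LittleEndian
	n := int(le.Uint32(frame[4:]))
	if n > len(frame)-recordHeaderSize-trailerSize {
		return nil, false
	}
	if le.Uint32(frame[0:]) != crc32.ChecksumIEEE(frame[4:8+n]) {
		return nil, false
	}
	if le.Uint64(frame[8+n:]) != trailerMarker {
		return nil, false
	}
	return frame[8 : 8+n], true
}

func main() {
	frame := frameRecord([]byte("hello"))
	payload, ok := verifyRecord(frame)
	fmt.Println(len(frame), string(payload), ok)
}
```

A 5-byte payload needs 8 + 5 + 8 = 21 bytes, so the frame is padded to 24. The trailer lets a reader tell a fully written record from one that was cut short.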

WALFS Reader Capabilities

WALFS provides powerful reading capabilities essential for replication and recovery:

1. Forward-Only Iterator

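reader := walLog.NewReader()
defer reader.Close()

for {
    data, pos, err := reader.Next()
    if errors.Is(err, io.EOF) {
        break // reached the end of the log
    }
    if err != nil {
        return err // any other error aborts the scan
    }
    // Process record: data is the payload, pos is its (SegmentID, Offset)
}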

Zero-copy reads - data is a memory-mapped slice
Position tracking - each record returns its (SegmentID, Offset) position
Automatic segment traversal - seamlessly reads across segment boundaries

2. Offset-Based Reads

// Read from a specific offset (for replication catch-up)
offset := Offset{SegmentID: 5, Offset: 1024}
reader, err := walLog.NewReaderWithStart(&offset)

Use cases:

Efficient seek without scanning
Follower catch-up from last synced position
Recovery from checkpoint

3. Active Tail Following

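// For real-time replication (tailing active WAL)
reader, err := walLog.NewReaderWithTail(&offset)
if err != nil {
    return err
}

for {
    data, pos, err := reader.Next()
    if errors.Is(err, ErrNoNewData) {
        // Caught up with the writer: no new data yet, can retry or wait
        continue
    }
    if err != nil {
        return err // any other error ends the tail
    }
    // Process record: data is the payload, pos is its (SegmentID, Offset)
}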

Behavior:

Returns ErrNoNewData when caught up (not io.EOF)
Enables low-latency streaming
Supports multiple parallel readers

Why WALFS is Different

Unlike traditional "write-once, read-on-crash" WALs, WALFS optimizes for:

Continuous replication - Followers constantly read from primary's WAL
Real-time tailing - Low-latency streaming of new writes
Parallel readers - Multiple replicas read concurrently without contention

2. Engine (dbkernel)

Overview

The Engine orchestrates writes, reads, and persistence using three components:

WAL (WALFS) - Durability and replication source
MemTable (SkipList) - In-memory write buffer
B-Tree Store - Persistent index for efficient reads

Flow Diagram

FlatBuffer Schema

UnisonDB uses FlatBuffers for zero-copy serialization of WAL records:

Benefits:

No deserialization on replicas
Fast replication

Why FlatBuffers?

Replication efficiency - No deserialization needed on replicas

Transaction Support

UnisonDB provides atomic multi-key transactions:

txn := engine.BeginTxn()
txn.Put("k1", value1)
txn.Put("k2", value2)
txn.Put("k3", value3)
txn.Commit() // All or nothing

Flow

Transaction Properties:

Atomicity - All writes become visible on commit, or none on abort
Isolation - Uncommitted writes are hidden from readers

LOB (Large Object) Support

Large values can be chunked and streamed using TXN.

Flow

LOB Properties:

Transactional - All chunks committed atomically
Streaming - Can write/read chunks incrementally
Efficient replication - Replicas get chunks as they arrive
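
Chunking itself is simple slicing. A minimal sketch, assuming fixed-size chunks (the README does not state a chunk size, and `chunk` and `reassemble` are illustrative helpers rather than UnisonDB functions):

```go
package main

import (
	"bytes"
	"fmt"
)

// chunk splits value into pieces of at most size bytes. The pieces are
// sub-slices of value, so no payload bytes are copied.
func chunk(value []byte, size int) [][]byte {
	if size <= 0 {
		return nil
	}
	var chunks [][]byte
	for len(value) > 0 {
		n := size
		if len(value) < n {
			n = len(value)
		}
		chunks = append(chunks, value[:n])
		value = value[n:]
	}
	return chunks
}

// reassemble concatenates chunks back into a single value.
func reassemble(chunks [][]byte) []byte {
	return bytes.Join(chunks, nil)
}

func main() {
	parts := chunk([]byte("0123456789"), 4)
	fmt.Println(len(parts), string(reassemble(parts)))
}
```

In a transactional write, each piece would be appended inside the transaction and the whole object would become visible only on commit.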

Wide-Column Support

UnisonDB supports partial updates to column families:

Benefits:

Efficient updates - Only modified columns are written/replicated
Flexible schema - Columns can be added dynamically
Merge semantics - New columns merged with existing row
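
The merge semantics can be pictured as a map overlay: columns in the update replace or extend the existing row, and untouched columns are kept. A sketch, assuming a row is a plain column-name-to-value map (`mergeColumns` is a hypothetical helper, not the engine's API):

```go
package main

import "fmt"

// mergeColumns returns a new row containing every column of row, overlaid
// with the columns in update. Neither input map is modified.
func mergeColumns(row, update map[string][]byte) map[string][]byte {
	merged := make(map[string][]byte, len(row)+len(update))
	for col, val := range row {
		merged[col] = val
	}
	for col, val := range update {
		merged[col] = val // overwrite an existing column or add a new one
	}
	return merged
}

func main() {
	row := map[string][]byte{"name": []byte("alice"), "city": []byte("pune")}
	update := map[string][]byte{"city": []byte("delhi"), "zip": []byte("110001")}

	merged := mergeColumns(row, update)
	fmt.Println(len(merged), string(merged["city"]), string(merged["name"]))
}
```

Only the `update` map would need to be written and replicated. The replica applies the same overlay to its local copy of the row.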

3. Replication Architecture

Overview

Replication in UnisonDB is WAL-based streaming - designed around the WALFS reader capabilities. Followers continuously stream WAL records from the primary's WALFS and apply them locally.

Design Principles

Offset-based positioning - Followers track their replication offset (SegmentID, Offset)
Catch-up from any offset - Can resume replication from any position
Real-time streaming - Active tail following for low-latency replication
Self-describing records - FlatBuffer LogRecords are self-contained
Batched streaming - Records sent in batches for efficiency
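
Taken together, these principles describe a loop in which the follower asks for records after its last applied position, applies them in batches, and advances its offset. The sketch below models that with an in-memory log. The field types of `Offset` are assumed, and `record`, `nextBatch`, and `follower` are hypothetical names that do not come from pkg/replicator.

```go
package main

import "fmt"

// Offset mirrors the (SegmentID, Offset) pair; the field types are assumed.
type Offset struct {
	SegmentID uint32
	Offset    int64
}

// after reports whether o is positioned strictly after p.
func (o Offset) after(p Offset) bool {
	if o.SegmentID != p.SegmentID {
		return o.SegmentID > p.SegmentID
	}
	return o.Offset > p.Offset
}

// record is one log entry together with its position.
type record struct {
	Pos  Offset
	Data string
}

// nextBatch returns up to limit records positioned after from. A linear
// scan keeps the sketch short; a real log would seek directly.
func nextBatch(log []record, from Offset, limit int) []record {
	var batch []record
	for _, r := range log {
		if len(batch) == limit {
			break
		}
		if r.Pos.after(from) {
			batch = append(batch, r)
		}
	}
	return batch
}

// follower tracks its own replication offset independently of the primary.
type follower struct {
	applied Offset
	data    []string
}

// catchUp pulls batches until no records remain after the applied offset.
func (f *follower) catchUp(log []record, batchSize int) {
	for {
		batch := nextBatch(log, f.applied, batchSize)
		if len(batch) == 0 {
			return
		}
		for _, r := range batch {
			f.data = append(f.data, r.Data)
			f.applied = r.Pos // advance only after the record is applied
		}
	}
}

func main() {
	log := []record{
		{Offset{1, 64}, "a"}, {Offset{1, 96}, "b"},
		{Offset{2, 64}, "c"}, {Offset{2, 128}, "d"},
	}
	// This follower already applied everything up to segment 1, offset 96.
	f := &follower{applied: Offset{1, 96}}
	f.catchUp(log, 1)
	fmt.Println(f.data, f.applied)
}
```

Because the follower owns its offset, it can disconnect and later resume from exactly where it stopped without any per-follower state on the primary.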

Replication Flow

Offset-based positioning - Followers track (SegmentID, Offset) independently
Catch-up from any offset - Resume from any position
Real-time streaming - Active tail following for low latency

Why is Traditional KV Replication Insufficient?

Most traditional key-value stores were designed for simple, point-in-time key-value operations — and their replication models reflect that. While this works for basic use cases, it quickly breaks down under real-world demands like multi-key transactions, large object handling, and fine-grained updates.

Key-Level Replication Only

Replication is often limited to raw key-value pairs. There’s no understanding of higher-level constructs like rows, columns, or chunks — making it impossible to efficiently replicate partial updates or large structured objects.

No Transactional Consistency

Replication happens on a per-operation basis, not as part of an atomic unit. Without multi-key transactional guarantees, systems can fall into inconsistent states across replicas, especially during batch operations, network partitions, or mid-transaction failures.

UnisonDB addresses these gaps with:

Transactional, multi-key replication with commit visibility guarantees.
Chunked LOB writes that are fully atomic.
Column-aware replication for efficient syncing of wide-column updates.
Isolation by default — once a network-aware transaction is started, all intermediate writes are fully isolated and not visible to readers until a successful txn.Commit().
Built-in replication via gRPC WAL streaming + B-Tree snapshots.
Zero-compaction overhead, high write throughput, and optimized reads.

Development

make lint
make test

Certificate for localhost

brew install mkcert

## Install the local CA
mkcert -install

## Generate gRPC TLS certificates
## These certificates are valid for the hostnames/IPs localhost, 127.0.0.1, and ::1

mkcert -key-file grpc.key -cert-file grpc.crt localhost 127.0.0.1 ::1

License

Apache License, Version 2.0