VelesDB Architecture

June 14, 2026

This document describes the internal architecture of VelesDB.

Architecture Status Update (2026-02-26)

The VelesDB core architecture is hybrid by design:

  • Vector engine with 5 metrics (Cosine, Euclidean, DotProduct, Hamming, Jaccard) and SIMD acceleration.
  • Graph engine for nodes/edges/traversal inside collection runtime.
  • Multi-column engine (ColumnStore) for typed filtering and bitmap operations.
  • VelesQL control plane (parser/validation/planning/cache) orchestrating cross-domain execution paths.

High-Level Overview

┌─────────────────────────────────────────────────────────────────────────┐
│                           CLIENT LAYER                                   │
├─────────────────────────────────────────────────────────────────────────┤
│  TypeScript SDK │ Python SDK │ REST Client │ VelesQL CLI │ Mobile SDK  │
│  (@velesdb/sdk) │ (velesdb)  │ (curl/HTTP) │ (velesdb)   │ (iOS/Android)│
└───────┬─────────┴──────┬─────┴───────┬─────┴──────┬──────┴──────┬──────┘
         │                  │                 │                 │
         ▼                  ▼                 ▼                 ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                           API LAYER                                      │
├─────────────────────────────────────────────────────────────────────────┤
│   WASM Module     │   Python Bindings   │    REST Server    │   CLI    │
│  (velesdb-wasm)   │   (velesdb-python)  │  (velesdb-server) │  (REPL)  │
│                   │       PyO3          │      Axum         │          │
└────────┬──────────┴───────┬─────────────┴────────┬──────────┴────┬─────┘
         │                  │                      │               │
         ▼                  ▼                      ▼               ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                          CORE ENGINE                                     │
│                         (velesdb-core)                                   │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                          │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐  ┌─────────────┐ │
│  │   Database   │  │  Collection  │  │   VelesQL    │  │   Filter    │ │
│  │  Management  │  │  Operations  │  │   Parser     │  │   Engine    │ │
│  └──────┬───────┘  └──────┬───────┘  └──────┬───────┘  └──────┬──────┘ │
│         │                 │                 │                 │         │
│         ▼                 ▼                 ▼                 ▼         │
│  ┌─────────────────────────────────────────────────────────────────┐   │
│  │                       INDEX LAYER                                │   │
│  ├─────────────────────────────────────────────────────────────────┤   │
│  │   ┌─────────────┐    ┌─────────────┐    ┌─────────────────────┐ │   │
│  │   │  HNSW Index │    │ BM25 Index  │    │  ColumnStore Filter │ │   │
│  │   │  (ANN)      │    │ (Full-Text) │    │  (RoaringBitmap)    │ │   │
│  │   └──────┬──────┘    └──────┬──────┘    └──────────┬──────────┘ │   │
│  └──────────┼──────────────────┼─────────────────────┼─────────────┘   │
│             │                  │                     │                  │
│             ▼                  ▼                     ▼                  │
│  ┌─────────────────────────────────────────────────────────────────┐   │
│  │                     DISTANCE LAYER (SIMD)                        │   │
│  ├─────────────────────────────────────────────────────────────────┤   │
│  │  Cosine  │  Euclidean  │  Dot Product  │  Hamming  │  Jaccard   │   │
│  │  (33.1ns)│   (22.5ns)  │    (19.8ns)   │  (35.8ns) │   (35.1ns) │   │
│  │                                                                  │   │
│  │  AVX2/AVX-512 │ ARM64 NEON │ Scalar fallback (incl. WASM —    │   │
│  │               │ (simd_neon)│ SIMD128 planned)                 │   │
│  └─────────────────────────────────────────────────────────────────┘   │
│                                                                          │
└────────────────────────────────────┬────────────────────────────────────┘
                                     │
                                     ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                         STORAGE LAYER                                    │
├─────────────────────────────────────────────────────────────────────────┤
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐  ┌────────────────┐ │
│  │ Vector Data │  │   Payload   │  │     WAL     │  │  Binary Export │ │
│  │  (mmap)     │  │   Storage   │  │ (durability)│  │  (VELS format) │ │
│  └─────────────┘  └─────────────┘  └─────────────┘  └────────────────┘ │
│                                                                          │
│  File System / Memory / IndexedDB (WASM)                                │
└─────────────────────────────────────────────────────────────────────────┘

Component Details

1. Client Layer

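| Component      | Language     | Purpose                                    |
|----------------|--------------|--------------------------------------------|
| TypeScript SDK | TypeScript   | Unified client for browser/Node.js         |
| Python SDK     | Python       | Native bindings via PyO3                   |
| Mobile SDK     | Swift/Kotlin | Native iOS and Android bindings via UniFFI |
| REST Client    | Any          | HTTP API access                            |
| VelesQL CLI    | Rust         | Interactive query REPL                     |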

2. API Layer

velesdb-wasm

  • WebAssembly module for browser/Node.js
  • Scalar distance calculations (SIMD128 kernels planned)
  • IndexedDB persistence via binary export/import
  • ~430 KB gzipped (v1.18.0 npm artifact)

velesdb-server

  • Axum-based REST API server
  • OpenAPI/Swagger documentation
  • 48 REST endpoints (55 method+path operations)
  • Prometheus metrics served by default (GET /metrics)

velesdb-python

  • PyO3 bindings for Python
  • NumPy array support
  • Zero-copy when possible

velesdb-mobile

  • UniFFI bindings for iOS (Swift) and Android (Kotlin)
  • Thread-safe Arc-wrapped handles
  • StorageMode support (Full, SQ8, Binary) for IoT/Edge
  • Targets: aarch64-apple-ios, aarch64-linux-android, etc.

3. Core Engine (velesdb-core)

Database

  • Collection management via typed registries:
    • vector_collections: HashMap<String, VectorCollection>
    • graph_collections: HashMap<String, GraphCollection>
    • metadata_collections: HashMap<String, MetadataCollection>
  • Multi-collection support
  • Automatic persistence

Collection

  • Three typed collection variants: VectorCollection, GraphCollection, MetadataCollection
  • Point CRUD operations
  • Vector search (single & batch)
  • Text search (BM25)
  • Hybrid search (vector + text)

VelesQL Parser (v2.0)

  • SQL-like query language
  • ~1.3M queries/sec parsing
  • Bound parameter support
  • v2.0 Features:
    • GROUP BY / HAVING (AND/OR)
    • ORDER BY (multi-column, similarity)
    • JOIN with aliases
    • UNION / INTERSECT / EXCEPT
    • USING FUSION (hybrid search)
    • WITH (max_groups, group_limit)

Filter Engine

  • ColumnStore-based filtering (adaptive per-collection payload mirror in the SELECT ... WHERE path)
  • RoaringBitmap for set operations
  • Up to 130x faster than JSON filtering (filtering-API micro-benchmark)

Aggregation Engine (EPIC-017/018)

  • Streaming aggregation executor
  • Performance Optimizations:
    • process_batch() - SIMD-friendly vectorized aggregation
    • Parallel aggregation with Rayon (10K+ datasets)
    • Pre-computed hash for GROUP BY (vs JSON serialization)
    • String interning to avoid allocations in hot path
  • ~2x speedup on large aggregations
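
The streaming and pre-computed-hash ideas above can be illustrated with a minimal sketch. It is not the VelesDB executor: the `StreamingAggregator` type, the `process_batch` signature, and the `finish` helper are hypothetical, it uses only the standard library (no Rayon, no SIMD), and it ignores hash collisions between distinct group keys.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Running state for COUNT(*) and AVG(x) of one group.
#[derive(Default, Debug, Clone, PartialEq)]
pub struct GroupState {
    pub count: u64,
    pub sum: f64,
}

/// Streaming GROUP BY: rows are folded into per-group state as they arrive,
/// so memory grows with the number of groups, not the number of rows.
#[derive(Default)]
pub struct StreamingAggregator {
    // Keyed by a pre-computed hash of the group value (collisions are ignored
    // in this sketch); the original key is kept only for output.
    groups: HashMap<u64, (String, GroupState)>,
}

/// Hashes the group key once, instead of serializing it for every row.
fn hash_key(key: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    key.hash(&mut hasher);
    hasher.finish()
}

impl StreamingAggregator {
    /// Folds a batch of `(group_key, value)` rows into the running state.
    pub fn process_batch(&mut self, rows: &[(&str, f64)]) {
        for &(key, value) in rows {
            let entry = self
                .groups
                .entry(hash_key(key))
                .or_insert_with(|| (key.to_string(), GroupState::default()));
            entry.1.count += 1;
            entry.1.sum += value;
        }
    }

    /// Emits `(key, count, avg)` for groups with `count > min_count`
    /// (a HAVING COUNT(*) > n style filter), sorted by key.
    pub fn finish(self, min_count: u64) -> Vec<(String, u64, f64)> {
        let mut out: Vec<(String, u64, f64)> = self
            .groups
            .into_values()
            .filter(|(_, state)| state.count > min_count)
            .map(|(key, state)| (key, state.count, state.sum / state.count as f64))
            .collect();
        out.sort_by(|a, b| a.0.cmp(&b.0));
        out
    }
}
```

Batches can be fed in any order; only the per-group counters are retained between calls, which is what makes the executor streaming.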

4. Knowledge Graph Layer (EPIC-019)

┌─────────────────────────────────────────────────────────────────────────┐
│                      KNOWLEDGE GRAPH ENGINE                              │
├─────────────────────────────────────────────────────────────────────────┤
│  ┌──────────────────┐  ┌──────────────────┐  ┌────────────────────────┐ │
│  │   GraphSchema    │  │    GraphNode     │  │      GraphEdge         │ │
│  │  (labels, types) │  │ (id, properties) │  │ (src, tgt, label, props)│ │
│  └────────┬─────────┘  └────────┬─────────┘  └───────────┬────────────┘ │
│           │                     │                        │              │
│           ▼                     ▼                        ▼              │
│  ┌─────────────────────────────────────────────────────────────────┐   │
│  │                    ConcurrentEdgeStore                           │   │
│  │  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────────┐  │   │
│  │  │  256 Shards │  │  edge_ids   │  │    Label Indices        │  │   │
│  │  │ (RwLock<    │  │  HashMap    │  │  by_label, outgoing_    │  │   │
│  │  │  EdgeStore>)│  │ (edge→src)  │  │  by_label               │  │   │
│  │  └──────┬──────┘  └──────┬──────┘  └───────────┬─────────────┘  │   │
│  │         │                │                     │                 │   │
│  │         ▼                ▼                     ▼                 │   │
│  │  ┌──────────────────────────────────────────────────────────┐   │   │
│  │  │  Optimized Operations:                                    │   │   │
│  │  │  • add_edge: O(1) with cross-shard dual-insert           │   │   │
│  │  │  • remove_edge: O(1) 2-shard lookup (not 256)            │   │   │
│  │  │  • get_edges_by_label: O(k) via label index              │   │   │
│  │  └──────────────────────────────────────────────────────────┘   │   │
│  └─────────────────────────────────────────────────────────────────┘   │
│                                                                          │
│  ┌──────────────────┐  ┌──────────────────┐  ┌────────────────────────┐ │
│  │   LabelTable     │  │   BfsIterator    │  │    GraphMetrics        │ │
│  │ String interning │  │ Streaming BFS    │  │  LatencyHistogram      │ │
│  │  LabelId (u32)   │  │ memory-bounded   │  │  node/edge counters    │ │
│  └──────────────────┘  └──────────────────┘  └────────────────────────┘ │
└─────────────────────────────────────────────────────────────────────────┘
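
A minimal sketch of the sharding idea in the diagram, under stated assumptions: the `ShardedEdgeStore` and `Edge` types and the modulo shard function are hypothetical, label indices are omitted, and adjacency lists are plain `Vec`s, so removal here costs O(degree) within the two owning shards rather than the O(1) quoted for the real store.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

/// Hypothetical edge record; field names are illustrative only.
#[derive(Clone, Debug, PartialEq)]
pub struct Edge {
    pub id: u64,
    pub source: u64,
    pub target: u64,
    pub label: String,
}

/// One shard holds adjacency lists for the nodes that hash to it.
#[derive(Default)]
struct Shard {
    outgoing: HashMap<u64, Vec<Edge>>,
    incoming: HashMap<u64, Vec<Edge>>,
}

pub struct ShardedEdgeStore {
    shards: Vec<RwLock<Shard>>,
    /// edge id -> (source, target), so removal knows which two shards to touch.
    edge_ids: RwLock<HashMap<u64, (u64, u64)>>,
}

impl ShardedEdgeStore {
    pub fn new(num_shards: usize) -> Self {
        assert!(num_shards > 0);
        Self {
            shards: (0..num_shards).map(|_| RwLock::new(Shard::default())).collect(),
            edge_ids: RwLock::new(HashMap::new()),
        }
    }

    fn shard_of(&self, node: u64) -> usize {
        (node % self.shards.len() as u64) as usize
    }

    /// Dual insert: the edge is stored in the source's shard and the target's shard.
    /// Each lock is released before the next one is taken.
    pub fn add_edge(&self, edge: Edge) {
        self.edge_ids.write().unwrap().insert(edge.id, (edge.source, edge.target));
        let (s, t) = (self.shard_of(edge.source), self.shard_of(edge.target));
        self.shards[s].write().unwrap().outgoing.entry(edge.source).or_default().push(edge.clone());
        self.shards[t].write().unwrap().incoming.entry(edge.target).or_default().push(edge);
    }

    /// Finds the two owning shards via `edge_ids` instead of scanning every shard.
    pub fn remove_edge(&self, id: u64) -> bool {
        let Some((src, tgt)) = self.edge_ids.write().unwrap().remove(&id) else {
            return false;
        };
        let (s, t) = (self.shard_of(src), self.shard_of(tgt));
        if let Some(edges) = self.shards[s].write().unwrap().outgoing.get_mut(&src) {
            edges.retain(|e| e.id != id);
        }
        if let Some(edges) = self.shards[t].write().unwrap().incoming.get_mut(&tgt) {
            edges.retain(|e| e.id != id);
        }
        true
    }

    /// Returns a copy of the outgoing edges of `node`.
    pub fn outgoing(&self, node: u64) -> Vec<Edge> {
        let shard = self.shards[self.shard_of(node)].read().unwrap();
        shard.outgoing.get(&node).cloned().unwrap_or_default()
    }
}
```

The `edge_ids` map is what keeps removal from touching all shards: only the source shard and the target shard are locked.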

Scalability (10M+ edges):

  • Adaptive sharding: 1-512 shards based on graph size
  • 2-shard removal: O(1) instead of O(256) lock acquisitions
  • Label indices: O(k) edge lookup by relationship type
  • String interning: ~60% memory reduction for labels
  • CsrSnapshot: Zero-copy CSR (Compressed Sparse Row) snapshot for cache-friendly BFS/DFS traversal. Built on-demand after load or build_read_snapshot(), auto-invalidated by writes. Returns neighbor IDs as contiguous &[u64] slices instead of per-shard edge lookups.
  • Parent-pointer BFS: BFS/DFS uses a FxHashMap parent-pointer map instead of cloning path vectors at every edge expansion. Paths are reconstructed on-demand via reconstruct_path() only when emitting results. Uses FxHashSet visited sets (via rustc_hash) for faster hashing than std::HashSet.
  • Parallel BFS: Multi-source BFS traversal (traverse_bfs_parallel) launches concurrent BFS from multiple source nodes with deduplication by path signature. Available across all components: server REST API, Python bindings (GIL-released), Mobile (UniFFI), Tauri plugin, and TypeScript SDK.
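
The CSR snapshot and parent-pointer BFS described above can be sketched as follows. This is an illustration, not the VelesDB code: the `Csr` type and `bfs_path` function are hypothetical, node ids are assumed to be dense indices `0..num_nodes`, and `std::collections::HashMap`/`HashSet` stand in for `FxHashMap`/`FxHashSet` to keep the example dependency-free. Only the name `reconstruct_path` comes from the text.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Compressed Sparse Row adjacency: neighbors of node `i` are
/// `targets[offsets[i]..offsets[i + 1]]`.
pub struct Csr {
    offsets: Vec<usize>,
    targets: Vec<u64>,
}

impl Csr {
    /// Builds a CSR from an edge list over nodes `0..num_nodes`.
    pub fn from_edges(num_nodes: usize, edges: &[(u64, u64)]) -> Self {
        // Count out-degrees, then turn the counts into start offsets.
        let mut offsets = vec![0usize; num_nodes + 1];
        for &(src, _) in edges {
            offsets[src as usize + 1] += 1;
        }
        for i in 0..num_nodes {
            offsets[i + 1] += offsets[i];
        }
        // Scatter targets into their node's slot range.
        let mut cursor = offsets.clone();
        let mut targets = vec![0u64; edges.len()];
        for &(src, tgt) in edges {
            targets[cursor[src as usize]] = tgt;
            cursor[src as usize] += 1;
        }
        Self { offsets, targets }
    }

    /// Neighbors as one contiguous slice, with no per-edge allocation.
    pub fn neighbors(&self, node: u64) -> &[u64] {
        let i = node as usize;
        &self.targets[self.offsets[i]..self.offsets[i + 1]]
    }
}

/// Walks parent pointers back from `node` to the BFS root.
pub fn reconstruct_path(parents: &HashMap<u64, u64>, mut node: u64) -> Vec<u64> {
    let mut path = vec![node];
    while let Some(&parent) = parents.get(&node) {
        path.push(parent);
        node = parent;
    }
    path.reverse();
    path
}

/// BFS that records one parent per visited node instead of cloning a path
/// vector at every edge expansion; the path is built only for the result.
pub fn bfs_path(csr: &Csr, start: u64, goal: u64) -> Option<Vec<u64>> {
    let mut parents: HashMap<u64, u64> = HashMap::new();
    let mut visited: HashSet<u64> = HashSet::from([start]);
    let mut queue = VecDeque::from([start]);
    while let Some(node) = queue.pop_front() {
        if node == goal {
            return Some(reconstruct_path(&parents, node));
        }
        for &next in csr.neighbors(node) {
            if visited.insert(next) {
                parents.insert(next, node);
                queue.push_back(next);
            }
        }
    }
    None
}
```

Memory per visited node is one map entry, regardless of how long the path to it is.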

5. Index Layer

HNSW Index

                    Entry Point (Layer L)
                          │
            ┌─────────────┼─────────────┐
            ▼             ▼             ▼
         Node A ─────── Node B ─────── Node C   (Layer L-1)
            │             │             │
    ┌───────┼───────┐     │     ┌───────┼───────┐
    ▼       ▼       ▼     ▼     ▼       ▼       ▼
   ...     ...     ...   ...   ...     ...     ... (Layer 0)

  • Parameters:
    • M: Max connections per node (default: 24-32, auto-tuned by dimension)
    • ef_construction: Build-time search width (default: 300-400, auto-tuned by dimension)
    • ef_search: Query-time search width (default: 160, Balanced mode). An explicit WITH (ef_search = N) is passed through as the requested budget (clamped to at least k, and still subject to the standard dataset-size scaling), instead of being snapped to a coarse named profile. (Updated 2026-06-14.)
  • Features:
    • Thread-safe parallel insertions with lock-free CAS entry-point promotion
    • Graduated ef_construction (3-phase VAMANA/DiskANN schedule for batches >= 1000)
    • Pre-allocated vector storage (reserve + bulk push to minimize lock contention)
    • Automatic level assignment
    • Persistent storage with WAL recovery
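
Two of the items above lend themselves to a short sketch. `assign_level` uses the level rule from the original HNSW paper, floor(-ln(u) * mL) with mL = 1 / ln(M); whether VelesDB uses exactly this rule is an assumption. `effective_ef_search` models only the "clamped to at least k" part of the ef_search description and deliberately leaves out the dataset-size scaling, whose formula is not given here. Both function names are hypothetical.

```rust
/// Maps a uniform sample `u` in (0, 1] to an HNSW layer using
/// `floor(-ln(u) * mL)` with `mL = 1 / ln(M)`. Most samples land on layer 0;
/// each higher layer is roughly M times less likely than the one below it.
pub fn assign_level(u: f64, m: usize) -> usize {
    assert!(u > 0.0 && u <= 1.0, "u must be in (0, 1]");
    assert!(m >= 2, "M must be at least 2");
    let ml = 1.0 / (m as f64).ln();
    (-u.ln() * ml).floor() as usize
}

/// Clamps an explicit `ef_search` request so it is never below `k`.
/// Dataset-size scaling is intentionally not modelled in this sketch.
pub fn effective_ef_search(requested: usize, k: usize) -> usize {
    requested.max(k)
}
```

In practice `u` would come from a random number generator; it is passed in here so the function stays deterministic and easy to test.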

BM25 Index

  • Term frequency / inverse document frequency
  • Tokenization with stopword removal
  • Persistent storage

ColumnStore

  • Columnar storage for typed metadata
  • String interning for efficient comparisons
  • RoaringBitmap for fast set operations
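
To make the interplay of string interning and bitmap set operations concrete, here is a minimal sketch. It assumes nothing about the real ColumnStore API: `StringColumn` and `Bitset` are hypothetical, and a plain fixed-size bitset stands in for RoaringBitmap so the example needs only the standard library.

```rust
use std::collections::HashMap;

/// Plain fixed-size bitset standing in for a RoaringBitmap in this sketch.
#[derive(Clone, Debug, PartialEq)]
pub struct Bitset {
    words: Vec<u64>,
}

impl Bitset {
    pub fn new(len: usize) -> Self {
        Self { words: vec![0; (len + 63) / 64] }
    }

    pub fn set(&mut self, i: usize) {
        self.words[i / 64] |= 1u64 << (i % 64);
    }

    /// Row-set intersection: one AND per 64 rows.
    pub fn and(&self, other: &Bitset) -> Bitset {
        let words = self.words.iter().zip(&other.words).map(|(a, b)| a & b).collect();
        Bitset { words }
    }

    /// Row ids of all set bits, in ascending order.
    pub fn rows(&self) -> Vec<usize> {
        let mut out = Vec::new();
        for (w, &word) in self.words.iter().enumerate() {
            let mut bits = word;
            while bits != 0 {
                out.push(w * 64 + bits.trailing_zeros() as usize);
                bits &= bits - 1; // clear the lowest set bit
            }
        }
        out
    }
}

/// A string column stored as interned ids, so equality is an integer compare.
#[derive(Default)]
pub struct StringColumn {
    ids: HashMap<String, u32>,
    values: Vec<u32>, // one interned id per row
}

impl StringColumn {
    pub fn push(&mut self, value: &str) {
        let next = self.ids.len() as u32;
        let id = *self.ids.entry(value.to_string()).or_insert(next);
        self.values.push(id);
    }

    /// Bitmap of rows whose value equals `value` (empty if never interned).
    pub fn eq_bitmap(&self, value: &str) -> Bitset {
        let mut bits = Bitset::new(self.values.len());
        if let Some(&id) = self.ids.get(value) {
            for (row, &v) in self.values.iter().enumerate() {
                if v == id {
                    bits.set(row);
                }
            }
        }
        bits
    }
}
```

A conjunction such as `category = 'book' AND status = 'active'` then becomes `category.eq_bitmap("book").and(&status.eq_bitmap("active"))`, with no string comparison per row.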

6. Distance Layer (SIMD)

| Metric      | Implementation                 | Latency (768D) |
|-------------|--------------------------------|----------------|
| Dot Product | AVX2 FMA                       | 21.7 ns        |
| Euclidean   | AVX2 FMA                       | 26.0 ns        |
| Cosine      | AVX2 4-acc, single-sqrt finish | 33.1 ns        |
| Hamming     | AVX2 FP-domain 4-acc           | 35.8 ns        |
| Jaccard     | AVX-512 4-acc                  | 35.1 ns        |

Per-metric numbers above are the contract values in docs/reference/promise-contract.json. Raw micro-benchmark snapshots (March 27, 2026 run on a specific machine) live in SIMD_PERFORMANCE.md and may differ by ~10% due to methodology and cache state.

SIMD Strategy:

  1. Native (x86_64): AVX2/AVX-512 via core::arch intrinsics with 4-accumulator ILP
  2. Native (aarch64): NEON 128-bit with 1-acc/4-acc variants
  3. WASM: scalar fallback (SIMD128 kernels planned; wasm32 dispatches to SimdLevel::Scalar)
  4. Fallback: Scalar with loop unrolling

7. Storage Layer

Vector Data

  • Memory-mapped files for large datasets
  • Contiguous f32 buffer for cache locality
  • Lazy loading support

Payload Storage

  • JSON-based payload storage
  • Nested field access with dot notation
  • Type-aware indexing

WAL (Write-Ahead Log)

  • Durability guarantees
  • Automatic recovery on restart
  • Configurable sync policy

Binary Export (WASM)

┌────────┬─────────┬───────────┬────────┬─────────┬─────────────────────┐
│ "VELS" │ Version │ Dimension │ Metric │ Count   │ Vectors             │
│ 4 bytes│ 1 byte  │ 4 bytes   │ 1 byte │ 8 bytes │ (id + data) × count │
└────────┴─────────┴───────────┴────────┴─────────┴─────────────────────┘
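
The fixed 18-byte header in this layout (4 + 1 + 4 + 1 + 8) can be read and written with a few lines of code. The sketch below is illustrative only: the `VelsHeader` type is hypothetical, little-endian integer encoding is an assumption, and the meaning of the metric byte and the layout of the per-vector records are not specified here, so they are left opaque.

```rust
/// Fixed-size header preceding the vector records.
#[derive(Debug, Clone, PartialEq)]
pub struct VelsHeader {
    pub version: u8,
    pub dimension: u32,
    pub metric: u8, // opaque metric code; the mapping is not defined in this sketch
    pub count: u64,
}

pub const MAGIC: &[u8; 4] = b"VELS";
pub const HEADER_LEN: usize = 4 + 1 + 4 + 1 + 8;

impl VelsHeader {
    /// Serializes the header. Little-endian integers are an assumption.
    pub fn encode(&self) -> [u8; HEADER_LEN] {
        let mut buf = [0u8; HEADER_LEN];
        buf[0..4].copy_from_slice(MAGIC);
        buf[4] = self.version;
        buf[5..9].copy_from_slice(&self.dimension.to_le_bytes());
        buf[9] = self.metric;
        buf[10..18].copy_from_slice(&self.count.to_le_bytes());
        buf
    }

    /// Parses a header, returning `None` on a short buffer or a wrong magic.
    pub fn decode(bytes: &[u8]) -> Option<Self> {
        if bytes.len() < HEADER_LEN || &bytes[0..4] != MAGIC.as_slice() {
            return None;
        }
        let mut dimension = [0u8; 4];
        dimension.copy_from_slice(&bytes[5..9]);
        let mut count = [0u8; 8];
        count.copy_from_slice(&bytes[10..18]);
        Some(Self {
            version: bytes[4],
            dimension: u32::from_le_bytes(dimension),
            metric: bytes[9],
            count: u64::from_le_bytes(count),
        })
    }
}
```

Checking the magic and length before touching the payload lets an importer reject a foreign or truncated blob without reading any vector data.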

Data Flow

Vector Search Flow

   Query Vector
         │
         ▼
┌─────────────────┐
│  VelesQL Parse  │ (optional)
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│  Filter Engine  │ (if filters present:
│ (secondary idx +│  secondary indexes + JSON
│  JSON filters)  │  payload filters)
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│   HNSW Search   │
│  (entry → L0)   │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│  SIMD Distance  │
│  Calculations   │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│  Top-K Results  │
│  (min-heap)     │
└────────┬────────┘
         │
         ▼
   Sorted Results
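
The final top-K step can be sketched with a bounded min-heap. This is an illustration under assumptions, not the VelesDB implementation: the `Scored` type and `top_k` function are hypothetical, and scores are treated as similarities where higher is better (for a distance metric the ordering would be reversed).

```rust
use std::cmp::{Ordering, Reverse};
use std::collections::BinaryHeap;

/// A scored candidate, totally ordered by score and then by id.
#[derive(Debug, Clone, Copy)]
pub struct Scored {
    pub id: u64,
    pub score: f32,
}

impl Ord for Scored {
    fn cmp(&self, other: &Self) -> Ordering {
        // `total_cmp` gives f32 a total order, which `BinaryHeap` requires.
        self.score.total_cmp(&other.score).then(self.id.cmp(&other.id))
    }
}

impl PartialOrd for Scored {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

impl PartialEq for Scored {
    fn eq(&self, other: &Self) -> bool {
        self.cmp(other) == Ordering::Equal
    }
}

impl Eq for Scored {}

/// Keeps the `k` highest-scoring candidates using a min-heap of size `k`:
/// the heap root is the weakest kept candidate and is evicted first.
pub fn top_k(candidates: impl IntoIterator<Item = Scored>, k: usize) -> Vec<Scored> {
    let mut heap: BinaryHeap<Reverse<Scored>> = BinaryHeap::with_capacity(k + 1);
    for candidate in candidates {
        heap.push(Reverse(candidate));
        if heap.len() > k {
            heap.pop(); // drops the current minimum
        }
    }
    let mut out: Vec<Scored> = heap.into_iter().map(|Reverse(c)| c).collect();
    out.sort_by(|a, b| b.cmp(a)); // best first
    out
}
```

The heap never holds more than k + 1 entries, so selecting the top k of n candidates costs O(n log k) time and O(k) memory.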

Hybrid Search Flow

Query Vector + Text Query
         │
    ┌────┴────┐
    ▼         ▼
┌───────┐ ┌───────┐
│ HNSW  │ │ BM25  │
│Search │ │Search │
└───┬───┘ └───┬───┘
    │         │
    ▼         ▼
┌─────────────────┐
│  RRF Fusion     │
│ (Reciprocal     │
│  Rank Fusion)   │
└────────┬────────┘
         │
         ▼
   Merged Results
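
Reciprocal Rank Fusion scores each document as the sum over result lists of 1 / (k + rank), with ranks starting at 1. The sketch below shows that formula in isolation; the `rrf_fuse` function is hypothetical and says nothing about how VelesDB wires fusion into the query path. The constant k = 60 used in the test matches the value in the USING FUSION example in this document.

```rust
use std::collections::HashMap;

/// Reciprocal Rank Fusion: each ranked list contributes `1 / (k + rank)` per
/// document, with ranks starting at 1. Returns `(id, score)` pairs, best first.
pub fn rrf_fuse(rankings: &[Vec<u64>], k: f64) -> Vec<(u64, f64)> {
    let mut scores: HashMap<u64, f64> = HashMap::new();
    for ranking in rankings {
        for (i, &id) in ranking.iter().enumerate() {
            *scores.entry(id).or_insert(0.0) += 1.0 / (k + (i as f64 + 1.0));
        }
    }
    let mut fused: Vec<(u64, f64)> = scores.into_iter().collect();
    // Sort by score descending; break ties by id so the output is deterministic.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then(a.0.cmp(&b.0)));
    fused
}
```

Because only ranks are used, the HNSW and BM25 scores never need to be normalized onto a common scale before merging.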

VelesQL v2.0 Query Flow

┌─────────────────────────────────────────────────────────────────┐
│                      VelesQL v2.0 Parser                         │
├─────────────────────────────────────────────────────────────────┤
│  SQL Query                                                       │
│    │                                                             │
│    ▼                                                             │
│  ┌────────────────┐                                              │
│  │  Pest Grammar  │  compound_query → select_stmt [set_op]       │
│  └────────┬───────┘                                              │
│           │                                                      │
│           ▼                                                      │
│  ┌────────────────────────────────────────────────────────────┐ │
│  │                        AST                                  │ │
│  ├────────────────────────────────────────────────────────────┤ │
│  │  Query {                                                    │ │
│  │    select: SelectStatement {                                │ │
│  │      columns, from, joins[], where_clause,                  │ │
│  │      group_by, having, order_by, limit, offset,             │ │
│  │      with_clause, fusion_clause                             │ │
│  │    },                                                       │ │
│  │    compound: Option<CompoundQuery> {                        │ │
│  │      operator: UNION|INTERSECT|EXCEPT,                      │ │
│  │      right: SelectStatement                                 │ │
│  │    }                                                        │ │
│  │  }                                                          │ │
│  └────────────────────────────────────────────────────────────┘ │
│           │                                                      │
│           ▼                                                      │
│  ┌────────────────────────────────────────────────────────────┐ │
│  │                    Execution Engine                         │ │
│  ├────────────────────────────────────────────────────────────┤ │
│  │  1. Filter Pushdown → ColumnStore                           │ │
│  │  2. Vector Search → HNSW (if NEAR clause)                   │ │
│  │  3. JOIN Execution → Cross-collection merge                 │ │
│  │  4. Aggregation → GROUP BY + HAVING                         │ │
│  │  5. Ordering → ORDER BY (columns, similarity)               │ │
│  │  6. Set Operations → UNION/INTERSECT/EXCEPT                 │ │
│  │  7. Fusion → RRF/Weighted/Maximum (if USING FUSION)         │ │
│  └────────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘

VelesQL v2.0 Supported Syntax:

-- Aggregation with GROUP BY and HAVING
SELECT category, COUNT(*), AVG(price) 
FROM products 
GROUP BY category 
HAVING COUNT(*) > 5 AND AVG(price) > 50

-- ORDER BY with similarity function
SELECT * FROM docs 
ORDER BY similarity(vector, $query) DESC 
LIMIT 10

-- JOIN across collections
SELECT * FROM orders 
JOIN customers AS c ON orders.customer_id = c.id
WHERE status = 'active'

-- Set operations
SELECT * FROM active_users UNION SELECT * FROM archived_users

-- Hybrid fusion search (USING FUSION is a trailing clause: after LIMIT)
SELECT * FROM documents 
LIMIT 20 USING FUSION(strategy='rrf', k=60)

Performance Characteristics

Memory Usage

| Component         | Per Vector (768D) |
|-------------------|-------------------|
| Vector Data (f32) | 3,072 bytes       |
| Vector Data (f16) | 1,536 bytes       |
| Vector Data (SQ8) | 768 bytes         |
| HNSW Links        | ~256 bytes        |
| Payload (avg)     | ~200 bytes        |
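
The vector-data rows follow directly from dimension × bytes per component (768 × 4, 768 × 2, 768 × 1). A small sketch of that arithmetic, using the approximate link and payload figures from the table as constants; the `Precision` enum and function names are hypothetical, and the total is a rough estimate rather than a measured footprint.

```rust
/// Vector storage precisions from the memory table.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Precision {
    F32,
    F16,
    Sq8,
}

impl Precision {
    pub fn bytes_per_component(self) -> usize {
        match self {
            Precision::F32 => 4,
            Precision::F16 => 2,
            Precision::Sq8 => 1,
        }
    }
}

/// Approximate per-vector overheads taken from the table above.
pub const HNSW_LINK_BYTES: usize = 256;
pub const AVG_PAYLOAD_BYTES: usize = 200;

/// Raw vector data size: one component per dimension.
pub fn vector_bytes(dim: usize, precision: Precision) -> usize {
    dim * precision.bytes_per_component()
}

/// Rough total for `n` vectors: vector data + HNSW links + average payload.
pub fn estimate_total_bytes(n: usize, dim: usize, precision: Precision) -> usize {
    n * (vector_bytes(dim, precision) + HNSW_LINK_BYTES + AVG_PAYLOAD_BYTES)
}
```

For example, one million 768D f32 vectors come to roughly 1,000,000 × (3,072 + 256 + 200) bytes, about 3.5 GB, while SQ8 brings the same collection to about 1.2 GB.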

Throughput

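| Operation                                               | Throughput / Latency      |
|---------------------------------------------------------|---------------------------|
| Insert                                                  | ~3.8K-6.4K vec/sec (768D) |
| Search k=10 (10K vectors, 768D, HNSW index-only)        | ~55 µs                    |
| Search end-to-end p50 (10K/384D, WAL ON, recall ≥ 96%)  | ~450 µs                   |
| Search (100K vectors)                                   | < 5 ms                    |
| VelesQL Parse                                           | 1.3M queries/sec          |
| Export (WASM)                                           | 4,479 MB/s                |
| Import (WASM)                                           | 2,943 MB/s                |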

Platform Support

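| Platform        | Status  | SIMD                     | Relative Performance |
|-----------------|---------|--------------------------|----------------------|
| Linux x86_64    | ✅ Full | AVX2/AVX-512             | 100%                 |
| Windows x86_64  | ✅ Full | AVX2                     | 100%                 |
| macOS x86_64    | ✅ Full | AVX2                     | 100%                 |
| macOS ARM64     | ✅ Full | NEON                     | ~90%                 |
| WASM (Browser)  | ✅ Full | Scalar (SIMD128 planned) | ~70%                 |
| WASM (Node.js)  | ✅ Full | Scalar (SIMD128 planned) | ~70%                 |
| iOS (ARM64)     | ✅ Full | NEON                     | ~90%                 |
| Android (ARM64) | ✅ Full | NEON                     | ~90%                 |
| Android (ARMv7) | ✅ Full | Fallback                 | ~70%                 |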

ARM64 (Apple Silicon / Mobile) Note

On ARM64 platforms (macOS M1/M2/M3, iOS, Android), VelesDB uses native NEON SIMD instructions for distance calculations via the simd_native module, with both 1-accumulator and 4-accumulator variants depending on vector size.

Impact:

  • Distance calculations are ~10% slower than x86_64 with AVX2
  • All other operations (indexing, storage, queries) are unaffected
  • Overall search latency remains in the microsecond range

Future Architecture

Planned Components

┌─────────────────────────────────────────────────────────────────────────┐
│                       DISTRIBUTED LAYER (v1.0+)                          │
├─────────────────────────────────────────────────────────────────────────┤
│   Coordinator   │   Sharding   │   Replication   │   Consensus (Raft)  │
└─────────────────────────────────────────────────────────────────────────┘

  • GPU Acceleration: optional wgpu-based GPU compute kernels for large-scale workloads

v1.6.0 Architecture Improvements (Shipped)

The following architectural changes, originally identified in the January 2026 technical audit, have been implemented as of v1.6.0:

| Change        | Before (v0.8.x)               | After (v1.6.0)                       |
|---------------|-------------------------------|--------------------------------------|
| Concurrency   | Global RwLock<HashMap>        | DashMap + 16-shard storage           |
| Memory        | Vec<f32> allocations per read | Zero-copy &[f32] from mmap           |
| SIMD Dispatch | Per-call feature detection    | OnceLock function pointer            |
| Unsafe        | 'static lifetime tricks       | Safe self-referential via ouroboros  |

See docs/internal/TECHNICAL_AUDIT_PLAN.md for the original audit plan.
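
As an illustration of the "OnceLock function pointer" row, the sketch below selects a distance kernel once and reuses it on every call. It is a simplified stand-in, not the VelesDB dispatcher: the function names are hypothetical, the only kernel is a scalar dot product with four accumulators, and the selection step does no real CPU feature detection.

```rust
use std::sync::OnceLock;

type DotFn = fn(&[f32], &[f32]) -> f32;

/// Scalar dot product with four independent accumulators, which lets the CPU
/// overlap the multiply-adds instead of serializing on one running sum.
fn dot_scalar_4acc(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len());
    let mut acc = [0.0f32; 4];
    let chunks = a.len() / 4;
    for i in 0..chunks {
        for lane in 0..4 {
            acc[lane] += a[i * 4 + lane] * b[i * 4 + lane];
        }
    }
    let mut sum = (acc[0] + acc[1]) + (acc[2] + acc[3]);
    // Handle the tail that does not fill a group of four.
    for i in chunks * 4..a.len() {
        sum += a[i] * b[i];
    }
    sum
}

/// Picks a kernel once. A real build would probe CPU features here (for
/// example with `is_x86_feature_detected!`) and return a SIMD kernel.
fn select_dot_kernel() -> DotFn {
    dot_scalar_4acc
}

static DOT_KERNEL: OnceLock<DotFn> = OnceLock::new();

/// Hot-path entry point: after the first call, the kernel is read from the
/// `OnceLock` and invoked directly, with no per-call feature detection.
pub fn dot(a: &[f32], b: &[f32]) -> f32 {
    let kernel: DotFn = *DOT_KERNEL.get_or_init(select_dot_kernel);
    kernel(a, b)
}
```

The selection cost is paid once per process; every later call is an initialized-check plus an indirect call through the stored pointer.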

Code-Truth Matrix (2026-02-26)

| Capability                             | Runtime module(s)                                               | Notes                                           |
|----------------------------------------|-----------------------------------------------------------------|-------------------------------------------------|
| VelesQL parser + validation + planning | crates/velesdb-core/src/velesql/*                               | Query control plane and parse cache             |
| Vector engine (5 metrics)              | crates/velesdb-core/src/distance/*, simd_native/*, index/hnsw/* | Cosine, Euclidean, DotProduct, Hamming, Jaccard |
| Graph engine                           | crates/velesdb-core/src/collection/graph/*                      | Nodes/edges/traversal/property indexes          |
| Multi-column filtering                 | crates/velesdb-core/src/column_store/*                          | Typed filters + bitmap paths                    |
| Hybrid execution / fusion              | crates/velesdb-core/src/collection/search/query/*               | Pushdown + fusion strategies                    |
| Storage + WAL/recovery                 | crates/velesdb-core/src/storage/*                               | mmap storage and recovery tests                 |

  • Operations runbook: docs/reference/OPERATIONS_RUNBOOK.md