NumKong for JavaScript
March 21, 2026
NumKong's JavaScript package provides vector kernels for Node and Bun-style runtimes, targeting the space between handwritten loops over TypedArrays and larger tensor frameworks.
It keeps the JS surface intentionally compact: dense distances, dot products, binary metrics, probability divergences, packed GEMM-like matrix multiplication, symmetric Gram matrices, dtype-tagged low-precision storage, typed views, and runtime capability inspection.
Quickstart
import { dot } from "numkong";
const a = new Float32Array([1, 2, 3]);
const b = new Float32Array([4, 5, 6]);
console.log(dot(a, b)); // 32
Highlights
This SDK is deliberately smaller than the Python and Rust SDKs. Its job is to make the hot vector kernels easy to use from modern JavaScript runtimes.
TypedArray-first API.
Standard Float32Array, Float64Array, Int8Array, and Uint8Array work directly.
DType tags for exotic storage.
f16, bf16, fp8, and packed bits stay explicit.
Owned and borrowed views.
Vector, VectorView, and base tensor wrappers preserve dtype metadata.
Portable runtime story.
The same package can target native addons and WASM runtimes.
No fake tensor-framework scope.
This binding stays centered on the vector families it actually exports.
Ecosystem Comparison
| Feature | NumKong | mathjs | tensorflow.js |
|---|---|---|---|
| Operation families | dots, distances, binary, probability, cast, packed, symmetric | general arithmetic, matrix ops, statistics | matmul, elementwise, reductions |
| Precision | BFloat16 through sub-byte; automatic widening; Kahan summation | Float64 only; standard accuracy | Float32 primarily; no sub-byte; standard accuracy |
| Runtime SIMD dispatch | auto-selects best ISA per-thread across x86, ARM, RISC-V | none; pure JS | fixed at build time via WASM SIMD or WebGL |
| Packed matrix, GEMM-like | dotsPack + dotsPacked; persistent packing; amortized | math.multiply — no persistent packing | tf.matMul — no persistent packing |
| Symmetric kernels, SYRK-like | dotsSymmetric; upper triangle only; row-range partitioning | no duplicate-pair skipping | no duplicate-pair skipping |
| WASM fallback | yes — portable, runs in browser without native addon | yes — pure JS, no native required | yes — also WebGL/WebGPU |
| Bundle size | small | moderate | large |
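The packed and symmetric rows above deserve a concrete shape: the point of dotsPack is to pay the layout cost once and then reuse the packed operand across many dotsPacked calls. The exact signatures are not spelled out on this page, so the sketch below only illustrates the intended call pattern; the argument order, the options object, and the idea that the packed operand is an opaque handle are all assumptions, not documented API.
import { dotsPack, dotsPacked } from "numkong";
// Illustrative shapes: 4 query vectors x 3 dims against 1000 doc vectors x 3 dims.
const queries = new Float32Array(4 * 3);
const docs = new Float32Array(1000 * 3);
// Assumed API: pack the large, reused operand once...
const packedDocs = dotsPack(docs, { rows: 1000, cols: 3 });
// ...then amortize that cost over repeated GEMM-like calls.
const scores = dotsPacked(queries, packedDocs, { rows: 4, cols: 3 });
console.log(scores.length); // 4 * 1000 similarity scores (assumed output layout)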
Installation
The package targets Node >= 22.
npm install numkong
yarn add numkong
pnpm add numkong
bun add numkong
If you build from source, note that the package runs node-gyp-build at install time and keeps its TypeScript sources under javascript/.
Browser and WASM
The npm package includes a pre-built WASM bundle under wasm/.
The simplest way to use it in a browser is via a CDN — no build step required:
<script type="module">
import { dot, euclidean } from 'https://cdn.jsdelivr.net/npm/numkong@7/wasm/numkong.js';
const query = new Float32Array([1.0, 2.0, 3.0]);
const doc = new Float32Array([4.0, 5.0, 6.0]);
console.log(dot(query, doc)); // 32 — SIMD-accelerated, client-side
console.log(euclidean(query, doc)); // 5.196...
</script>
For self-hosted WASM, download the binaries from a GitHub Release and serve them from the same directory:
<script type="module">
import * as numkong from "./numkong-emscripten.js";
import NumKongModule from "./numkong.js";
const wasm = await NumKongModule();
numkong.initWasm(wasm);
const a = new Float32Array([1, 2, 3]);
const b = new Float32Array([4, 5, 6]);
console.log(numkong.dot(a, b));
</script>
Or import the WASM subpath from a bundler or from Node.js, without loading the native addon:
import { dot } from "numkong/wasm";
Dot Products
Dot products are separate from distances because dtype tagging and low-precision storage matter more here.
import { dot, inner } from "numkong";
const a = new Float32Array([1, 2, 3]);
const b = new Float32Array([4, 5, 6]);
console.log(dot(a, b));
console.log(inner(a, b)); // alias for ecosystem familiarity
For non-native numeric layouts, pass an explicit DType or wrap the storage in a typed NumKong view.
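For example, a raw Uint16Array whose bits are really bf16 values can be disambiguated with an explicit tag. This sketch assumes the same third-argument DType form shown in the low-precision section below also applies to raw typed arrays; the hex constants are the bf16 encodings of 1..6.
import { dot, DType } from "numkong";
// Raw uint16 storage holding the bf16 bit patterns for 1, 2, 3 and 4, 5, 6.
const rawA = new Uint16Array([0x3f80, 0x4000, 0x4040]);
const rawB = new Uint16Array([0x4080, 0x40a0, 0x40c0]);
// Without the tag these bytes would be read as integers; with it, as bf16.
console.log(dot(rawA, rawB, DType.BF16)); // 32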
Dense Distances
Dense distance entrypoints work directly on the standard numeric TypedArray types.
import { sqeuclidean, euclidean, angular } from "numkong";
const a = new Float32Array([1, 2, 3]);
const b = new Float32Array([4, 5, 6]);
console.log(sqeuclidean(a, b)); // equivalent to a manual sum((a[i] - b[i]) ** 2)
console.log(euclidean(a, b));
console.log(angular(a, b));
When the storage type is one of the standard JS typed arrays, dtype inference is automatic.
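Concretely, the same distance call accepts any of the standard integer or float arrays without a tag; the values below are just an illustration.
import { sqeuclidean } from "numkong";
// Inferred as f64 and i8 from the TypedArray constructors; no DType tag needed.
const fromF64 = sqeuclidean(new Float64Array([1, 2, 3]), new Float64Array([4, 5, 6]));
const fromI8 = sqeuclidean(new Int8Array([1, 2, 3]), new Int8Array([4, 5, 6]));
console.log(fromF64, fromI8); // both 27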
Binary Metrics
Binary metrics operate on packed storage rather than on generic boolean arrays.
import { toBinary, hamming, jaccard } from "numkong";
const a = toBinary(new Float32Array([1, -2, 3, -4, 5, -6, 7, -8]));
const b = toBinary(new Float32Array([1, 2, -3, -4, 5, 6, -7, -8]));
console.log(hamming(a, b));
console.log(jaccard(a, b));
This is a natural model for sign-quantized embeddings and semantic hashes.
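As a small usage sketch, sign-quantized nearest-neighbor lookup reduces to picking the smallest Hamming distance. It relies only on the toBinary and hamming calls shown above; the tiny corpus is illustrative.
import { toBinary, hamming } from "numkong";
// Sign-quantize a query and a tiny corpus of float embeddings (8 dims -> 1 byte each).
const query = toBinary(new Float32Array([1, -2, 3, -4, 5, -6, 7, -8]));
const corpus = [
  toBinary(new Float32Array([1, 2, -3, -4, 5, 6, -7, -8])),
  toBinary(new Float32Array([1, -2, 3, -4, 5, -6, 7, 8])),
];
// Pick the corpus entry with the fewest differing sign bits.
const distances = corpus.map((doc) => hamming(query, doc));
console.log(distances.indexOf(Math.min(...distances))); // 1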
Probability Metrics
import { kullbackleibler, jensenshannon } from "numkong";
const p = new Float32Array([0.2, 0.3, 0.5]);
const q = new Float32Array([0.1, 0.3, 0.6]);
console.log(kullbackleibler(p, q));
console.log(jensenshannon(p, q));
These call the underlying SIMD divergence kernels directly.
DType Tags and Low-Precision Arrays
JavaScript has no built-in f16, bf16, fp8, or packed-bit numeric model.
NumKong handles that with explicit dtype tags and wrapper arrays.
The supported low-precision types and their bit layouts are:
- Float16Array: 1 sign + 5 exponent + 10 mantissa bits, 2 bytes per element, range ±65504, supports Inf and NaN
- BFloat16Array: 1 sign + 8 exponent + 7 mantissa bits, 2 bytes per element, full f32 dynamic range, supports Inf and NaN
- E4M3Array: 1 sign + 4 exponent + 3 mantissa bits, 1 byte per element, range ±448, no Inf, NaN is 0x7F only
- E5M2Array: 1 sign + 5 exponent + 2 mantissa bits, 1 byte per element, range ±57344, supports Inf and NaN
- BinaryArray: packed bits in bytes, 8 elements per byte
Use .byteLength for the exact payload size.
import { Float16Array, E4M3Array, DType, dot, angular } from "numkong";
const a16 = new Float16Array([1, 2, 3]);
const b16 = new Float16Array([4, 5, 6]);
console.log(dot(a16, b16, DType.F16));
console.log(a16.byteLength);
const a8 = new E4M3Array([1, 2, 3]);
const b8 = new E4M3Array([4, 5, 6]);
console.log(angular(a8, b8, DType.E4M3));
console.log(a8.byteLength);
If the underlying storage is a raw Uint16Array or Uint8Array, JS itself cannot know whether those bytes mean integers, f16, bf16, mini-floats, or packed bits.
That is exactly when the DType tag becomes mandatory.
You can also pass a DType tag, either as a string or as an enum value, to cast to bulk-convert values between any of the supported types:
numkong.cast(f32Source, "f32", bf16Dest, "bf16");
numkong.cast(f32Source, "f32", bf16Dest, DType.BF16); // same conversion, enum tag instead of string
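A rough end-to-end sketch of that round trip follows; it assumes the low-precision wrappers can be constructed with an element count (the examples above only show construction from an array) and that cast writes into a pre-allocated destination of the right byte length.
import { cast, BFloat16Array, DType } from "numkong";
const f32Source = new Float32Array([0.5, 1.5, 2.5, 3.5]);
const bf16Dest = new BFloat16Array(f32Source.length);    // 2 bytes per element (assumed length constructor)
const f32RoundTrip = new Float32Array(f32Source.length);
cast(f32Source, "f32", bf16Dest, "bf16");                // narrow f32 -> bf16
cast(bf16Dest, DType.BF16, f32RoundTrip, DType.F32);     // widen back, mantissa bits are lost
console.log(f32RoundTrip);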
Vector Views and Owned Buffers
The wrapper hierarchy exists to keep dtype and ownership explicit across addon and WASM boundaries.
- TensorBase carries buffer, byteOffset, and dtype
- VectorBase adds rank-1 semantics
- VectorView is a zero-copy borrowed wrapper over existing memory
- Vector owns its ArrayBuffer
import { VectorView, Vector, DType, dot } from "numkong";
const raw = new Uint16Array([0x3c00, 0x4000, 0x4200]); // f16 payload, not uint16 values
const view = VectorView.from(raw, DType.F16);
const owned = new Vector(3, DType.F32);
owned.toTypedArray().set(new Float32Array([4, 5, 6]));
console.log(dot(view, view));
console.log(dot(owned, owned));
console.log(owned.byteLength);
Use VectorView when the bytes already live somewhere else.
Use Vector when NumKong should own the storage.
Capabilities and Runtime Selection
Capability detection is explicit:
import { Capability, getCapabilities, hasCapability } from "numkong";
const caps = getCapabilities();
console.log(caps);
console.log(hasCapability(Capability.HASWELL));
console.log(hasCapability(Capability.NEON));
The exact bitmask depends on whether you are running the native addon or a WASM runtime.
There is no configure_thread call in the JS binding.
Thread configuration is managed internally by the native addon or WASM runtime.
Native Addon and WASM Runtimes
The top-level package is native-first.
It loads the compiled addon through node-gyp-build.
The repository also ships WASM wrappers. Those are useful for portable or sandboxed environments. They are not feature-complete replacements for the native SDKs.
Practical guidance:
- Use the native addon for lower host-call latency.
- Use the WASM path when portability matters more than absolute latency.
- Keep your expectations scoped to the vector and matrix-oriented API that this binding actually exports.
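One way to act on that guidance is to prefer the native entry point and fall back to the WASM subpath when the addon cannot load. This is a suggested pattern rather than part of the package, and it assumes "numkong" and "numkong/wasm" expose the same kernel exports.
// Prefer the native addon; fall back to the portable WASM build.
let nk;
try {
  nk = await import("numkong");
} catch {
  nk = await import("numkong/wasm");
}
console.log(nk.dot(new Float32Array([1, 2, 3]), new Float32Array([4, 5, 6]))); // 32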