ZXC: High-Performance Asymmetric Lossless Compression

April 25, 2026


ZXC is a high-performance, lossless, asymmetric compression library optimized for Content Delivery and Embedded Systems (Game Assets, Firmware, App Bundles). It is designed to be "Write Once, Read Many" (WORM). Unlike codecs like LZ4, ZXC trades compression speed (build-time) for maximum decompression throughput (run-time).

ZXC runs on all major architectures (x86_64, ARM64, ARMv7, ARMv6, RISC-V, POWER (ppc64el), s390x, i386) with hand-tuned SIMD paths (AVX2/AVX-512 on x86_64, NEON on ARMv8+). It shows especially strong gains on modern ARM cores (Apple Silicon, AWS Graviton, Google Axion) thanks to a bitstream layout tuned for their deep pipelines.

TL;DR

  • What: A C library for lossless compression, optimized for maximum decompression speed.
  • Key Result: Over 40% faster decompression than LZ4 on Apple Silicon, over 20% faster on Google Axion (ARM64), and over 10% faster on x86_64 (AMD EPYC), all with better compression ratios. Cross-platform by design, with particularly strong results on ARMv8+.
  • Use Cases: Game assets, firmware, app bundles, anything compressed once, decompressed millions of times.
  • Seekable: Built-in seek table for O(1) random-access decompression; load any block without scanning the entire file.
  • Install: conan install --requires="zxc/[*]" · vcpkg install zxc · brew install zxc · pip install zxc-compress · cargo add zxc-compress · npm i zxc-compress
  • Quality: Fuzzed, sanitized, formally tested, thread-safe API. BSD-3-Clause.

Independently Verified: ZXC has been officially merged into both major open-source compression benchmark suites.

You can reproduce these results independently using either industry-standard benchmark, alongside 70+ other codecs.

ZXC Design Philosophy

Traditional codecs often force a trade-off between symmetric speed (LZ4) and archival density (Zstd).

ZXC focuses on Asymmetric Efficiency.

Designed for the write-once, read-many reality of software distribution, ZXC uses a computationally intensive encoder to generate a bitstream structured to maximize decompression throughput. By performing heavy analysis upfront, the encoder produces a layout that suits the instruction pipelining and branch prediction of modern CPUs (particularly ARMv8), effectively offloading complexity from the decoder to the encoder.

  • Build Time: You generally compress only once (on CI/CD).
  • Run Time: You decompress millions of times (on every user's device). ZXC respects this asymmetry.

👉 Read the Technical Whitepaper

Benchmarks

To ensure consistent performance, benchmarks are automatically executed on every commit via GitHub Actions. We monitor metrics on both x86_64 (Linux) and ARM64 (Apple Silicon M2) runners to track compression speed, decompression speed, and ratios.

(See the latest benchmark logs)

1. Mobile & Client: Apple Silicon (M2)

Scenario: Game Assets loading, App startup.

| Target | ZXC vs Competitor | Decompression Speed | Ratio | Verdict |
| --- | --- | --- | --- | --- |
| 1. Max Speed | ZXC -1 vs LZ4 --fast | 12,195 MB/s vs 5,633 MB/s (2.16x faster) | 61.6 vs 62.2 (0.6% smaller) | ZXC leads in raw throughput. |
| 2. Standard | ZXC -3 vs LZ4 Default | 7,008 MB/s vs 4,787 MB/s (1.46x faster) | 46.4 vs 47.6 (2.6% smaller) | ZXC outperforms LZ4 in read speed and ratio. |
| 3. High Density | ZXC -5 vs Zstd --fast -1 | 6,181 MB/s vs 2,527 MB/s (2.45x faster) | 40.7 vs 41.0 (0.8% smaller, equivalent) | ZXC outperforms Zstd in decoding speed. |

2. Cloud Server: Google Axion (ARM Neoverse V2)

Scenario: High-throughput Microservices, ARM Cloud Instances.

| Target | ZXC vs Competitor | Decompression Speed | Ratio | Verdict |
| --- | --- | --- | --- | --- |
| 1. Max Speed | ZXC -1 vs LZ4 --fast | 8,924 MB/s vs 4,950 MB/s (1.80x faster) | 61.6 vs 62.2 (0.6% smaller) | ZXC leads in raw throughput. |
| 2. Standard | ZXC -3 vs LZ4 Default | 5,297 MB/s vs 4,262 MB/s (1.24x faster) | 46.4 vs 47.6 (2.6% smaller) | ZXC outperforms LZ4 in read speed and ratio. |
| 3. High Density | ZXC -5 vs Zstd --fast -1 | 4,676 MB/s vs 2,293 MB/s (2.04x faster) | 40.7 vs 41.0 (0.8% smaller, equivalent) | ZXC outperforms Zstd in decoding speed. |

3. Build Server: x86_64 (AMD EPYC 9B45)

Scenario: CI/CD pipeline compatibility.

| Target | ZXC vs Competitor | Decompression Speed | Ratio | Verdict |
| --- | --- | --- | --- | --- |
| 1. Max Speed | ZXC -1 vs LZ4 --fast | 10,803 MB/s vs 5,312 MB/s (2.03x faster) | 61.6 vs 62.2 (0.6% smaller) | ZXC achieves higher throughput. |
| 2. Standard | ZXC -3 vs LZ4 Default | 5,964 MB/s vs 5,050 MB/s (1.18x faster) | 46.4 vs 47.6 (2.6% smaller) | ZXC offers improved speed and ratio. |
| 3. High Density | ZXC -5 vs Zstd --fast -1 | 5,316 MB/s vs 2,445 MB/s (2.17x faster) | 40.7 vs 41.0 (0.8% smaller) | ZXC provides faster decoding. |

(Benchmark Graph ARM64: Decompression Throughput & Storage Ratio, normalized to LZ4)

Benchmark ARM64 (Apple Silicon M2)

Benchmarks were conducted using lzbench 2.2.1 (from @inikep), compiled with Clang 21.0.0 using MOREFLAGS="-march=native" on macOS Tahoe 26.4 (Build 25E246). The reference hardware is an Apple M2 processor (ARM64). All metrics reflect single-threaded execution on silesia.tar, the tarred Silesia compression corpus.

| Compressor name | Compression | Decompress. | Compr. size | Ratio | Filename |
| --- | --- | --- | --- | --- | --- |
| memcpy | 52855 MB/s | 52786 MB/s | 211947520 | 100.00 | 1 files |
| zxc 0.10.0 -1 | 904 MB/s | 12195 MB/s | 130468706 | 61.56 | 1 files |
| zxc 0.10.0 -2 | 600 MB/s | 10044 MB/s | 114455432 | 54.00 | 1 files |
| zxc 0.10.0 -3 | 257 MB/s | 7008 MB/s | 98233034 | 46.35 | 1 files |
| zxc 0.10.0 -4 | 176 MB/s | 6636 MB/s | 91429653 | 43.14 | 1 files |
| zxc 0.10.0 -5 | 104 MB/s | 6181 MB/s | 86196446 | 40.67 | 1 files |
| lz4 1.10.0 | 792 MB/s | 4787 MB/s | 100880800 | 47.60 | 1 files |
| lz4 1.10.0 --fast -17 | 1316 MB/s | 5633 MB/s | 131732802 | 62.15 | 1 files |
| lz4hc 1.10.0 -9 | 46.3 MB/s | 4531 MB/s | 77884448 | 36.75 | 1 files |
| lzav 5.7 -1 | 625 MB/s | 3859 MB/s | 84644732 | 39.94 | 1 files |
| snappy 1.2.2 | 857 MB/s | 3258 MB/s | 101415443 | 47.85 | 1 files |
| zstd 1.5.7 --fast --1 | 704 MB/s | 2527 MB/s | 86916294 | 41.01 | 1 files |
| zstd 1.5.7 -1 | 625 MB/s | 1776 MB/s | 73193704 | 34.53 | 1 files |
| zlib 1.3.1 -1 | 145 MB/s | 397 MB/s | 77259029 | 36.45 | 1 files |

Benchmark ARM64 (Google Axion Neoverse-V2)

Benchmarks were conducted using lzbench 2.2.1 (from @inikep), compiled with GCC 14.3.0 using MOREFLAGS="-march=native" on 64-bit Debian GNU/Linux 12 (bookworm). The reference hardware is a Google Axion (Neoverse-V2) processor (ARM64). All metrics reflect single-threaded execution on silesia.tar, the tarred Silesia compression corpus.

| Compressor name | Compression | Decompress. | Compr. size | Ratio | Filename |
| --- | --- | --- | --- | --- | --- |
| memcpy | 23971 MB/s | 23953 MB/s | 211947520 | 100.00 | 1 files |
| zxc 0.10.0 -1 | 810 MB/s | 8924 MB/s | 130468706 | 61.56 | 1 files |
| zxc 0.10.0 -2 | 523 MB/s | 7461 MB/s | 114455432 | 54.00 | 1 files |
| zxc 0.10.0 -3 | 246 MB/s | 5297 MB/s | 98233034 | 46.35 | 1 files |
| zxc 0.10.0 -4 | 170 MB/s | 5038 MB/s | 91429653 | 43.14 | 1 files |
| zxc 0.10.0 -5 | 100 MB/s | 4676 MB/s | 86196446 | 40.67 | 1 files |
| lz4 1.10.0 | 731 MB/s | 4262 MB/s | 100880800 | 47.60 | 1 files |
| lz4 1.10.0 --fast -17 | 1278 MB/s | 4950 MB/s | 131732802 | 62.15 | 1 files |
| lz4hc 1.10.0 -9 | 43.3 MB/s | 3850 MB/s | 77884448 | 36.75 | 1 files |
| lzav 5.7 -1 | 554 MB/s | 2776 MB/s | 84644732 | 39.94 | 1 files |
| snappy 1.2.2 | 757 MB/s | 2298 MB/s | 101415443 | 47.85 | 1 files |
| zstd 1.5.7 --fast --1 | 606 MB/s | 2293 MB/s | 86916294 | 41.01 | 1 files |
| zstd 1.5.7 -1 | 524 MB/s | 1645 MB/s | 73193704 | 34.53 | 1 files |
| zlib 1.3.1 -1 | 57.2 MB/s | 390 MB/s | 77259029 | 36.45 | 1 files |

Benchmark x86_64 (AMD EPYC 9B45)

Benchmarks were conducted using lzbench 2.2.1 (from @inikep), compiled with GCC 14.3.0 using MOREFLAGS="-march=native" on 64-bit Ubuntu 24.04. The reference hardware is an AMD EPYC 9B45 processor (x86_64). All metrics reflect single-threaded execution on silesia.tar, the tarred Silesia compression corpus.

| Compressor name | Compression | Decompress. | Compr. size | Ratio | Filename |
| --- | --- | --- | --- | --- | --- |
| memcpy | 26487 MB/s | 26572 MB/s | 211947520 | 100.00 | 1 files |
| zxc 0.10.0 -1 | 787 MB/s | 10803 MB/s | 130468706 | 61.56 | 1 files |
| zxc 0.10.0 -2 | 503 MB/s | 9536 MB/s | 114455432 | 54.00 | 1 files |
| zxc 0.10.0 -3 | 244 MB/s | 5964 MB/s | 98233034 | 46.35 | 1 files |
| zxc 0.10.0 -4 | 169 MB/s | 5660 MB/s | 91429653 | 43.14 | 1 files |
| zxc 0.10.0 -5 | 100 MB/s | 5316 MB/s | 86196446 | 40.67 | 1 files |
| lz4 1.10.0 | 761 MB/s | 5050 MB/s | 100880800 | 47.60 | 1 files |
| lz4 1.10.0 --fast -17 | 1281 MB/s | 5312 MB/s | 131732802 | 62.15 | 1 files |
| lz4hc 1.10.0 -9 | 45.6 MB/s | 4864 MB/s | 77884448 | 36.75 | 1 files |
| lzav 5.7 -1 | 595 MB/s | 3615 MB/s | 84644732 | 39.94 | 1 files |
| snappy 1.2.2 | 774 MB/s | 2140 MB/s | 101512076 | 47.89 | 1 files |
| zstd 1.5.7 --fast --1 | 663 MB/s | 2445 MB/s | 86916294 | 41.01 | 1 files |
| zstd 1.5.7 -1 | 605 MB/s | 1896 MB/s | 73193704 | 34.53 | 1 files |
| zlib 1.3.1 -1 | 134 MB/s | 401 MB/s | 77259029 | 36.45 | 1 files |

Benchmark x86_64 (AMD EPYC 7763)

Benchmarks were conducted using lzbench 2.2.1 (from @inikep), compiled with GCC 14.2.0 using MOREFLAGS="-march=native" on 64-bit Ubuntu 24.04. The reference hardware is an AMD EPYC 7763 64-Core processor (x86_64). All metrics reflect single-threaded execution on silesia.tar, the tarred Silesia compression corpus.

| Compressor name | Compression | Decompress. | Compr. size | Ratio | Filename |
| --- | --- | --- | --- | --- | --- |
| memcpy | 22410 MB/s | 22392 MB/s | 211947520 | 100.00 | 1 files |
| zxc 0.10.0 -1 | 601 MB/s | 6921 MB/s | 130468706 | 61.56 | 1 files |
| zxc 0.10.0 -2 | 388 MB/s | 5787 MB/s | 114455432 | 54.00 | 1 files |
| zxc 0.10.0 -3 | 186 MB/s | 3903 MB/s | 98233034 | 46.35 | 1 files |
| zxc 0.10.0 -4 | 130 MB/s | 3738 MB/s | 91429653 | 43.14 | 1 files |
| zxc 0.10.0 -5 | 80.4 MB/s | 3565 MB/s | 86196446 | 40.67 | 1 files |
| lz4 1.10.0 | 582 MB/s | 3551 MB/s | 100880800 | 47.60 | 1 files |
| lz4 1.10.0 --fast -17 | 1015 MB/s | 4102 MB/s | 131732802 | 62.15 | 1 files |
| lz4hc 1.10.0 -9 | 33.3 MB/s | 3407 MB/s | 77884448 | 36.75 | 1 files |
| lzav 5.7 -1 | 416 MB/s | 2647 MB/s | 84644732 | 39.94 | 1 files |
| snappy 1.2.2 | 613 MB/s | 1593 MB/s | 101512076 | 47.89 | 1 files |
| zstd 1.5.7 --fast --1 | 448 MB/s | 1626 MB/s | 86916294 | 41.01 | 1 files |
| zstd 1.5.7 -1 | 409 MB/s | 1221 MB/s | 73193704 | 34.53 | 1 files |
| zlib 1.3.1 -1 | 98.5 MB/s | 328 MB/s | 77259029 | 36.45 | 1 files |

Installation

Option 1: Download Release (GitHub)

  1. Go to the Releases page.

  2. Download the archive matching your architecture:

    macOS:

    • zxc-macos-arm64.tar.gz (NEON optimizations included).

    Linux:

    • zxc-linux-aarch64.tar.gz (NEON optimizations included).
    • zxc-linux-x86_64.tar.gz (Runtime dispatch for AVX2/AVX512).

    Windows:

    • zxc-windows-x64.zip (Runtime dispatch for AVX2/AVX512).
    • zxc-windows-arm64.zip (NEON optimizations included).
  3. Extract and install:

    tar -xzf zxc-linux-x86_64.tar.gz -C /usr/local
    

    Each archive contains:

    bin/zxc                          # CLI binary
    include/                         # C headers (zxc.h, zxc_buffer.h, ...)
    lib/libzxc.a                     # Static library
    lib/pkgconfig/libzxc.pc          # pkg-config support
    lib/cmake/zxc/zxcConfig.cmake    # CMake find_package(zxc) support
    
  4. Use in your project:

    CMake:

    find_package(zxc REQUIRED)
    target_link_libraries(myapp PRIVATE zxc::zxc_lib)
    

    pkg-config:

    cc myapp.c $(pkg-config --cflags --libs libzxc) -o myapp
    

Option 2: vcpkg

Classic mode:

vcpkg install zxc

Manifest mode (add to vcpkg.json):

{
  "dependencies": ["zxc"]
}

Then in your CMake project:

find_package(zxc CONFIG REQUIRED)
target_link_libraries(myapp PRIVATE zxc::zxc_lib)

Option 3: Conan

You can also download and install zxc using the Conan package manager:

    conan install -r conancenter --requires="zxc/[*]" --build=missing

Or add to your conanfile.txt:

[requires]
zxc/[*]

The zxc package in Conan Center is kept up to date by ConanCenterIndex contributors. If the version is out of date, please create an issue or pull request on the Conan Center Index repository.

Option 4: Homebrew

brew install zxc

The formula is maintained in homebrew-core.

Option 5: Building from Source

Requirements: CMake (3.14+), C17 Compiler (Clang/GCC/MSVC).

git clone https://github.com/hellobertrand/zxc.git
cd zxc
cmake -B build -DCMAKE_BUILD_TYPE=Release
cmake --build build --parallel

# Run tests
ctest --test-dir build -C Release --output-on-failure

# CLI usage
./build/zxc --help

# Install library, headers, and CMake/pkg-config files
sudo cmake --install build

CMake Options

| Option | Default | Description |
| --- | --- | --- |
| BUILD_SHARED_LIBS | OFF | Build shared libraries instead of static (libzxc.so, libzxc.dylib, zxc.dll) |
| ZXC_NATIVE_ARCH | ON | Enable -march=native for maximum performance |
| ZXC_ENABLE_LTO | ON | Enable Link-Time Optimization (LTO) |
| ZXC_PGO_MODE | OFF | Profile-Guided Optimization mode (OFF, GENERATE, USE) |
| ZXC_BUILD_CLI | ON | Build command-line interface |
| ZXC_BUILD_TESTS | ON | Build unit tests |
| ZXC_ENABLE_COVERAGE | OFF | Enable code coverage generation (disables LTO/PGO) |
| ZXC_DISABLE_SIMD | OFF | Disable hand-written SIMD paths (AVX2/AVX512/NEON) |

# Build shared library
cmake -B build -DBUILD_SHARED_LIBS=ON

# Portable build (without -march=native)
cmake -B build -DZXC_NATIVE_ARCH=OFF

# Library only (no CLI, no tests)
cmake -B build -DZXC_BUILD_CLI=OFF -DZXC_BUILD_TESTS=OFF

# Code coverage build
cmake -B build -DZXC_ENABLE_COVERAGE=ON

# Disable explicit SIMD code paths (compiler auto-vectorisation is unaffected)
cmake -B build -DZXC_DISABLE_SIMD=ON

Profile-Guided Optimization (PGO)

PGO uses runtime profiling data to optimize branch layout, inlining decisions, and code placement.

Step 1 - Build with instrumentation:

cmake -B build -DCMAKE_BUILD_TYPE=Release -DZXC_PGO_MODE=GENERATE
cmake --build build --parallel

Step 2 - Run a representative workload to collect profile data:

# Run the test suite (exercises all block types and compression levels)
./build/zxc_test

# Or compress/decompress representative data
./build/zxc -b your_data_file

Step 3 - (Clang only) Merge raw profiles:

# Clang generates .profraw files that must be merged before use
llvm-profdata merge -output=build/pgo/default.profdata build/pgo/*.profraw

GCC uses a directory-based format and does not require this step.

Step 4 - Rebuild with profile data:

cmake -B build -DCMAKE_BUILD_TYPE=Release -DZXC_PGO_MODE=USE
cmake --build build --parallel

Packaging Status



Compression Levels

  • Level 1, 2 (Fast): Optimized for real-time assets (Gaming, UI).
  • Level 3, 4 (Balanced): A strong middle ground, offering good compression speed and a better ratio than LZ4.
  • Level 5 (Compact): The best choice for Embedded, Firmware, or Archival. Better compression than LZ4 and significantly faster decoding than Zstd.

Block Size Tuning

The default block size is 256 KB, a conservative choice that balances compression quality, memory usage, and random-access granularity. For bulk/archival workloads where maximum throughput matters, 512 KB blocks are recommended.

Why larger blocks help: Each block starts with a cold hash table, so the LZ match-finder has no history and produces more literals until the table warms up. Doubling the block size halves the number of cold-start penalties, improving both ratio and decompression speed.

| Block Size | Memory (per context) | Ratio (level -3) | Decompression gain vs 256 KB |
| --- | --- | --- | --- |
| 256 KB (default) | ~1.7 MB | 46.36% | baseline |
| 512 KB | ~3.3 MB | 45.81% (−0.55 pp) | +1% to +8% depending on CPU |

# CLI
zxc -B 512K -5 input_file output_file

# API
zxc_compress_opts_t opts = {
    .level      = ZXC_LEVEL_COMPACT,
    .block_size = 512 * 1024,
};

Guideline: Use 256 KB (default) for streaming, embedded, or memory-constrained environments. Use 512 KB for bulk compression pipelines, CI/CD asset packaging, and high-throughput servers.


Usage

1. CLI

The CLI is perfect for benchmarking or manually compressing assets.

# Basic Compression (Level 3 is default)
zxc -z input_file output_file

# High Compression (Level 5)
zxc -z -5 input_file output_file

# Seekable Archive (enables O(1) random-access decompression)
zxc -z -S input_file output_file

# The -z flag can be omitted (compression is the default)
zxc input_file output_file

# The output file can also be omitted; it defaults to input_file.zxc
zxc input_file

# Decompression
zxc -d compressed_file output_file

# Benchmark Mode (Testing speed on your machine)
zxc -b input_file

Using with tar

ZXC works as a drop-in external compressor for tar (reads stdin, writes stdout, returns 0 on success):

# GNU tar (Linux)
tar -I 'zxc -5' -cf archive.tar.zxc data/
tar -I 'zxc -d' -xf archive.tar.zxc

# bsdtar (macOS)
tar --use-compress-program='zxc -5' -cf archive.tar.zxc data/
tar --use-compress-program='zxc -d' -xf archive.tar.zxc

# Pipes (universal)
tar cf - data/ | zxc > archive.tar.zxc
zxc -d < archive.tar.zxc | tar xf -

2. API

ZXC provides a thread-safe API with two usage patterns. Parameters are passed through dedicated options structs, making call sites self-documenting and forward-compatible.

Buffer API (In-Memory)

#include "zxc.h"

// Compression
uint64_t bound = zxc_compress_bound(src_size);
zxc_compress_opts_t c_opts = {
    .level            = ZXC_LEVEL_DEFAULT,
    .checksum_enabled = 1,
    /* .block_size = 0 -> 256 KB default */
};
int64_t compressed_size = zxc_compress(src, src_size, dst, bound, &c_opts);

// Decompression
zxc_decompress_opts_t d_opts = { .checksum_enabled = 1 };
int64_t decompressed_size = zxc_decompress(src, src_size, dst, dst_capacity, &d_opts);
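
For real call sites you will usually want allocation and error checks around the snippet above. A fuller round-trip sketch; note that the negative-return-on-error convention is an assumption, since this excerpt does not document error codes:

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include "zxc.h"

/* Compress a buffer, decompress it, and verify the round trip.
   Assumes (not confirmed here) that zxc_compress/zxc_decompress
   return a negative value on failure. Returns 0 on success. */
int roundtrip(const uint8_t *src, uint64_t src_size) {
    uint64_t bound = zxc_compress_bound(src_size);
    uint8_t *comp = malloc(bound);
    uint8_t *out  = malloc(src_size);
    if (!comp || !out) { free(comp); free(out); return -1; }

    zxc_compress_opts_t c_opts = { .level = ZXC_LEVEL_DEFAULT, .checksum_enabled = 1 };
    int64_t csz = zxc_compress(src, src_size, comp, bound, &c_opts);
    if (csz < 0) { free(comp); free(out); return -1; }

    zxc_decompress_opts_t d_opts = { .checksum_enabled = 1 };
    int64_t dsz = zxc_decompress(comp, (uint64_t)csz, out, src_size, &d_opts);

    int ok = (dsz == (int64_t)src_size) &&
             memcmp(src, out, (size_t)src_size) == 0;
    free(comp); free(out);
    return ok ? 0 : -1;
}
```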

Stream API (Files, Multi-Threaded)

#include "zxc.h"

// Compression (auto-detect threads, level 3, checksum on)
zxc_compress_opts_t c_opts = {
    .n_threads        = 0,               // 0 = auto
    .level            = ZXC_LEVEL_DEFAULT,
    .checksum_enabled = 1,
    /* .block_size = 0 -> 256 KB default */
};
int64_t bytes_written = zxc_stream_compress(f_in, f_out, &c_opts);

// Decompression
zxc_decompress_opts_t d_opts = { .n_threads = 0, .checksum_enabled = 1 };
int64_t bytes_out = zxc_stream_decompress(f_in, f_out, &d_opts);

Reusable Context API (Low-Latency / Embedded)

For tight loops (e.g. filesystem plug-ins) where per-call malloc/free overhead matters, use opaque reusable contexts. Options are sticky - settings from zxc_create_cctx() are reused when passing NULL:

#include "zxc.h"

zxc_compress_opts_t opts = { .level = 3, .checksum_enabled = 0 };
zxc_cctx* cctx = zxc_create_cctx(&opts);   // allocate once, settings remembered
zxc_dctx* dctx = zxc_create_dctx();        // allocate once

// reuse across many blocks - NULL reuses sticky settings:
int64_t csz = zxc_compress_cctx(cctx, src, src_sz, dst, dst_cap, NULL);
int64_t dsz = zxc_decompress_dctx(dctx, dst, csz, out, src_sz, NULL);

zxc_free_cctx(cctx);
zxc_free_dctx(dctx);

Features:

  • Caller-allocated buffers with explicit bounds
  • Thread-safe (stateless)
  • Configurable block sizes (4 KB – 2 MB, powers of 2)
  • Multi-threaded streaming (auto-detects CPU cores)
  • Optional checksum validation
  • Reusable contexts for high-frequency call sites
  • Seekable archives: optional seek table for O(1) random-access decompression (.seekable = 1)

See complete examples and advanced usage ->

Language Bindings


Official wrappers maintained in this repository:

| Language | Package Manager | Install Command | Documentation | Author |
| --- | --- | --- | --- | --- |
| Rust | crates.io | cargo add zxc-compress | README | @hellobertrand |
| Python | PyPI | pip install zxc-compress | README | @nuberchardzer1 |
| Node.js | npm | npm install zxc-compress | README | @hellobertrand |
| Go | go get | go get github.com/hellobertrand/zxc/wrappers/go | README | @hellobertrand |
| WASM | Build from source | emcmake cmake -B build-wasm && cmake --build build-wasm | README | @hellobertrand |

Community-maintained bindings:

| Language | Package Manager | Install Command | Repository | Author |
| --- | --- | --- | --- | --- |
| Go | pkg.go.dev | go get github.com/meysam81/go-zxc | https://github.com/meysam81/go-zxc | @meysam81 |
| Nim | nimble | nimble install zxc | https://github.com/openpeeps/zxc-nim | @georgelemon |
| Free Pascal | Build from source | Clone the repository | https://github.com/Xelitan/Free-Pascal-port-of-ZXC-compressor-decompressor | @Xelitan |

Safety & Quality

  • Unit Tests: Comprehensive test suite with CTest integration.
  • Continuous Fuzzing: Integrated with ClusterFuzzLite suites.
  • Static Analysis: Checked with Cppcheck & Clang Static Analyzer.
  • CodeQL Analysis: GitHub Advanced Security scanning for vulnerabilities.
  • Code Coverage: Automated tracking with Codecov integration.
  • Dynamic Analysis: Validated with Valgrind and ASan/UBSan in CI pipelines.
  • Safe API: Explicit buffer capacity is required for all operations.

License & Credits

ZXC Copyright © 2025-2026, Bertrand Lebonnois and contributors. Licensed under the BSD 3-Clause License. See LICENSE for details.

Third-Party Components:

  • rapidhash by Nicolas De Carli (MIT) - Used for high-speed, platform-independent checksums.