README.md

September 8, 2021 ยท View on GitHub

Rustdoc simple-build-test

Rust library implementing xor filters

Implementation of Xor Filters: Faster and Smaller Than Bloom and Cuckoo Filters in rust-lang, Journal of Experimental Algorithmics (to appear).

This package is a port from its golang implementation.

How to use xorfilter in my rust project ?

Add the following under project's Cargo.toml:

[dependencies]
xorfilter-rs = "0.2.0"

or

[dependencies]
xorfilter-rs = { git = "https://github.com/bnclabs/xorfilter" }
use xorfilter::Xor8;

let mut keys: Vec<u64> = vec![];
for _ in 0..num_keys {
    keys.push(rng.gen());
}

let mut filter = Xor8::new(); // new filter.
filter.populate_keys(&keys); // populate keys.
filter.build(); // build bitmap.

for key in 0..lookup {
    // there can be false positives, but no false negatives.
    filter.contains_key(key);
}

Open issues

  • Serialize / Deserialize Xor8 type.
  • Incrementally adding keys to a pre-built Xor8 instance.
  • Gather benchmark results for other implementations - Go, C, C++, Erlang, Java, Python.

Benchmarks

Following are the results for a set of 10-million u64 keys:

build 10M keysmembershipFPPBits/Entry
Xor8-C1.206 secsNA0.389 %9.84 bits
Xor8-rust1.809 secs61.716 ns0.392 %9.84 bits
Fuse8-C0.508 secsNA0.390 %9.02 bits
Fuse8-rust0.577 secs42.657 ns0.392 %9.02 bits
Fuse16-C0.515 secsNA0.001 %18.04 bits
Fuse16-rust0.621 secs54.657 ns0.001 %18.03 bits
  • Build time is measured in Seconds, for 10 million entries.
  • Membership is measured in Nanosec, for single lookup in a set of 10 million entries.
  • FPP = False Positive Probability measured in percentage

Contribution

  • Simple workflow. Fork - Modify - Pull request.
  • Before creating a PR,
    • Run make build to confirm all versions of build is passing with 0 warnings and 0 errors.
    • Run check.sh with 0 warnings, 0 errors and all test-cases passing.
    • Run perf.sh with 0 warnings, 0 errors and all test-cases passing.
    • Install and run cargo spellcheck to remove common spelling mistakes.
  • Developer certificate of origin is preferred.