Sequence Operations Benchmarks

January 18, 2026

Benchmarks for string collection operations like sorting across Rust and Python implementations.

Overview

Rust has several DataFrame libraries, database management systems, and search engines that rely heavily on string sorting and intersections. Those operations are mostly implemented using conventional algorithms:

  • Comparison-based Quicksort or Mergesort for sorting.
  • Hash-based or Tree-based algorithms for intersections.
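For reference, the conventional approach listed above can be sketched with the Rust standard library alone. This is a minimal illustration, not the code used by any of the benchmarked libraries; the function names `sort_strings` and `intersect_hashed` are hypothetical.

```rust
use std::collections::HashSet;

/// Sorts strings in place with the standard comparison-based unstable sort.
fn sort_strings(strings: &mut [String]) {
    strings.sort_unstable();
}

/// Hash-based intersection: index one collection, then probe it with the other.
/// Returns the shared strings in the order they first appear in `b`, without duplicates.
fn intersect_hashed<'a>(a: &'a [String], b: &'a [String]) -> Vec<&'a str> {
    // Build a lookup set over the first collection.
    let seen: HashSet<&str> = a.iter().map(String::as_str).collect();
    // Track what has already been emitted so repeated entries in `b` appear once.
    let mut emitted: HashSet<&str> = HashSet::new();
    b.iter()
        .map(String::as_str)
        .filter(|s| seen.contains(s) && emitted.insert(*s))
        .collect()
}
```

Both routines spend most of their time comparing or hashing string bytes, which is the part that SIMD acceleration targets.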

Assuming both string comparisons and hash functions can be accelerated with SIMD, StringZilla could already provide a performance boost in such applications. Starting with v4, it also provides specialized algorithms for sorting and intersections. These work directly with arbitrary collection types whose elements are string-comparable and support indexed access.
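To illustrate what sorting over an indexed collection means, here is a rough sketch of an argsort written with the standard library only. It is not StringZilla's API or algorithm; `argsort_by_bytes` is a hypothetical name, and a plain slice stands in for "a collection with indexed access". The function returns a permutation of indices and leaves the collection untouched.

```rust
/// Returns the permutation that orders `items` lexicographically by bytes.
/// The input collection is only read through indexing and never moved.
fn argsort_by_bytes<T: AsRef<[u8]>>(items: &[T]) -> Vec<usize> {
    // Start from the identity permutation 0, 1, ..., n-1.
    let mut order: Vec<usize> = (0..items.len()).collect();
    // Sort the indices by comparing the strings they point to.
    order.sort_unstable_by(|&i, &j| items[i].as_ref().cmp(items[j].as_ref()));
    order
}
```

Given `["pear", "apple", "fig"]`, the returned permutation is `[1, 2, 0]`. Working on indices keeps the original collection intact, which suits columnar or memory-mapped storage where moving the strings themselves would be costly.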

String Sorting

Library                                     Short Words             Long Lines
---------------------------------------------------------------------------------------
Rust
std::sort_unstable_by_key                   54.35 M compares/s      57.70 M compares/s
rayon::par_sort_unstable_by_key on 1x SPR   47.08 M compares/s      50.35 M compares/s
polars::Series::sort                        200.34 M compares/s     65.44 M compares/s
polars::Series::arg_sort                    25.01 M compares/s      14.05 M compares/s
arrow::lexsort_to_indices                   122.20 M compares/s     84.73 M compares/s
stringzilla::argsort_permutation            213.73 M compares/s     74.64 M compares/s
Python
list.sort on 1x SPR                         47.06 M compares/s      22.36 M compares/s
pandas.Series.sort_values on 1x SPR         9.39 M compares/s       11.93 M compares/s
pyarrow.compute.sort_indices on 1x SPR      62.17 M compares/s      5.53 M compares/s
polars.Series.sort on 1x SPR                223.38 M compares/s     181.60 M compares/s
cudf.Series.sort_values on H100             9'463.59 M compares/s   66.44 M compares/s
stringzilla.Strs.sorted on 1x SPR           171.13 M compares/s     77.88 M compares/s
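The table reports throughput in millions of compares per second. How the benchmark derives that figure is not reproduced here. As an assumption for illustration only, one way to obtain a comparable number is to count comparator calls during a sort and divide by the elapsed time. The helper names below are hypothetical.

```rust
use std::cell::Cell;
use std::time::Instant;

/// Sorts `strings` in place and returns (comparisons performed, elapsed seconds).
fn sort_counting_compares(strings: &mut [String]) -> (u64, f64) {
    // A Cell lets the comparator closure bump the counter through a shared reference.
    let compares = Cell::new(0u64);
    let start = Instant::now();
    strings.sort_unstable_by(|a, b| {
        compares.set(compares.get() + 1);
        a.cmp(b)
    });
    (compares.get(), start.elapsed().as_secs_f64())
}

/// Converts a comparison count and a duration into millions of compares per second.
fn mega_compares_per_second(compares: u64, seconds: f64) -> f64 {
    compares as f64 / seconds / 1e6
}
```

Counting inside the comparator adds a small overhead of its own. A harness could instead estimate the number of comparisons from the input size, which is another assumption rather than something the table states.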

See README.md for dataset information and replication instructions.