Similarity Scoring Benchmarks

January 18, 2026

Benchmarks for string similarity and alignment algorithms across Rust and Python implementations, including CPU and GPU variants.

Overview

Edit distance calculation is a common component of search engines, data cleaning, natural language processing, and bioinformatics. It is computationally expensive: it is generally implemented with dynamic programming and has a quadratic worst-case time complexity. For biological sequences, the Needleman-Wunsch and Smith-Waterman algorithms are more appropriate, as they allow overriding the default substitution costs. Each of them comes in two flavors: with linear gap penalties and with affine gap penalties, the latter also known as the "Gotoh" variation.
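
As a point of reference for what is being benchmarked, here is a minimal sketch of the textbook dynamic program for unit-cost Levenshtein distance. It is not the implementation used by any of the libraries listed here, which rely on much faster techniques; the function name `levenshtein` is chosen for illustration only.

```python
def levenshtein(a: str, b: str) -> int:
    """Unit-cost edit distance via the classic two-row dynamic program."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop to save memory
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # deletion
                current[j - 1] + 1,            # insertion
                previous[j - 1] + (ca != cb),  # substitution, free on a match
            ))
        previous = current
    return previous[-1]
```

Every pair of characters triggers exactly one cell update, which is where the quadratic bound comes from: two strings of lengths m and n cost m × n updates.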

Performance is measured in MCUPS (Million Cell Updates Per Second).
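
A rough sketch of how such a figure can be derived from a timed run follows. It assumes that the number of cells for a pair of strings is the product of their lengths, as in the full dynamic programming matrix; the accounting in the actual benchmark harness may differ, and the helper name `mcups` is hypothetical.

```python
def mcups(pairs, seconds):
    """Million Cell Updates Per Second for a batch of string pairs.

    Assumes one cell per (character of a, character of b) combination.
    """
    cells = sum(len(a) * len(b) for a, b in pairs)
    return cells / seconds / 1e6
```

For example, scoring one pair of 1,000-byte lines touches 1,000,000 cells, so finishing it in 2 seconds would correspond to 0.5 MCUPS.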

Levenshtein Distance

| Library | ~100-byte lines | ~1,000-byte lines |
| --- | --- | --- |
| Rust | | |
| bio::levenshtein on 1x SPR | 428 MCUPS | 823 MCUPS |
| rapidfuzz::levenshtein<Bytes> on 1x SPR | 4,633 MCUPS | 14,316 MCUPS |
| rapidfuzz::levenshtein<Chars> on 1x SPR | 3,877 MCUPS | 13,179 MCUPS |
| stringzillas::LevenshteinDistances on 1x SPR | 3,315 MCUPS | 13,084 MCUPS |
| stringzillas::LevenshteinDistancesUtf8 on 1x SPR | 3,283 MCUPS | 11,690 MCUPS |
| stringzillas::LevenshteinDistances on 16x SPR | 29,430 MCUPS | 105,400 MCUPS |
| stringzillas::LevenshteinDistancesUtf8 on 16x SPR | 38,954 MCUPS | 103,500 MCUPS |
| stringzillas::LevenshteinDistances on RTX6000 | 32,030 MCUPS | 901,990 MCUPS |
| stringzillas::LevenshteinDistances on H100 | 31,913 MCUPS | 925,890 MCUPS |
| stringzillas::LevenshteinDistances on B200 | 32,960 MCUPS | 998,620 MCUPS |
| stringzillas::LevenshteinDistances on 384x GNR | 114,190 MCUPS | 3,084,270 MCUPS |
| stringzillas::LevenshteinDistancesUtf8 on 384x GNR | 103,590 MCUPS | 2,938,320 MCUPS |
| Python | | |
| nltk.edit_distance | 2 MCUPS | 2 MCUPS |
| jellyfish.levenshtein_distance | 81 MCUPS | 228 MCUPS |
| rapidfuzz.Levenshtein.distance | 108 MCUPS | 9,272 MCUPS |
| editdistance.eval | 89 MCUPS | 660 MCUPS |
| edlib.align | 82 MCUPS | 7,262 MCUPS |
| polyleven.levenshtein | 89 MCUPS | 3,887 MCUPS |
| stringzillas.LevenshteinDistances on 1x SPR | 53 MCUPS | 3,407 MCUPS |
| stringzillas.LevenshteinDistancesUTF8 on 1x SPR | 57 MCUPS | 3,693 MCUPS |
| cudf.edit_distance batch on H100 | 24,754 MCUPS | 6,976 MCUPS |
| stringzillas.LevenshteinDistances batch on 1x SPR | 2,343 MCUPS | 12,141 MCUPS |
| stringzillas.LevenshteinDistances batch on 16x SPR | 3,762 MCUPS | 119,261 MCUPS |
| stringzillas.LevenshteinDistances batch on H100 | 18,081 MCUPS | 320,109 MCUPS |

Needleman-Wunsch (Global Alignment)

| Library | ~100-byte lines | ~1,000-byte lines |
| --- | --- | --- |
| Rust | | |
| bio::pairwise::global on 1x SPR | 51 MCUPS | 57 MCUPS |
| stringzillas::NeedlemanWunschScores on 1x SPR | 278 MCUPS | 612 MCUPS |
| stringzillas::NeedlemanWunschScores on 16x SPR | 4,057 MCUPS | 8,492 MCUPS |
| stringzillas::NeedlemanWunschScores on 384x GNR | 64,290 MCUPS | 331,340 MCUPS |
| stringzillas::NeedlemanWunschScores on H100 | 131 MCUPS | 12,113 MCUPS |
| Python | | |
| biopython.PairwiseAligner.score on 1x SPR | 95 MCUPS | 557 MCUPS |
| stringzillas.NeedlemanWunschScores on 1x SPR | 30 MCUPS | 481 MCUPS |
| stringzillas.NeedlemanWunschScores batch on 1x SPR | 246 MCUPS | 570 MCUPS |
| stringzillas.NeedlemanWunschScores batch on 16x SPR | 3,103 MCUPS | 9,208 MCUPS |
| stringzillas.NeedlemanWunschScores batch on H100 | 127 MCUPS | 12,246 MCUPS |
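
For readers unfamiliar with the recurrence behind these numbers, the following is a minimal sketch of global alignment scoring with a linear gap penalty and a caller-supplied substitution function. It is an illustration under assumed conventions (scores are maximized, `gap` is a negative per-character penalty), not the API or the implementation of any benchmarked library; the name `needleman_wunsch_score` is hypothetical.

```python
def needleman_wunsch_score(a, b, substitution, gap=-1):
    """Global alignment score with a linear gap penalty (two-row DP)."""
    previous = [j * gap for j in range(len(b) + 1)]  # "" aligned to b[:j]
    for i, ca in enumerate(a, start=1):
        current = [i * gap]  # a[:i] aligned to ""
        for j, cb in enumerate(b, start=1):
            current.append(max(
                previous[j - 1] + substitution(ca, cb),  # align ca with cb
                previous[j] + gap,                       # gap in b
                current[j - 1] + gap,                    # gap in a
            ))
        previous = current
    return previous[-1]
```

With a substitution function returning 0 for a match and -1 for a mismatch, and `gap=-1`, the result is the negated Levenshtein distance, which shows how the edit distance is a special case of this scheme.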

Smith-Waterman (Local Alignment)

| Library | ~100-byte lines | ~1,000-byte lines |
| --- | --- | --- |
| Rust | | |
| bio::pairwise::local on 1x SPR | 49 MCUPS | 50 MCUPS |
| stringzillas::SmithWatermanScores on 1x SPR | 263 MCUPS | 552 MCUPS |
| stringzillas::SmithWatermanScores on 16x SPR | 3,883 MCUPS | 8,011 MCUPS |
| stringzillas::SmithWatermanScores on 384x GNR | 58,880 MCUPS | 285,480 MCUPS |
| stringzillas::SmithWatermanScores on H100 | 143 MCUPS | 12,921 MCUPS |
| Python | | |
| biopython.PairwiseAligner.score on 1x SPR | 95 MCUPS | 557 MCUPS |
| stringzillas.SmithWatermanScores on 1x SPR | 28 MCUPS | 440 MCUPS |
| stringzillas.SmithWatermanScores batch on 1x SPR | 255 MCUPS | 582 MCUPS |
| stringzillas.SmithWatermanScores batch on 16x SPR | 3,535 MCUPS | 8,235 MCUPS |
| stringzillas.SmithWatermanScores batch on H100 | 130 MCUPS | 12,702 MCUPS |
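
Local alignment differs from global alignment in two ways: no cell may drop below zero, and the result is the best cell anywhere in the matrix rather than the last one. The sketch below illustrates this under the same assumed conventions (maximized scores, negative linear `gap` penalty); the name `smith_waterman_score` is hypothetical and the code does not reflect any benchmarked library's internals.

```python
def smith_waterman_score(a, b, substitution, gap=-1):
    """Best local alignment score with a linear gap penalty (two-row DP)."""
    best = 0
    previous = [0] * (len(b) + 1)  # an empty alignment always scores 0
    for ca in a:
        current = [0]
        for j, cb in enumerate(b, start=1):
            cell = max(
                0,                                       # restart the alignment
                previous[j - 1] + substitution(ca, cb),  # align ca with cb
                previous[j] + gap,                       # gap in b
                current[j - 1] + gap,                    # gap in a
            )
            current.append(cell)
            best = max(best, cell)
        previous = current
    return best
```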

Needleman-Wunsch-Gotoh (Affine Gap Penalties)

| Library | ~100-byte lines | ~1,000-byte lines |
| --- | --- | --- |
| Rust | | |
| stringzillas::NeedlemanWunschScores on 1x SPR | 83 MCUPS | 354 MCUPS |
| stringzillas::NeedlemanWunschScores on 16x SPR | 1,267 MCUPS | 4,694 MCUPS |
| stringzillas::NeedlemanWunschScores on 384x GNR | 42,050 MCUPS | 155,920 MCUPS |
| stringzillas::NeedlemanWunschScores on H100 | 128 MCUPS | 13,799 MCUPS |
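
With affine gaps, opening a gap costs more than extending one, so each cell has to track whether it ends in a gap. A minimal sketch of the Gotoh recurrence for global scoring is given below. It assumes that `gap_open` is the total cost of the first gap character and `gap_extend` the cost of each further one; libraries differ on this convention, and the function name `gotoh_global_score` is hypothetical.

```python
def gotoh_global_score(a, b, substitution, gap_open=-3, gap_extend=-1):
    """Global alignment score with affine gap penalties (Gotoh recurrence)."""
    neg_inf = float("-inf")

    def boundary(k):
        # Score of aligning k characters against nothing: one gap of length k.
        return 0 if k == 0 else gap_open + (k - 1) * gap_extend

    prev_best = [boundary(j) for j in range(len(b) + 1)]
    prev_vertical = [neg_inf] * (len(b) + 1)  # best scores ending in a gap in b
    for i, ca in enumerate(a, start=1):
        best = [boundary(i)]
        vertical = [neg_inf]
        horizontal = neg_inf  # best score ending in a gap in a, current row
        for j, cb in enumerate(b, start=1):
            # Either extend an existing gap or open a new one.
            horizontal = max(horizontal + gap_extend, best[j - 1] + gap_open)
            down = max(prev_vertical[j] + gap_extend, prev_best[j] + gap_open)
            best.append(max(
                prev_best[j - 1] + substitution(ca, cb),  # align ca with cb
                horizontal,
                down,
            ))
            vertical.append(down)
        prev_best, prev_vertical = best, vertical
    return prev_best[-1]
```

Tracking the two extra gap states per cell roughly triples the work per update compared with the linear-gap recurrence. When `gap_open` equals `gap_extend`, the recurrence reduces to the linear-gap case.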

Smith-Waterman-Gotoh (Local with Affine Gaps)

| Library | ~100-byte lines | ~1,000-byte lines |
| --- | --- | --- |
| Rust | | |
| stringzillas::SmithWatermanScores on 1x SPR | 79 MCUPS | 284 MCUPS |
| stringzillas::SmithWatermanScores on 16x SPR | 1,026 MCUPS | 3,776 MCUPS |
| stringzillas::SmithWatermanScores on 384x GNR | 38,430 MCUPS | 129,140 MCUPS |
| stringzillas::SmithWatermanScores on H100 | 127 MCUPS | 13,205 MCUPS |

See README.md for dataset information and replication instructions.