README.md
March 4, 2026
Introduction
Diffly is a Python package for comparing Polars DataFrames with detailed analysis capabilities. It identifies differences between datasets, including schema differences, row-level mismatches, missing rows, and column value changes.
Installation
You can install diffly using your favorite package manager:
pixi add diffly
conda install diffly
uv add diffly
pip install diffly
Usage
import polars as pl
from diffly import compare_frames

# Two frames that share the primary key column "id"
left = pl.DataFrame({
    "id": ["a", "b", "c"],
    "value": [1.0, 2.0, 3.0],
})
right = pl.DataFrame({
    "id": ["a", "b", "d"],
    "value": [1.0, 2.5, 4.0],
})

comparison = compare_frames(left, right, primary_key="id")

# Print a summary only when the frames differ
if not comparison.equal():
    summary = comparison.summary(
        top_k_column_changes=1,
        show_sample_primary_key_per_change=True,
    )
    print(summary)
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃                                     Diffly Summary                                     ┃
┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛
  Primary key: id

  Schemas
  ▔▔▔▔▔▔▔
    Schemas match exactly (column count: 2).

  Rows
  ▔▔▔▔
    Left count             Right count
        3      (no change)      3

    ┏━┯━┯━┯━┯━┓
    ┃-│-│-│-│-┃                      1 left only   (33.33%)
    ┠─┼─┼─┼─┼─┨╌╌╌┏━┯━┯━┯━┯━┓╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╮
    ┃ │ │ │ │ ┃ = ┃ │ │ │ │ ┃        1 equal       (50.00%)  │
    ┠─┼─┼─┼─┼─┨╌╌╌┠─┼─┼─┼─┼─┨╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌├╴ 2 joined
    ┃ │ │ │ │ ┃ ≠ ┃ │ │ │ │ ┃        1 unequal     (50.00%)  │
    ┗━┷━┷━┷━┷━┛╌╌╌┠─┼─┼─┼─┼─┨╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╯
                  ┃+│+│+│+│+┃        1 right only  (33.33%)
                  ┗━┷━┷━┷━┷━┛

  Columns
  ▔▔▔▔▔▔▔
    ┌───────┬────────┬───────────────────────────┐
    │ value │ 50.00% │ 2.0 -> 2.5 (1x, e.g. "b") │
    └───────┴────────┴───────────────────────────┘
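
To make the Rows section of the summary easier to read, here is a minimal sketch in plain Python that reproduces its counts for the two example frames. It is not diffly's implementation and uses neither diffly nor Polars. The helper names `row_counts` and `pct` are hypothetical, and the denominators (left row count, right row count, joined row count) are inferred from the sample output above rather than taken from the documentation.

```python
# Illustrative bookkeeping only; NOT how diffly computes its summary.
# The example frames, reduced to {primary key: value} mappings.
left = {"a": 1.0, "b": 2.0, "c": 3.0}
right = {"a": 1.0, "b": 2.5, "d": 4.0}


def row_counts(left: dict, right: dict) -> dict:
    """Classify primary keys into the four groups shown in the Rows section."""
    joined = sorted(left.keys() & right.keys())  # keys present on both sides
    return {
        "left_only": sorted(left.keys() - right.keys()),
        "right_only": sorted(right.keys() - left.keys()),
        "equal": [k for k in joined if left[k] == right[k]],
        "unequal": [k for k in joined if left[k] != right[k]],
    }


def pct(part: int, whole: int) -> str:
    """Format a share as a percentage with two decimals."""
    return f"{100 * part / whole:.2f}%"


counts = row_counts(left, right)
n_joined = len(counts["equal"]) + len(counts["unequal"])

# Assumed denominators: left rows, joined rows, joined rows, right rows.
print("left only: ", len(counts["left_only"]), pct(len(counts["left_only"]), len(left)))
print("equal:     ", len(counts["equal"]), pct(len(counts["equal"]), n_joined))
print("unequal:   ", len(counts["unequal"]), pct(len(counts["unequal"]), n_joined))
print("right only:", len(counts["right_only"]), pct(len(counts["right_only"]), len(right)))
```

Under these assumptions, row "c" is left only, row "d" is right only, and of the two joined rows "a" is equal while "b" is unequal, which matches the 33.33% and 50.00% figures in the summary.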
See more examples in the documentation.