README.md

March 4, 2026


diffly: A utility package for comparing 🐻‍❄️ DataFrames



📖 Introduction

Diffly is a Python package for comparing Polars DataFrames in detail. It identifies differences between datasets, including schema differences, row-level mismatches, missing rows, and column value changes.

💿 Installation

You can install diffly using your favorite package manager:

pixi add diffly
conda install diffly
uv add diffly
pip install diffly

🎯 Usage

import polars as pl
from diffly import compare_frames

left = pl.DataFrame({
    "id": ["a", "b", "c"],
    "value": [1.0, 2.0, 3.0],
})

right = pl.DataFrame({
    "id": ["a", "b", "d"],
    "value": [1.0, 2.5, 4.0],
})

comparison = compare_frames(left, right, primary_key="id")

if not comparison.equal():
    summary = comparison.summary(
        top_k_column_changes=1,
        show_sample_primary_key_per_change=True
    )
    print(summary)
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃                                     Diffly Summary                                     ┃
┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛
   Primary key: id

 Schemas
 ▔▔▔▔▔▔▔
   Schemas match exactly (column count: 2).

 Rows
 ▔▔▔▔
   Left count             Right count
       3      (no change)      3

   ┏━┯━┯━┯━┯━┓
   ┃-│-│-│-│-┃                1  left only   (33.33%)
   ┠─┼─┼─┼─┼─┨╌╌╌┏━┯━┯━┯━┯━┓╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╮
   ┃ │ │ │ │ ┃ = ┃ │ │ │ │ ┃  1  equal       (50.00%)  │
   ┠─┼─┼─┼─┼─┨╌╌╌┠─┼─┼─┼─┼─┨╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌├╴  2  joined
   ┃ │ │ │ │ ┃ ≠ ┃ │ │ │ │ ┃  1  unequal     (50.00%)  │
   ┗━┷━┷━┷━┷━┛╌╌╌┠─┼─┼─┼─┼─┨╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╯
                 ┃+│+│+│+│+┃  1  right only  (33.33%)
                 ┗━┷━┷━┷━┷━┛

 Columns
 ▔▔▔▔▔▔▔
   ┌───────┬────────┬───────────────────────────┐
   │ value │ 50.00% │ 2.0 -> 2.5 (1x, e.g. "b") │
   └───────┴────────┴───────────────────────────┘

See more examples in the documentation.