tsvkit

April 8, 2026 · View on GitHub

tsvkit is a fast, ergonomic toolkit for working with TSV tables. Written in Rust, it brings familiar data-wrangling verbs (join, cut, filter, mutate, summarize, reshape, slice, pretty-print) to the command line with consistent column selection, rich expressions, and streaming-friendly performance. The CLI is inspired by projects such as csvtk, csvkit, datamash, awk, xsv, and mlr, and many options are intentionally compatible with csvtk so existing users can adapt quickly.

Overview
- Key features
Installation
Sample data
Quick start pipeline
Command overview
Core concepts
Command reference
- info
- cut
- filter
- join
- mutate
- summarize
- sort
- melt
- pivot
- slice
- head
- pretty
- transpose
- excel
- csv
Additional tips

Overview

tsvkit combines versatile column selection with an expression engine for statistics, filtering, and data transformation. This makes it straightforward to join multiple files and select column from each file to generate data matrix (e.g. gene count table), filter row based on selected columns, compute multi-column summaries, or pipe TSV/Excel data through complex workflows without leaving the shell. Multi-sheet Excel workbooks are supported alongside .tsv, .tsv.gz, and .tsv.xz files.

Key features

Stream-friendly processing; every command reads from files or standard input and writes to standard output.
Column selectors that accept names, 1-based indices, ranges, regexes, and multi-file specifications.
Expression language with arithmetic, comparisons, logical operators, list membership, regex matching, and numeric helper functions.
Aggregations for grouped summaries (summarize) and row-wise calculations (mutate).
Excel tooling to inspect, preview, export, and assemble .xlsx workbooks.

Installation

cargo build --release
# binary available at target/release/tsvkit

Run any command with --help to see detailed usage and concrete examples:

tsvkit --help
tsvkit join --help

Sample data

Curated example tables live under examples/ and power the walkthroughs below.

File	Description
`abundance.tsv`	Mock metagenomic counts with taxonomic kingdoms for quick pivot/stack demos.
`cytokines.tsv`	Wide cytokine panel with one row per sample.
`expression.tsv`	Long-form gene expression measurements (`sample_id`, `gene`, `expression`).
`metadata.tsv`	Minimal ID → group lookup for join/filter illustrations.
`profiles.tsv`	Wide expression profiles suitable for melt/pivot walkthroughs.
`qc.tsv`	Sequencing QC metrics (reads, mapped percentage, duplication rate).
`samples.tsv`	Sample-level metrics (purity, yields, contamination) for six RNA-seq libraries.
`scores.tsv`	Tidy student scores table for summarize/group-by coverage.
`subjects.tsv`	Subject demographics linked to samples via `subject_id`.
`bioinfo_example.xlsx`	Two-sheet workbook (`Samples`, `Cytokines`) built from the TSVs above for the Excel tooling.

Quick start pipeline

Join sample and subject metadata, derive a total cytokine score, filter high-purity case samples, and pretty-print the result:

cat examples/cytokines.tsv \
  | tsvkit mutate -e 'total=sum($IL6:$IL10)' -e 'log_total=log2($total)' \
  | tsvkit join -f 'sample_id;sample_id' -k 0 - examples/samples.tsv \
  | tsvkit filter -e '$group == "case" & $purity >= 0.94' \
  | tsvkit cut -f 'sample_id:IL10,log_total:purity' \
  | tsvkit pretty

Tip: wrap every -e expression in single quotes so your shell keeps $column selectors intact. Inside an expression, always prefix column references with $ (e.g. $total, $1).

Output

+-----------+-----+-----+------+------+-----------+------------+-------+-----------+--------+
| sample_id | IL6 | TNF | IFNG | IL10 | log_total | subject_id | group | timepoint | purity |
+-----------+-----+-----+------+------+-----------+------------+-------+-----------+--------+
| S01       | 4.2 | 3.1 | 6.8  | 2.4  | 4.044394  | P001       | case  | baseline  | 0.94   |
| S03       | 4.9 | 3.6 | 7.4  | 2.6  | 4.209453  | P001       | case  | week4     | 0.96   |
+-----------+-----+-----+------+------+-----------+------------+-------+-----------+--------+

The same pipeline works if the cytokine table is compressed (examples/cytokines.tsv.gz).

Command overview

The list below provides a one-line description of every tsvkit subcommand. Each item links to the detailed section later in this guide.

info — inspect table shape, inferred column types, and sample values.
cut — select or reorder columns via names, indices, or ranges.
filter — keep rows matching an expression (math, logic, regex, functions).
join — join multiple TSVs on key columns with parallel loading and fill values.
mutate — create or overwrite columns using expressions, aggregates, and string helpers.
summarize — group rows and compute aggregates (mean, median, quantiles, etc.).
sort — sort rows by one or more keys with numeric/text modifiers.
melt — convert wide tables into tidy long form with variable/value pairs.
pivot — convert long form back to wide with optional fill value for missing cells.
slice — extract rows by 1-based indices or ranges.
head — preview the first rows as a boxed table (single or multiple files).
pretty — render aligned, boxed tables for quick inspection or sharing.
transpose — transpose rows and columns.
excel — inspect, preview, export, or build .xlsx workbooks.
csv — convert delimited text to TSV with custom separators.

Core concepts

These conventions appear across the toolkit; understanding them once makes each subcommand predictable.

Column selectors

Selectors are reused in cut, filter, join, mutate, summarize, and others.

Pattern	Meaning	Example
`name`	Column by header name.	`sample_id,purity`
`index`	1-based column index.	`1,4,9`
`-index`	Column counted from the end (1 = last).	`-1,-2`
`start:end`	Inclusive range by name or index. Supports open ends.	`IL6:IL10`, `2:5`, `:IL10`, `IL6:`
`~"regex"`	Columns whose names match the regular expression. Requires headers.	`~"^sample_"`
`:`	Select every column in order.	`-f ':'`
`mixed`	Combine names, indices, ranges, and regexes.	`sample_id,3:5,~"_pct$"`
`multi-file`	Separate selectors for each input with semicolons (primarily `join`).	`sample_id;subject_id`
`range in expressions`	Prefixed with `$` to access a slice of values.	`$IL6:$IL10`

Wrap selectors in backticks or braces to treat punctuation literally. For example, -f '`IL6:IL10`,`total,reads`' or -f '{IL6:IL10},{total,reads}' selects columns named IL6:IL10 and total,reads instead of expanding a range or splitting on the comma.

Negative indices are also valid inside ranges: :-2 selects every column except the final two, while -3: keeps the last three columns. Regex selectors deduplicate by first match; add --allow-dups (or -D) on cut/summarize when you need repeated columns.

Regex selectors require a header row. When -H/--no-header is active, using ~"..." results in an error with guidance to remove the regex or restore headers.

Anywhere you access column values inside an expression, prefix the selector with $ ($purity, $1, $IL6:$IL10).

Streaming and file handling

Every command accepts files or - (stdin) and auto-detects .tsv, .tsv.gz, and .tsv.xz inputs.
Add -H/--no-header when your data lacks a header row; selectors fall back to 1-based indices.
Use -C/--comment-char, -E/--ignore-empty-row, and -I/--ignore-illegal-row on any subcommand to control how input lines are filtered before processing.
tsvkit join parallelizes input loading; control the worker count with -t/--threads (defaults to the lesser of 8 and the available CPU cores).
--fill TEXT lets join, melt, pivot (and others) swap empty cells for a custom placeholder.

Expression language essentials

The same expression language powers filter -e, mutate -e name=EXPR, and regex substitutions. Wrap expressions in single quotes to protect $columns from the shell.

Operators and comparisons

Symbol / keyword	Description	Works on
`+ - * / ^`	Arithmetic operators (`^` is exponentiation, right-associative).	Numbers
`== != < <= > >=`	Comparisons.	Numbers or strings
`&` / `and`	Logical AND.	Booleans
`\|` / `or`	Logical OR.	Booleans
`!` / `not`	Logical negation.	Booleans
`~`	Regex match. Right-hand side can be literal text or a `$range`.	Strings
`!~`	Regex does not match.	Strings
`in`	Membership test against a list literal or numeric range.	`$group in ["case","control"]`
`!in`	Negated membership test.	`$status !in ["fail","missing"]`

Reference columns whose names contain operators or punctuation with ${column-name} inside expressions (e.g. ${dna-} - $rna_ug). This prevents the parser from treating the characters as arithmetic.

List literals use square brackets: [1,2,3], ["case","control"], [IL6:IL10]. Combine them with in/!in to test membership, or pass them to helper functions that accept lists.

Numeric helper functions

Function	Description
`abs(expr)`	Absolute value
`sqrt(expr)`	Square root
`exp(expr)` / `exp2(expr)`	Exponential (`e^x`) / base-2 exponential
`ln(expr)`	Natural logarithm
`log(expr)` / `log10(expr)`	Base-10 logarithm
`log2(expr)`	Base-2 logarithm
`len(expr)`	Character count using Unicode code points.
`is_na(expr)`	Returns `1` when the expression is blank/`NA`/`NaN`, otherwise `0`.
`upper(expr)`	Convert text to uppercase.
`lower(expr)`	Convert text to lowercase.
`cap(expr)`	Capitalize only the first character.

Functions accept column references (abs($purity - 1)), constants, or subexpressions. Empty or non-numeric values yield blanks.

Conditional and regex helpers

case_when(condition -> result, ..., _ -> default) evaluates each boolean condition in order and returns the matching result. The final _ branch acts as the default.
switch(value, [match1, match2] -> result, ..., _ -> default) compares value to one or more literal matches (strings or numbers) and returns the corresponding result.
re(value, pattern) evaluates a regex against value, returning 1 or 0. When the pattern matches, capture groups become available as $1, $2, etc. for the remainder of the expression (use $0-style numeric selectors sparingly when you rely on captures).

Example:

case_when(
    re($sample, "^ERR(\\d+)$") -> \$1,
    re($sample, "^SRR")          -> "SRA",
    _                              -> $sample
)

Row-wise aggregation helpers

Available within mutate expressions via functions such as sum($col1:$col5); see the Mutate section for the full list.

Command reference

Each subsection highlights the core options, shows realistic invocations, and calls out relevant selectors or expressions.

`info`

Get a quick, structured summary of any TSV: the overall shape plus one row per column with inferred types and sample values. The preview column defaults to the first three rows, but you can raise or lower it with -n (e.g. -n 5). Combine with -H when the input has no header row so the summary omits the name column.

tsvkit info examples/samples.tsv

Output

#shape(6, 9)
index   name            type    first3
1       sample_id       str     [S01, S02, S03]
2       subject_id      str     [P001, P002, P001]
3       group           str     [case, control, case]
4       timepoint       str     [baseline, baseline, week4]
5       purity          num     [0.94, 0.90, 0.96]
6       dna_ug          num     [25.3, 22.8, 27.4]
7       rna_ug          num     [18.1, 17.5, 19.8]
8       contamination_pct       num     [0.02, 0.03, 0.01]
9       tech            str     [sRNA-seq, sRNA-seq, sRNA-seq]

`cut`

Select or reorder columns by name, index, or range.

tsvkit cut -f 'sample_id,group,purity,tech' examples/samples.tsv

Ranges expand consecutive columns automatically:

tsvkit cut -f 'sample_id,IL6:IL10' examples/cytokines.tsv

Regex selectors pick up columns whose headers match a pattern. Combine them with names, indices, and ranges in any order:

tsvkit cut -f '1,group,~"^IL",~"_pct$"' examples/qc.tsv

Injecting the source file basename (__base__) or filename with path (__file__):

tsvkit cut -f '__base__,1:' examples/qc*.tsv

Explanation:

__base__ injects the source filename as the first column
1: selects all existing columns from each input file

You can now inject file-derived values directly in -f with template selectors:

tsvkit cut -f '{file},{base:},1:2' examples/qc.tsv
tsvkit cut -f 'sample={base:#sample_!lower},1:' sample_A.tsv

Use --inject-col-names (aliases: --file-col, --fc) to rename injected columns. If you have multiple injected selectors in -f, pass comma-separated names in order:

tsvkit cut --inject-col-names sample -f '__base__,1:' examples/qc*.tsv
tsvkit cut --inject-col-names file_name,sample -f '{base:},sample={base:#sample_!upper},1:2' sample_A.tsv

Template tokens:

{file} / {__file__} full path
{base} / {__base__} basename
{dir} / {__dir__} parent dir
{base:} basename without all extensions
{base.} basename without last extension
{file%} basename of {file}
{file/} directory of {file}
{file^suffix} remove a literal trailing suffix when present
{base:#prefix} remove a literal prefix when present (example: {base:#sample_})
case controls: append !upper, !lower, or !cap (for example {base:!upper})

Shell tip: ! is interpreted by many shells in double quotes. Use single quotes for selector/template expressions like '{base:!lower},1:3' or 'sample={base:!lower},1:'. If you must use double quotes, escape it: "{base:\!lower}".

Negative selectors in cut -f:

-1 = last column
-2 = second-last column
-2: = from second-last to the final column

Matches deduplicate by default; add -D/--allow-dups to keep every occurrence when multiple selectors target the same column.

Useful operational flags:

-H/--no-header for headerless TSVs (selectors are index-based)
-C/--comment-char to skip comment lines (default #)
-E/--ignore-empty-row to skip blank rows
-I/--ignore-illegal-row to skip rows with inconsistent column counts

`filter`

Filter rows with boolean logic, arithmetic, column ranges, regexes, and list membership tests.

tsvkit filter -e '$group == "case" & $purity >= 0.94' examples/samples.tsv

tsvkit filter -e '$status !in ["fail","missing","error"] & $tech ~ "sRNA"' examples/samples.tsv

Case helpers are supported in filter expressions:

tsvkit filter -e 'cap(\$1) == "HELLO"' data.tsv
tsvkit filter -e 'upper($group) == "CASE"' examples/samples.tsv

Expression building blocks for filter

Building block	Examples	Notes
Column values	`$purity`, `$1`, `$IL6:$IL10`	Ranges produce a list; use in regex matches or aggregate helpers.
Literals	`1.25`, `"case"`	Strings use double quotes; escape inner quotes with `\"`.
Arithmetic	`($rna_ug - $dna_ug) / $rna_ug`	Standard precedence applies (parentheses for clarity).
Comparisons	`$purity >= 0.9`, `$group != "control"`	Works on numeric or string data.
Logical	`($purity >= 0.9) & ($group == "case")`	`&`, `\|`, and `!` (or `and`, `or`, `not`).
Numeric functions	`log2($total)`, `sqrt($reads)`	See Expression language essentials.
Row-wise aggregators	`sum($dna_ug:$rna_ug)`, `mode(\$1,\$3)`, `countunique($gene:)`	Same catalog as `summarize`: totals, quantiles (`q` / `p`), variance/SD, products, entropy, argmin/argmax, membership stats. Works with ranges, lists, and open selectors.
Regex match	`$tech ~ "sRNA"`, `$notes !~ "(?i)fail"`	Patterns follow Rust `regex` syntax. `(?i)` enables case-insensitive matching.
Regex across ranges	`$gene:$notes ~ "kinase"`, `~ "control"`	When the left-hand side is omitted, `~` scans all columns.
Membership	`$group in ["case","control"]`, `$rank in [1:3]`	Right-hand side must be a list literal or numeric range.

Regex usage at a glance

Pattern	Description
`$col ~ "^ABC"`	Keep rows where the column starts with `ABC`.
`$col !~ "xyz$"`	Exclude rows where the column ends with `xyz`.
`$A:$C ~ "kinase"`	Match if any column in the range contains `kinase`.
`~ "(?i)na"`	Match if any column (entire row) contains `na`, case-insensitive.

`join`

Merge tables on shared keys. Provide selectors with -f/--fields; when all inputs use the same key you can specify it once.

tsvkit join -f subject_id examples/samples.tsv examples/subjects.tsv

Control join type with -k (-k 0 = full outer). Use -F/--select to specify output columns (defaults to all non-key columns); syntax mirrors -f. --fill TEXT supplies placeholders for missing combinations, while --sorted streams pre-sorted data. tsvkit join trims unused columns before indexing, and -t/--threads (default up to 8) balances throughput and resource usage.

Use --add-header to override emitted non-key header names with per-file/per-column templates:

tsvkit join \
  -f 'subject_id;subject_id' \
  -F 'group,purity;sex,age' \
  --add-header '{base:}_group,{base:}_purity;{base:}_sex,{base:}_age' \
  examples/samples.tsv examples/subjects.tsv

Formatting rules: split files with ;, columns with ,, and keep counts aligned with -F for each file. Template tokens are shared with cut -f '{...}' template selectors.

When using -H (no input header) together with --add-header, join emits a header row:

join-key columns are named index1, index2, ..., indexN by default
non-key columns use your --add-header templates.

Use --key-header (alias --index-name) to rename join-key columns explicitly. Provide comma-separated names and match the number of join columns.

Example:

tsvkit join -H \
  -f '1;1' \
  --key-header 'sample_id' \
  --add-header 'patient_{base:#sample_}' \
  sample_A.tsv sample_B.tsv

`mutate`

Create derived columns or rewrite values using expressions.

tsvkit mutate \
  -e 'total=sum($IL6:$IL10)' \
  -e 'log_total=log2($total)' \
  -e 'label=sub($sample_id,"S","Sample_")' \
  examples/cytokines.tsv

Use case_when, switch, and the re() helper for richer branching logic and regex capture reuse:

tsvkit mutate \
  -e 'label = case_when(
        re($sample, "^ERR(\d+)$") -> \$1,
        re($sample, "^SRR")        -> "SRA",
        _                            -> $sample
      )' \
  -e 'bucket = case_when(len($clean) == 0 -> "empty", len($clean) < 5 -> "short", _ -> "long")' \
  examples/samples.tsv

Apply in-place edits with the sed-style form:

tsvkit mutate -e 's/$group/ctrl/control/' examples/samples.tsv

Multiple expressions can be packed into one -e clause using ;:

tsvkit mutate -e 'v1=\$7/\$8;v2=\$11/\$12' data.tsv

Create new columns from regex replacement with:

tsvkit mutate -e 'new=s/\$2/aa[0-9]+/bb/' data.tsv
tsvkit mutate -e 'new2=${1/aa/bb}' data.tsv

String case helpers can be used directly in mutate expressions:

tsvkit mutate -e 'v1=cap(\$2)' -e 'v2=upper($group)' data.tsv

Mutation building blocks

Form	Meaning	Example
`name=EXPR`	Append a new column containing the evaluated expression.	`mean_signal=mean($sig1:$sig4)`
`existing=EXPR`	Overwrite an existing column with the expression result.	`purity=round($purity,2)` (via custom helper script)
`s/$selectors/pattern/replacement/`	Regex substitution on one or more columns (`$` optional).	`s/$group/ctrl/control/`
`new=s/$selector/pattern/replacement/`	Create a new column from one source column via regex replacement.	`new=s/\$2/aa/bb/`
`new=${selector/pattern/replacement}`	Braced shorthand for assignment substitution.	`new=${1/aa/bb}`

Row-wise aggregators shared by filter and mutate

Category	Functions (aliases)	Description
Totals & centers	`sum`, `mean`/`avg`, `median`/`med`, `trimmean`, `iqr`	Numeric summaries that ignore blanks; `iqr` computes `q3 - q1`.
Dispersion	`sd`/`std`/`stddev`, `var`/`variance`, `entropy`	Spread metrics and Shannon entropy of the value distribution.
Extremes & positions	`min`, `max`, `absmin`, `absmax`, `argmin`, `argmax`	`arg*` return the 1-based position among numeric entries.
Membership & counts	`count`, `first`, `last`, `rand`/`random`, `unique`, `collapse`, `countunique`/`distinct`, `mode`, `antimode`	Operate on the original strings (including duplicates and blanks).
Products	`prod`/`product`	Multiply all numeric inputs (skips blanks and NaNs).
Quantiles	`q` (`q1`, `q0.9`, `q_0_25`), `p` (`p95`, `p99.5`)	Fractions `0–1` and percents `0–100`; underscores may replace dots.

Aggregators accept any range, list, or open-ended selector (sum(\$1,\$3:)). Non-numeric cells are skipped for numeric summaries. Results are appended as new columns unless you assign them back to an existing name.

`summarize`

Group rows and compute descriptive statistics. Without -g/--group, the entire table is treated as a single group.

tsvkit summarize \
  -g group \
  -s 'purity=mean,sd' \
  -s 'dna_ug:contamination_pct=q1,q3' \
  examples/samples.tsv

Regex selectors work here as well, so you can summarize whole families of columns in one shot:

tsvkit summarize -g patient -s '~"^sample_"=mean,sd' -D cohort.tsv

Aggregators supported by summarize

Counts & membership

Aggregator (aliases)	Description	Output type
`count`	Number of rows in the group (ignores blanks).	Numeric
`first`	First non-empty value encountered.	Original type
`last`	Last non-empty value encountered.	Original type
`rand` / `random`	Random value from the group.	Original type
`unique`	Comma-separated list of distinct values in encounter order.	String
`collapse`	Concatenate every value (comma-separated, includes duplicates).	String
`countunique` / `distinct`	Count of distinct values.	Numeric
`mode`	Most frequent value (ties resolved by first occurrence).	Original type
`antimode`	Least frequent value (ties resolved by first occurrence).	Original type
`entropy`	Shannon entropy calculated from value frequencies.	Numeric

Numeric summaries

Aggregator (aliases)	Description
`sum`	Sum of numeric values.
`mean` / `avg`	Arithmetic mean.
`median` / `med`	Median (50th percentile).
`trimmean`	Mean of values after trimming 25% from each tail.
`iqr`	Interquartile range (`q3 - q1`).
`sd` / `std` / `stddev`	Sample standard deviation.
`var` / `variance`	Sample variance.
`min` / `max`	Minimum / maximum value.
`absmin` / `absmax`	Value with the smallest / largest absolute magnitude (returned with original sign).
`prod` / `product`	Product of numeric values.
`argmin` / `argmax`	1-based row index within the group where the min/max occurs.

Quantiles

Pattern	Description
`q1`, `q2`, `q3`, `q4`	Quartiles (`q2` equals the median).
`q0`, `q0.25`, `q0.75`, `q0.9`	Fractional quantiles between 0 and 1 (underscores allowed instead of dots).
`p0`, `p25`, `p95`, `p99.5`	Percentiles between 0 and 100.

Quantile aggregators accept any q* (fraction) or p* (percent) token. Values may include decimals (q0.05, p99.5) or integers. Non-numeric cells are ignored for numeric summaries and quantiles. absmin, absmax, mode, antimode, and entropy inspect the original string values, so they work even without numeric conversion.

`sort`

Sort rows by one or more keys. Modifiers: :n (numeric), :nr (numeric descending), :r (reverse text).

tsvkit sort -k purity:nr -k contamination_pct examples/samples.tsv

`melt`

Convert wide tables into tidy long form. Add --fill TEXT to substitute blanks with a chosen value.

tsvkit melt -i sample_id -v IL6:IL10 examples/cytokines.tsv

`pivot`

Promote long-form values to columns. -c/--column also accepts the short alias -f, and --fill TEXT sets a default value for missing combinations.

tsvkit pivot -i gene -c sample_id -v expression examples/expression.tsv

`slice`

Take specific rows (1-based indices or ranges, including open-ended forms like :10, 10:, or even : for everything). Negative indices count from the end, so :-2 emits every row except the final two and -3: keeps the last three rows.

tsvkit slice -r 1,4:5 examples/samples.tsv

`head`

Print the first rows from TSV input (default -n 10). With one input (including stdin), output is plain TSV with no file banner; with multiple files, each block is prefixed by # <file>.

tsvkit head -n 5 examples/samples.tsv examples/subjects.tsv
cat examples/samples.tsv | tsvkit head -n 3

`pretty`

Render aligned, boxed output for quick inspection.

tsvkit filter -e '$group == "case"' examples/samples.tsv | tsvkit pretty

--round DIGITS (or -r) rounds numeric cells to the requested precision. Tiny magnitudes automatically switch to scientific notation so columns stay legible even when values approach zero.

`transpose`

Transpose a table (rows become columns). With headers, the header row is included in transposition; use -H for headerless input.

tsvkit transpose examples/samples.tsv

`excel`

Inspect .xlsx workbooks, preview sheets, export ranges as TSV, or assemble new workbooks from TSV inputs. Unless -H/--no-header is supplied, the first row of each sheet is treated as the header row; use that flag when you need to preview or export raw rows.

List sheets (--sheets) with row/column counts and inferred column types:

tsvkit excel --sheets examples/bioinfo_example.xlsx

Output

1     Samples rows=21 cols=4  types=[string,string,mixed,mixed]
2     Variants        rows=21 cols=4  types=[string,mixed,mixed,mixed]
3     Expression      rows=21 cols=3  types=[string,mixed,mixed]
4     Pathways        rows=11 cols=3  types=[string,string,mixed]
5     QC      rows=21 cols=4  types=[string,mixed,mixed,mixed]
6     ClinMetadata    rows=21 cols=4  types=[string,mixed,string,string]
7     Taxonomy        rows=21 cols=4  types=[string,mixed,mixed,mixed]
8     Coverage        rows=11 cols=2  types=[string,mixed]
9     Proteomics      rows=21 cols=3  types=[string,mixed,mixed]
10    Metabolites     rows=21 cols=2  types=[string,mixed]

Preview (--preview) the first rows of every sheet (header + N rows by default). Use -s to focus on one sheet, -n to change the window, --formulas to show Excel formulas instead of values, --dates raw|excel|iso to control date rendering (iso is the default), and --pretty to render the preview with aligned borders. Add -H/--no-header when the sheet lacks a header row so the preview shows raw rows only:

tsvkit excel --preview reports.xlsx -n 5 -s Summary
tsvkit excel --preview reports.xlsx --formulas --dates raw
tsvkit excel --preview reports.xlsx --pretty
tsvkit excel --preview examples/bioinfo_example.xlsx -n 3 --pretty

Output (Only the first three sheets are shown below; the actual output has ten.)

#1 Samples
+----------+-------+--------+--------+
| SampleID | Group | Purity | DNA_ug |
+----------+-------+--------+--------+
| S001     | Case  | 0.744  | 3.05   |
| S002     | Case  | 0.837  | 8.59   |
| S003     | Case  | 0.703  | 1.15   |
+----------+-------+--------+--------+

#2 Variants
+----------+------+--------+------+
| SampleID | SNPs | Indels | CNVs |
+----------+------+--------+------+
| S001     | 1217 | 130    | 13   |
| S002     | 3191 | 241    | 19   |
| S003     | 2463 | 210    | 8    |
+----------+------+--------+------+

#3 Expression
+-------+-----------+--------------+
| Gene  | Expr_Case | Expr_Control |
+-------+-----------+--------------+
| Gene1 | 12.46     | 9.49         |
| Gene2 | 5.6       | 14.43        |
| Gene3 | 6.99      | 5.05         |
+-------+-----------+--------------+
...

Dump (--dump) a sheet (or subset) to TSV. Columns accept names, indices, or Excel letters/ranges (e.g. A:C,Expr, :C, C:). Rows accept 1-based indices or inclusive ranges (1,10:20,:25,100:). --na replaces blanks, --escape-* makes TSV-safe output, and the same --values/--formulas + --dates controls apply. Use -H/--no-header when the sheet lacks a header row so column names fall back to indices:
```
tsvkit excel --dump examples/bioinfo_example.xlsx -s Samples -f 'SampleID,Group,Purity' -r 2:3 | tsvkit pretty
```
Output
```
+----------+-------+--------+
| SampleID | Group | Purity |
+----------+-------+--------+
| S002     | Case  | 0.837  |
| S003     | Case  | 0.703  |
+----------+-------+--------+
```
Load (--load) one or more TSV files into a new workbook. Each TSV can be followed by -s SHEETNAME. Use -H when the TSV lacks headers, --fields to supply header names in that case, --types infer|string to control numeric inference, --dates excel|iso|raw to influence how date strings are written, --na to treat specific tokens as blanks, and --max-rows-per-sheet to split very tall sheets (defaults to Excel's 1,048,576 rows):
```
tsvkit excel --load examples/samples.tsv -s Samples --load examples/cytokines.tsv -s Cytokines -o examples/testout.xlsx
```

Only .xlsx files are supported at the moment. Sheets created via --load are renamed Name (2), Name (3), … when row limits force splits. When -s is omitted the sheet falls back to its 1-based position (1, 2, …) in the load order, so a mixture of named and unnamed inputs still yields deterministic sheet names.

`csv`

Convert delimited text into TSV. Use --delim to specify the input delimiter (default ,) and -H when the source has no header row. The converter also mirrors common TSV-reader switches:

-C/--comment-char skips comment lines.
-E/--ignore-empty-row drops blank lines.
-I/--ignore-illegal-row skips rows whose width differs from the header/first row.
--na STR substitutes a string for empty fields (left blank by default).
--lazy-quotes treats stray quotes literally rather than erroring out.

Compressed inputs work via the same auto-detection as other commands.

tsvkit csv examples/data.csv > examples/data.tsv
tsvkit csv examples/semicolon.csv --delim ';' -H > tmp.tsv

Additional tips

tsvkit automatically detects .tsv, .tsv.gz, and .tsv.xz. Pipe from curl/zcat for other formats.
Numeric functions treat empty cells as missing; regex syntax follows Rust's regex crate.
For massive joins, pre-sort inputs and use join --sorted to keep memory usage flat.
Combine mutate, filter, and summarize to build complete pipelines directly on the command line.

Contributing

Issues and pull requests are welcome!