tpchgen-rs
August 26, 2025
Blazing fast TPCH benchmark data generator, in pure Rust with zero dependencies.
Features
- Blazing Speed
- Obsessively Tested
- Fully parallel, streaming, constant memory usage
Try it now
The easiest way to use this software is via the tpchgen-cli tool.
Performance
tpchgen-cli is more than 10x faster than the next fastest TPCH generator we
know of. On a 2023 Mac M3 Max laptop, it easily generates data faster than it
can be written to SSD. See BENCHMARKS.md for more
details on performance and benchmarking.
Times (minutes:seconds) to create TPCH tables in Parquet format using tpchgen-cli and DuckDB for various scale factors.
| Scale Factor | tpchgen-cli | DuckDB | DuckDB (proprietary) |
|---|---|---|---|
| 1 | 0:02.24 | 0:12.29 | 0:10.68 |
| 10 | 0:09.97 | 1:46.80 | 1:41.14 |
| 100 | 1:14.22 | 17:48.27 | 16:40.88 |
| 1000 | 10:26.26 | N/A (OOM) | N/A (OOM) |
- DuckDB (proprietary) is the time required to create TPCH data using the proprietary DuckDB format.
- Creating Scale Factor 1000 using DuckDB required 647 GB of memory, which is why no DuckDB time is reported for it in the table above.
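
To make the table easier to read as speedups, here is a small sketch that converts the `minutes:seconds` cells into seconds and divides the DuckDB time by the tpchgen-cli time. The helper names `parse_duration` and `speedup` are hypothetical and are not part of any tpchgen crate; the timings are copied from the Parquet columns of the table above.

```rust
/// Parse a duration written as `M:SS.ss` (minutes, then seconds) into seconds.
fn parse_duration(text: &str) -> Option<f64> {
    let (minutes, seconds) = text.trim().split_once(':')?;
    let minutes: f64 = minutes.parse().ok()?;
    let seconds: f64 = seconds.parse().ok()?;
    Some(minutes * 60.0 + seconds)
}

/// How many times faster `fast` is than `slow`, or `None` when either cell
/// is not a duration (for example "N/A (OOM)").
fn speedup(fast: &str, slow: &str) -> Option<f64> {
    let fast = parse_duration(fast)?;
    let slow = parse_duration(slow)?;
    Some(slow / fast)
}

fn main() {
    // (scale factor, tpchgen-cli, DuckDB) copied from the table above.
    let rows = [
        (1, "0:02.24", "0:12.29"),
        (10, "0:09.97", "1:46.80"),
        (100, "1:14.22", "17:48.27"),
        (1000, "10:26.26", "N/A (OOM)"),
    ];
    for (scale_factor, tpchgen, duckdb) in rows {
        match speedup(tpchgen, duckdb) {
            Some(ratio) => println!("SF {scale_factor}: {ratio:.1}x"),
            None => println!("SF {scale_factor}: no comparison available"),
        }
    }
}
```

With these numbers the ratios come out to roughly 5.5x, 10.7x, and 14.4x for scale factors 1, 10, and 100, so the gap widens as the data size grows.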

Answers
The core tpchgen crate provides answers for queries 1 to 22 at a scale factor
of 1. The answers were derived from the official TPC-H Tools
distribution.
Testing
This crate has extensive tests to ensure correctness and produces exactly the
same, byte-for-byte output as the original dbgen implementation. We compare
the output of this crate with dbgen as part of every check-in. See
TESTING.md for more details on testing methodology.
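
As an illustration of what a byte-for-byte comparison means in practice, the following minimal sketch reports the offset of the first differing byte between two streams. It is not the project's actual test harness (TESTING.md describes that); the function name `first_difference` and the sample rows are hypothetical, and only the Rust standard library is used.

```rust
use std::io::{self, BufReader, Read};

/// Return the offset of the first differing byte between two streams, or
/// `None` if they are byte-for-byte identical. A length mismatch counts as
/// a difference at the end of the shorter stream.
fn first_difference<A: Read, B: Read>(left: A, right: B) -> io::Result<Option<u64>> {
    let mut left = BufReader::new(left).bytes();
    let mut right = BufReader::new(right).bytes();
    let mut offset = 0u64;
    loop {
        match (left.next().transpose()?, right.next().transpose()?) {
            (None, None) => return Ok(None),
            (Some(a), Some(b)) if a == b => offset += 1,
            _ => return Ok(Some(offset)),
        }
    }
}

fn main() -> io::Result<()> {
    // In-memory stand-ins for a reference file and a generated file.
    // The row content is made up for illustration.
    let reference = b"1|alpha|\n2|beta|\n";
    let identical = b"1|alpha|\n2|beta|\n";
    let corrupted = b"1|alpha|\n2|bEta|\n";

    assert_eq!(first_difference(&reference[..], &identical[..])?, None);
    assert_eq!(first_difference(&reference[..], &corrupted[..])?, Some(12));
    Ok(())
}
```

Because the function accepts any `Read`, the same check could be pointed at two `std::fs::File` handles, one for a dbgen output file and one for the corresponding generated file.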
Crates
- tpchgen: the core data generator logic for TPC-H. It has no dependencies and is easy to embed in other Rust projects.
- tpchgen-arrow generates TPC-H data in Apache Arrow format. It depends on the arrow-rs library.
- tpchgen-cli is a dbgen compatible CLI tool that generates benchmark datasets using multiple processes.
Contributing
Pull requests are welcome. For major changes, please open an issue first for discussion. See our contributors guide for more details.
Architecture
Please see the architecture guide for details on how the code is structured.
License
The project is licensed under the Apache 2.0 license.