DuckDB GTFS importer
March 5, 2026
This tool imports GTFS Schedule data into a DuckDB database using gtfs-via-duckdb. It allows running a production service (e.g. an API) on top of programmatically re-imported data from a periodically changing GTFS feed without downtime.
Tip
This is a clone of postgis-gtfs-importer; please refer to its docs for more information on how duckdb-gtfs-importer works.
All postgis-gtfs-importer environment variables (e.g. $GTFS_DOWNLOAD_URL or $GTFS_IMPORTER_DB_PREFIX) should be supported, except the PostgreSQL-specific ones.
Usage
Configure duckdb-gtfs-importer's behaviour using environment variables; refer to the source code for more details. The most important variables are:
- GTFS_DOWNLOAD_URL: The URL to the GTFS dataset. Will be downloaded using curl-mirror.
- GTFS_DOWNLOAD_USER_AGENT: Sent as User-Agent. Please set something meaningful to help the server's operators.
- GTFSTIDY_BEFORE_IMPORT: If the GTFS dataset should be gtfsclean-ed before import. Its behaviour can be further customised by several GTFSTIDY_… variables (refer to the source code).
- GTFS_IMPORTER_DB_PREFIX: The "base name" for the GTFS DuckDBs created. You need this if you want to import more than one different GTFS feed. For example, with GTFS_IMPORTER_DB_PREFIX=vbb, they will be named vbb_$timestamp_$hash.gtfs.duckdb and vbb.gtfs.duckdb.
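The naming scheme above pairs versioned database files with one stable name. How the importer links the two is not described here. As a hedged sketch only, one common way to repoint a stable name at a new versioned file is a symlink that is replaced via an atomic rename. The helper `publish_db`, the `.tmp` suffix and the example file names are hypothetical, and `mv -T` assumes GNU coreutils:

```shell
#!/bin/sh
# Hypothetical helper, NOT part of duckdb-gtfs-importer.
# Points the stable name "$prefix.gtfs.duckdb" at a versioned file in the same directory.
publish_db() {
    dir="$1"        # directory holding the DuckDB files
    prefix="$2"     # e.g. the value of GTFS_IMPORTER_DB_PREFIX
    versioned="$3"  # file name of the freshly imported DB, relative to $dir

    # Create the new symlink under a temporary name first ...
    ln -s -f "$versioned" "$dir/$prefix.gtfs.duckdb.tmp"
    # ... then rename it over the stable name. The rename replaces the old
    # link in one step, so the stable name always resolves to the old or the
    # new file. `-T` (GNU coreutils) treats the destination as a plain name.
    mv -T "$dir/$prefix.gtfs.duckdb.tmp" "$dir/$prefix.gtfs.duckdb"
}
```

Called as `publish_db ./gtfs vbb vbb_1_aaa.gtfs.duckdb` (made-up file name), this would leave `./gtfs/vbb.gtfs.duckdb` pointing at that file. A process that already has the previous database open keeps using it until it reopens the stable name.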
With Docker
Mount two directories into the container:
- the one containing the final imported GTFS DuckDB at /var/gtfs
- the one containing temporary files (e.g. downloaded GTFS datasets) at /tmp/gtfs
mkdir gtfs
mkdir gtfs-tmp
docker run --rm -it \
-v $PWD/gtfs:/var/gtfs \
-v $PWD/gtfs-tmp:/tmp/gtfs \
-e 'GTFS_DOWNLOAD_USER_AGENT=…' \
-e 'GTFS_DOWNLOAD_URL=…' \
-e 'GTFS_IMPORTER_VERBOSE=false' \
-e 'GTFSTIDY_BEFORE_IMPORT=false' \
ghcr.io/opendatavbb/duckdb-gtfs-importer
Without Docker
The following tools need to be in your $PATH:
- the task CLI
- DuckDB's duckdb CLI
- curl-mirror, which needs Node.js
- gtfs-via-duckdb, which needs Node.js
- unzip, sha256sum, touch, ln
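Before running the importer, it can help to check that these tools are actually resolvable. The following is a minimal sketch using the POSIX `command -v` built-in. The function name `missing_tools` is hypothetical, and the executables installed by curl-mirror and gtfs-via-duckdb may be named differently from the package names listed above:

```shell
#!/bin/sh
# Hypothetical helper, not part of duckdb-gtfs-importer.
# Prints the name of every given command that cannot be found in $PATH.
# Returns non-zero if at least one of them is missing.
missing_tools() {
    status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
            status=1
        fi
    done
    return "$status"
}
```

For example, `missing_tools task duckdb unzip sha256sum touch ln` prints nothing and succeeds if all six commands are available.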
Run duckdb-gtfs-importer using the task CLI:
task \
-t Taskfile.yml \
-d path/to/gtfs/dir
Windows
While Task uses a portable shell and "core utils" on Windows to mimic the behaviour of UNIX/GNU tools, there are subtle but important differences between the shims and their real counterparts. Therefore, duckdb-gtfs-importer will not work flawlessly on platforms like Windows.
For example, duckdb-gtfs-importer makes use of touch's -h/--no-dereference flag, which does not exist in the touch used by Task.
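Whether a given environment is affected by this particular difference can be probed directly. The sketch below is an illustration only: the function name `touch_supports_no_dereference` is hypothetical, and it assumes that `mktemp`, `ln` and `rm` are available. It tries `touch -h` on a throwaway symlink and reports whether the flag was accepted:

```shell
#!/bin/sh
# Hypothetical probe, not part of duckdb-gtfs-importer.
# Returns 0 if the `touch` found in $PATH accepts -h, 1 if it rejects it,
# and 2 if no temporary directory could be created.
touch_supports_no_dereference() {
    probe_dir="$(mktemp -d)" || return 2
    # Use a dangling symlink, so that -h has to operate on the link itself.
    ln -s "$probe_dir/target" "$probe_dir/link"
    if touch -h "$probe_dir/link" 2>/dev/null; then
        result=0
    else
        result=1
    fi
    rm -rf "$probe_dir"
    return "$result"
}
```

It can be used as `touch_supports_no_dereference || echo 'touch lacks -h' >&2`.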
Related
- postgis-gtfs-importer – Imports GTFS data into a PostGIS database, using gtfsclean & gtfs-via-postgres.