dynamical.org reformatters
April 19, 2026
Reformat weather datasets into zarr.
Browse the datasets produced by this repo at https://dynamical.org/catalog/.
- See AGENTS.md for an overview of the approach and this repository.
- Integrate a new dataset to be reformatted.
- Add a new variable to an existing dataset.
Local development
We use
- `uv` to manage dependencies and python environments
- `ruff` for linting and formatting
- `ty` for type checking
- `pytest` for testing
- `prek` to automatically lint and format as you git commit
Setup
- Install uv
- Run `uv run prek install` to set up the git hooks
- If you use VSCode, you may want to install the extensions (ruff) it will recommend when you open this folder
Running locally
- `uv run main --help` - list all datasets
- `uv run main <DATASET_ID> update-template`
- `uv run main <DATASET_ID> backfill-local <APPEND_DIM_END>`
Development commands
- Add dependency: `uv add <package> [--dev]`. Use `--dev` to add a development only dependency.
- Lint: `uv run ruff check [--fix]`
- Type check: `uv run ty check`
- Format: `uv run ruff format`
- Tests:
  - Run tests in parallel on all available cores: `uv run pytest`
  - Run tests serially: `uv run pytest -n 0`
Deploying to the cloud
To reformat a large archive we parallelize work across multiple cloud servers.
We use
- `docker` to package the code and dependencies
- `kubernetes` indexed jobs to run work in parallel
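As a rough illustration of how an indexed job can divide work, the sketch below has each worker pick its own slice of a work list from its completion index. Kubernetes sets `JOB_COMPLETION_INDEX` in pods of an Indexed Job; the `WORKERS_TOTAL` variable, the `select_work` helper and the chunk names are hypothetical and are not taken from this repository, which may split work differently.

```python
import os


def select_work(items: list[str], worker_index: int, workers_total: int) -> list[str]:
    """Return every `workers_total`-th item, starting at `worker_index`."""
    if not 0 <= worker_index < workers_total:
        raise ValueError("worker_index must be in [0, workers_total)")
    return items[worker_index::workers_total]


# Kubernetes sets JOB_COMPLETION_INDEX for each pod of an Indexed Job.
# WORKERS_TOTAL is a hypothetical variable used only for this sketch.
worker_index = int(os.environ.get("JOB_COMPLETION_INDEX", "0"))
workers_total = int(os.environ.get("WORKERS_TOTAL", "4"))

# Hypothetical units of work, e.g. pieces of an archive to reformat.
chunks = [f"chunk-{i}" for i in range(10)]
my_chunks = select_work(chunks, worker_index, workers_total)
print(f"worker {worker_index} of {workers_total} processes {my_chunks}")
```

Striding by index means every item is assigned to exactly one worker without any coordination between pods.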
Setup
- Install `docker` and `kubectl`. Make sure `docker` can be found at `/usr/bin/docker` and `kubectl` at `/usr/bin/kubectl`.
- Set up a docker image repository and export the `DOCKER_REPOSITORY` environment variable in your local shell, e.g. `export DOCKER_REPOSITORY=container.registry/<project-id>/reformatters/main`. Follow your registry's instructions to allow your docker to authenticate and push images to the registry.
- Set up a kubernetes cluster and configure kubectl to point to your cluster, e.g. `aws eks update-kubeconfig --region <region> --name <cluster-name>`, `gcloud container clusters get-credentials <cluster-name> --region <region> --project <project>`, etc.
- Create a kubectl secret containing a single json encoded value to be passed to fsspec `storage_options` or splatted as keyword arguments to an icechunk storage opener: `kubectl create secret generic your-destination-storage-options-key --from-literal=contents='{"key": "...", "secret": "..."}'`. See `storage.py`.
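
To make "a single json encoded value ... splatted as keyword arguments" concrete, here is a minimal sketch of decoding the secret's `contents` string and passing it on as keyword arguments. The `parse_storage_options` and `open_store` helpers, the bucket path and the example credentials are hypothetical; the actual handling lives in `storage.py` and may differ.

```python
import json


def parse_storage_options(contents: str) -> dict[str, str]:
    """Decode the JSON string stored under the secret's `contents` key."""
    options = json.loads(contents)
    if not isinstance(options, dict):
        raise ValueError("expected a JSON object of storage options")
    return options


def open_store(path: str, **storage_options: str) -> dict[str, object]:
    # Hypothetical stand-in for a real opener. With fsspec and s3fs installed,
    # the same dict could be passed as fsspec.filesystem("s3", **storage_options).
    return {"path": path, "storage_options": storage_options}


# Example value with placeholder credentials, in the same shape as the secret.
contents = '{"key": "example-key", "secret": "example-secret"}'
options = parse_storage_options(contents)
store = open_store("s3://example-bucket/dataset.zarr", **options)
print(sorted(store["storage_options"]))
```

Keeping the whole configuration in one JSON value lets the same secret format serve different storage backends, since the keys are simply forwarded to whichever opener is used.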