Clodius
March 25, 2026
Displaying large amounts of data often requires first turning it into not-so-large amounts of data. Clodius is a program and library designed to aggregate large datasets to make them easy to display at different resolutions.
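To make the idea of aggregating for display at different resolutions concrete, here is a minimal sketch in plain Python. It is not Clodius's implementation or API: the function name `aggregate_levels`, the choice of summing as the aggregation, and the default factor of 2 are all assumptions made for illustration.

```python
def aggregate_levels(values, factor=2):
    """Return progressively coarser copies of `values` (hypothetical helper).

    Level 0 is the input itself. Each later level sums `factor` adjacent
    bins of the previous level, until a single bin remains.
    """
    if factor < 2:
        raise ValueError("factor must be at least 2")
    levels = [list(values)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        # Sum each run of `factor` neighbours; the final run may be shorter.
        coarser = [sum(prev[i:i + factor]) for i in range(0, len(prev), factor)]
        levels.append(coarser)
    return levels


# Eight bins collapse to four, then two, then one.
for zoom, level in enumerate(aggregate_levels([1, 2, 3, 4, 5, 6, 7, 8])):
    print(zoom, level)
```

A viewer can then pick the level whose bin count roughly matches the number of pixels available, instead of reading every original value.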
Demo
Install the clodius package:
pip install clodius
And use it to aggregate a BED file:
curl https://raw.githubusercontent.com/hms-dbmi/clodius/develop/test/sample_data/geneAnnotationsExonsUnions.short.bed \
> /tmp/sample.short.bed
clodius aggregate bedfile /tmp/sample.short.bed
The output files can then be displayed using higlass-manage. For more information about viewing these types of files, take a look at the higlass docs.
More examples are available.
File Types
- Non-genomic Rasters
- Genomic Data
Development
The recommended way to develop clodius is to use a conda environment and
install clodius in development (editable) mode:
pip install -e ".[dev]"
Test Fixtures (Git LFS)
Test data files in data/ are stored in Git LFS. They are downloaded automatically when you clone the repository with LFS enabled:
git lfs install # once per machine
git clone <repo> # LFS files downloaded automatically
# or, in an existing clone:
git lfs pull
Adding a new test fixture
- Check if the file type is already tracked: open .gitattributes and look for a matching pattern (e.g. data/*.gz, *.bam). If not, add a new tracking rule:
  git lfs track "data/*.ext" # adds a line to .gitattributes
  git add .gitattributes
- Allow the file through .gitignore: data/* is ignored by default. Add a negation line for your file:
  !data/your_new_file.ext
- Stage and commit as normal:
  git add data/your_new_file.ext
  git commit -m "Add test fixture: your_new_file.ext"
  git push # LFS objects are uploaded automatically
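The first step above (checking whether a pattern already covers your file) can be roughly automated. The sketch below is an illustration, not part of clodius: the helper names `lfs_patterns` and `is_lfs_tracked` are hypothetical, and `fnmatch` only approximates Git's real pattern matching (for example, its `*` also matches `/`). It relies on the fact that `git lfs track` writes lines containing `filter=lfs` to .gitattributes.

```python
from fnmatch import fnmatch


def lfs_patterns(gitattributes_text):
    """Collect the path patterns of lines that carry the filter=lfs attribute."""
    patterns = []
    for line in gitattributes_text.splitlines():
        parts = line.split()
        if not parts or parts[0].startswith("#"):
            continue  # skip blank lines and comments
        if "filter=lfs" in parts[1:]:
            patterns.append(parts[0])
    return patterns


def is_lfs_tracked(path, gitattributes_text):
    """Approximate check: does any LFS pattern match `path`?"""
    name = path.rsplit("/", 1)[-1]
    for pattern in lfs_patterns(gitattributes_text):
        # Patterns without a slash are compared against the file name only.
        target = path if "/" in pattern else name
        if fnmatch(target, pattern):
            return True
    return False


# Sample .gitattributes content, made up for this example.
SAMPLE = (
    "data/*.gz filter=lfs diff=lfs merge=lfs -text\n"
    "*.bam filter=lfs diff=lfs merge=lfs -text\n"
)

for candidate in ["data/sample.gz", "data/reads.bam", "data/notes.txt"]:
    print(candidate, is_lfs_tracked(candidate, SAMPLE))
```

For an authoritative answer on a real checkout, `git check-attr filter -- data/your_new_file.ext` reports the attribute Git itself resolves.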
Testing
The unit tests for clodius can be run using pytest:
pytest
Individual unit tests can be specified by indicating the file and function they are defined in:
pytest test/cli_test.py::test_clodius_aggregate_bedgraph