🌶️ sracha 🌶️

June 18, 2026

Fast SRA downloader and FASTQ converter, written in pure Rust.


Features

Fast -- 5-12x faster than fasterq-dump on typical SRA files
One command -- download, convert to FASTQ, and compress
Batch input -- accessions, BioProjects (PRJNA), studies (SRP), or a file via --accession-list
gzip or zstd output -- parallel compression, or plain FASTQ
FASTA output -- --fasta drops quality scores
SRA and SRA-lite -- full or simplified quality scores
Split modes -- split-3, split-files, split-spot, interleaved
Resumable downloads -- picks up where it left off
Stdout streaming -- -Z pipes FASTQ straight into downstream tools
Integrity checks -- MD5 verification on download and decode
Platform support -- Illumina, BGISEQ/DNBSEQ, Element, Ultima, PacBio, Nanopore (legacy 454 and Ion Torrent are not supported)
Single static binary -- no Python, no C dependencies
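
As a rough illustration of stdout streaming, the sketch below counts FASTQ records arriving on a pipe. The helper name count_reads is hypothetical, and it assumes standard four-line FASTQ records. The commented-out line assumes -Z is accepted by sracha get; check the CLI reference before relying on it.

```shell
# Hypothetical helper: count FASTQ records read from stdin.
# Assumes each record is exactly four lines (header, sequence, '+', quality).
count_reads() {
    awk 'END { print int(NR / 4) }'
}

# Self-contained demonstration with two toy records (prints 2):
printf '@r1\nACGT\n+\nIIII\n@r2\nTTGA\n+\nIIII\n' | count_reads

# With sracha (assumption: -Z is valid for `sracha get`):
# sracha get -Z SRR28588231 | count_reads
```

Because the records are consumed as they arrive, no intermediate FASTQ file has to be written to disk.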

Quick start

# Download, convert, and compress
sracha get SRR28588231

# Download all runs from a BioProject
sracha get PRJNA675068

# Batch download from an accession list
sracha get --accession-list SRR_Acc_List.txt

# Just download
sracha fetch SRR28588231

# Convert a local .sra file
sracha fastq SRR28588231.sra

# Show accession info
sracha info SRR28588231

# Validate a downloaded file
sracha validate SRR28588231.sra
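
For batch runs, the accession list can be tidied with ordinary shell tools before it is handed to sracha. The sketch below assumes the file holds one accession per line. The helper clean_accessions is hypothetical, and since it is not stated here whether sracha tolerates blank lines or comments, the helper strips them defensively.

```shell
# Hypothetical helper: normalize an accession list read from stdin.
# Drops '#' comments, surrounding whitespace, blank lines, and duplicates
# (first occurrence wins, original order preserved).
clean_accessions() {
    sed -e 's/#.*$//' -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' |
        awk 'NF && !seen[$0]++'
}

# Toy input using accessions from the examples above;
# prints SRR28588231 and SRR2584863, one per line.
printf 'SRR28588231\n\n# rerun\nSRR2584863 \nSRR28588231\n' | clean_accessions

# To use the cleaned list with sracha:
# clean_accessions < raw_list.txt > SRR_Acc_List.txt
# sracha get --accession-list SRR_Acc_List.txt
```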

Benchmarks

Local decode (SRA file on disk → FASTQ)

Uncompressed output, measured with hyperfine.

File	Size	sracha	fasterq-dump	fastq-dump	Speedup vs fasterq-dump
SRR28588231	23 MiB	0.15 s	1.78 s	1.94 s	12.3x
SRR2584863	288 MiB	1.07 s	5.53 s	12.99 s	5.2x
ERR1018173	1.94 GiB	6.45 s	33.41 s	--	5.2x

By default, sracha writes gzipped FASTQ at compression level 1. Thanks to parallel block compression, this takes only about 1.4× the uncompressed decode time on small files, so the integrated pipeline (sracha get) writes ready-to-use .fastq.gz output without a separate gzip step.
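
The speedup column and the gzip overhead figure are plain ratios of wall-clock times. The sketch below reproduces them from the timings reported in this section; the ratio helper is hypothetical and only does the division and rounding.

```shell
# Hypothetical helper: ratio of two timings, rounded to a given precision.
# usage: ratio <numerator> <denominator> [decimals, default 1]
ratio() {
    awk -v n="$1" -v d="$2" -v p="${3:-1}" \
        'BEGIN { fmt = "%." p "f\n"; printf fmt, n / d }'
}

ratio 1782.6 145.1      # fasterq-dump vs sracha, SRR28588231 -> 12.3
ratio 5.526 1.070       # fasterq-dump vs sracha, SRR2584863  -> 5.2
ratio 33.41 6.45        # fasterq-dump vs sracha, ERR1018173  -> 5.2
ratio 213.7 150.4 2     # gzip vs no compression, SRR28588231 -> 1.42
```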

Full hyperfine output

SRR28588231 (23 MiB, 66K spots, Illumina paired)

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`sracha`	145.1 ± 2.3	141.7	149.8	1.00
`fasterq-dump`	1782.6 ± 11.7	1769.3	1794.4	12.28 ± 0.21
`fastq-dump`	1942.0 ± 3.6	1938.0	1945.6	13.38 ± 0.22

SRR2584863 (288 MiB, Illumina paired)

Command	Mean [s]	Min [s]	Max [s]	Relative
`sracha`	1.070 ± 0.006	1.064	1.076	1.00
`fasterq-dump`	5.526 ± 0.081	5.441	5.602	5.16 ± 0.08
`fastq-dump`	12.989 ± 0.031	12.967	13.025	12.14 ± 0.07

ERR1018173 (1.94 GiB, 15.6M spots, Illumina paired, single run)

Command	Time [s]
`sracha`	6.45
`fasterq-dump`	33.41

sracha gzip overhead (SRR28588231, default --gzip-level 1)

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`sracha (no compression)`	150.4 ± 2.5	145.7	154.4	1.00
`sracha (gzip)`	213.7 ± 2.8	208.9	218.3	1.42 ± 0.03

Benchmarks run with sracha v0.3.8, sra-tools v3.4.1, on Linux (8 CPUs). Install the reference toolkit with pixi run install-sratools and reproduce with validation/benchmark.sh.

Installation

Install via Bioconda:

pixi add bioconda::sracha

Alternatively, download pre-built binaries from the releases page or install from source:

cargo install --git https://github.com/rnabioco/sracha-rs sracha

Containers

Because sracha is on Bioconda, BioContainers automatically publishes a Docker/Singularity image for every release — no local build required.

# Docker / Podman
docker run --rm quay.io/biocontainers/sracha:0.3.7--h54198d6_0 sracha --help

# Singularity / Apptainer
singularity run \
  https://depot.galaxyproject.org/singularity/sracha:0.3.7--h54198d6_0 sracha --help

The tags above are examples — check quay.io for the latest <version>--<build> tag and substitute it in.
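
One way to keep the tag in a single place is to assemble the image reference from its parts. This is only a sketch: the variable names are illustrative, and the values are the example tag shown above, not necessarily the latest release.

```shell
# Example values; replace with the latest <version>--<build> listed on quay.io.
version="0.3.7"
build="h54198d6_0"
image="quay.io/biocontainers/sracha:${version}--${build}"

echo "$image"
# docker run --rm "$image" sracha --help
```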

In Nextflow, point a process at the image directly or let the conda directive resolve it:

process SRACHA_GET {
    container 'quay.io/biocontainers/sracha:0.3.7--h54198d6_0'
    // or: conda 'bioconda::sracha=0.3.7'
    // ...
}

Documentation

Full CLI reference and usage guide: https://rnabioco.github.io/sracha-rs/

Acknowledgments

sracha builds on the Sequence Read Archive, maintained by the National Center for Biotechnology Information at the National Library of Medicine. The SRA and its toolchain are public-domain software developed by U.S. government employees — our tax dollars at work. Special thanks to Kenneth Durbrow (@durbrow) and the SRA Toolkit team for building and maintaining the infrastructure that makes projects like this possible.

This project wouldn't exist without NCBI's open infrastructure: the VDB/KAR format, the SDL locate API, EUtils, and public S3 hosting of sequencing data. sracha aims to make it easier for the community to build on that foundation.

License

MIT