🌶️ sracha 🌶️

June 18, 2026

Fast SRA downloader and FASTQ converter, written in pure Rust.

Features

  • Fast -- 5-12x faster than fasterq-dump on typical SRA files
  • One command -- download, convert to FASTQ, and compress
  • Batch input -- accessions, BioProjects (PRJNA), studies (SRP), or a file via --accession-list
  • gzip or zstd output -- parallel compression, or plain FASTQ
  • FASTA output -- --fasta drops quality scores
  • SRA and SRA-lite -- full or simplified quality scores
  • Split modes -- split-3, split-files, split-spot, interleaved
  • Resumable downloads -- picks up where it left off
  • Stdout streaming -- -Z pipes FASTQ straight into downstream tools
  • Integrity checks -- MD5 verification on download and decode
  • Platform support -- Illumina, BGISEQ/DNBSEQ, Element, Ultima, PacBio, Nanopore (legacy 454 and Ion Torrent are not supported)
  • Single static binary -- no Python, no C dependencies
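As a rough illustration of the stdout streaming feature, the sketch below defines a small downstream consumer that counts FASTQ records arriving on stdin. The helper name count_fastq_reads is hypothetical, the input here is synthetic, and the commented sracha invocation is an assumption: this README states that -Z streams to stdout but does not show which subcommand accepts it, so check the CLI reference before relying on it.

```shell
#!/bin/sh
# Count FASTQ records on stdin. Assumes 4 lines per record, i.e. sequence
# and quality strings are not wrapped across lines.
count_fastq_reads() {
    awk 'END { print int(NR / 4) }'
}

# Synthetic two-record stream standing in for real sracha output.
printf '@r1\nACGT\n+\nIIII\n@r2\nTTGA\n+\nIIII\n' | count_fastq_reads

# Assumed real-world usage (placement of -Z is not confirmed here):
#   sracha fastq -Z SRR28588231.sra | count_fastq_reads
```

Any tool that reads FASTQ from stdin could take the place of the counting helper.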

Quick start

# Download, convert, and compress
sracha get SRR28588231

# Download all runs from a BioProject
sracha get PRJNA675068

# Batch download from an accession list
sracha get --accession-list SRR_Acc_List.txt

# Just download
sracha fetch SRR28588231

# Convert a local .sra file
sracha fastq SRR28588231.sra

# Show accession info
sracha info SRR28588231

# Validate a downloaded file
sracha validate SRR28588231.sra
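
For batch input, one way to prepare the accession list is sketched here. It assumes the file holds one accession per line, as in the SRR_Acc_List.txt files exported by the NCBI Run Selector; the exact format sracha accepts is not spelled out in this README. The cleanup step and the PATH guard are illustrative additions, not part of sracha.

```shell
#!/bin/sh
# Write one accession per line (assumed list format).
list="SRR_Acc_List.txt"
printf '%s\n' SRR28588231 SRR2584863 SRR28588231 > "$list"

# Drop blank lines and duplicate accessions before handing the list over.
awk 'NF && !seen[$1]++ { print $1 }' "$list" > "$list.tmp" && mv "$list.tmp" "$list"

# Show how many unique accessions remain.
wc -l < "$list"

# Only start the download when sracha is actually on PATH.
if command -v sracha >/dev/null 2>&1; then
    sracha get --accession-list "$list"
fi
```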

Benchmarks

Local decode (SRA file on disk → FASTQ)

Uncompressed output, measured with hyperfine.

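| File        | Size     | sracha | fasterq-dump | fastq-dump | Speedup vs fasterq-dump |
|-------------|----------|--------|--------------|------------|-------------------------|
| SRR28588231 | 23 MiB   | 0.15 s | 1.78 s       | 1.94 s     | 12.3x                   |
| SRR2584863  | 288 MiB  | 1.07 s | 5.53 s       | 12.99 s    | 5.2x                    |
| ERR1018173  | 1.94 GiB | 6.45 s | 33.41 s      | --         | 5.2x                    |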

sracha writes gzipped FASTQ by default at compression level 1. Thanks to parallel block compression, this takes about 1.4× the uncompressed time on small files, so the integrated pipeline (sracha get) produces ready-to-use .fastq.gz files without a separate gzip step.

Full hyperfine output

SRR28588231 (23 MiB, 66K spots, Illumina paired)

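| Command      | Mean [ms]     | Min [ms] | Max [ms] | Relative     |
|--------------|---------------|----------|----------|--------------|
| sracha       | 145.1 ± 2.3   | 141.7    | 149.8    | 1.00         |
| fasterq-dump | 1782.6 ± 11.7 | 1769.3   | 1794.4   | 12.28 ± 0.21 |
| fastq-dump   | 1942.0 ± 3.6  | 1938.0   | 1945.6   | 13.38 ± 0.22 |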

SRR2584863 (288 MiB, Illumina paired)

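| Command      | Mean [s]       | Min [s] | Max [s] | Relative     |
|--------------|----------------|---------|---------|--------------|
| sracha       | 1.070 ± 0.006  | 1.064   | 1.076   | 1.00         |
| fasterq-dump | 5.526 ± 0.081  | 5.441   | 5.602   | 5.16 ± 0.08  |
| fastq-dump   | 12.989 ± 0.031 | 12.967  | 13.025  | 12.14 ± 0.07 |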

ERR1018173 (1.94 GiB, 15.6M spots, Illumina paired, single run)

| Command      | Time [s] |
|--------------|----------|
| sracha       | 6.45     |
| fasterq-dump | 33.41    |

sracha gzip overhead (SRR28588231, default --gzip-level 1)

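| Command                 | Mean [ms]   | Min [ms] | Max [ms] | Relative    |
|-------------------------|-------------|----------|----------|-------------|
| sracha (no compression) | 150.4 ± 2.5 | 145.7    | 154.4    | 1.00        |
| sracha (gzip)           | 213.7 ± 2.8 | 208.9    | 218.3    | 1.42 ± 0.03 |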

Benchmarks run with sracha v0.3.8, sra-tools v3.4.1, on Linux (8 CPUs). Install the reference toolkit with pixi run install-sratools and reproduce with validation/benchmark.sh.
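
The speedup figures can be rechecked from the reported times with a little arithmetic. The sketch below divides the baseline time by the sracha time and rounds to one decimal place; the speedup helper is a hypothetical convenience function, and the inputs are the mean (or single-run) values listed in the tables above.

```shell
#!/bin/sh
# speedup BASELINE CANDIDATE: print BASELINE / CANDIDATE to one decimal place.
speedup() {
    awk -v base="$1" -v cand="$2" 'BEGIN { printf "%.1fx\n", base / cand }'
}

speedup 1782.6 145.1   # SRR28588231, mean ms: fasterq-dump vs sracha
speedup 5.526 1.070    # SRR2584863, mean s: fasterq-dump vs sracha
speedup 33.41 6.45     # ERR1018173, single run, s
speedup 213.7 150.4    # gzip vs no compression (overhead factor)
```

This prints 12.3x, 5.2x, 5.2x, and 1.4x, matching the summary table and the quoted gzip overhead. Ratios recomputed from rounded means can differ in the second decimal place from the Relative column, which hyperfine derives from unrounded timings.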

Installation

Install via Bioconda:

pixi add bioconda::sracha

Or download pre-built binaries from the releases page, or install from source:

cargo install --git https://github.com/rnabioco/sracha-rs sracha

Containers

Because sracha is on Bioconda, BioContainers automatically publishes a Docker/Singularity image for every release — no local build required.

# Docker / Podman
docker run --rm quay.io/biocontainers/sracha:0.3.7--h54198d6_0 sracha --help

# Singularity / Apptainer
singularity run \
  https://depot.galaxyproject.org/singularity/sracha:0.3.7--h54198d6_0 sracha --help

The tags above are examples — check quay.io for the latest <version>--<build> tag and substitute it in.
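
To make the tag substitution less error-prone, the image reference can be assembled from its parts. This is a minimal sketch: biocontainer_ref is a hypothetical helper, and the version and build values are the example ones from above rather than the latest release.

```shell
#!/bin/sh
# biocontainer_ref NAME VERSION BUILD: print the quay.io image reference
# in the <name>:<version>--<build> form used by BioContainers.
biocontainer_ref() {
    printf 'quay.io/biocontainers/%s:%s--%s\n' "$1" "$2" "$3"
}

image=$(biocontainer_ref sracha 0.3.7 h54198d6_0)
echo "$image"

# With Docker (or Podman) installed, the reference can be used directly:
#   docker run --rm "$image" sracha --help
```

Keeping the version and build as separate arguments means only two values need updating when a new release is published.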

In Nextflow, point a process at the image directly or let the conda directive resolve it:

process SRACHA_GET {
    container 'quay.io/biocontainers/sracha:0.3.7--h54198d6_0'
    // or: conda 'bioconda::sracha=0.3.7'
    // ...
}

Documentation

Full CLI reference and usage guide: https://rnabioco.github.io/sracha-rs/

Acknowledgments

sracha builds on the Sequence Read Archive, maintained by the National Center for Biotechnology Information at the National Library of Medicine. The SRA and its toolchain are public-domain software developed by U.S. government employees — our tax dollars at work. Special thanks to Kenneth Durbrow (@durbrow) and the SRA Toolkit team for building and maintaining the infrastructure that makes projects like this possible.

This project wouldn't exist without NCBI's open infrastructure: the VDB/KAR format, the SDL locate API, EUtils, and public S3 hosting of sequencing data. sracha aims to make it easier for the community to build on that foundation.

License

MIT