June 21, 2026
mirpy — Mining Immune Repertoires in Python
mirpy is a Python library for AIRR-seq and immune repertoire analysis.
It provides composable building blocks for parsing, filtering, comparing, and
characterising T-cell and B-cell receptor repertoires.
Contents
- Installation
- Module overview
- Quick start
- Diversity metrics
- Metaclonotypes
- TCRdist
- CDR3 Motif Logos
- Clonotype Association Scan
- ALICE and TCRNET
- Prototype-based embeddings (TCREmp)
- Copilot Agent Workflow
- Resources
- Project status
Installation
Requirements:
- Python 3.11+
- a C/C++ build toolchain for compiled extensions
Install from PyPI:
pip install mirpy-lib
Install from source (one-shot):
git clone https://github.com/antigenomics/mirpy.git
cd mirpy
pip install .
Install from source (editable development mode, conda):
git clone https://github.com/antigenomics/mirpy.git
cd mirpy
./setup.sh # creates the `mirpy` conda env + editable install
conda activate mirpy
setup.sh creates/updates the mirpy conda environment from environment.yml
and installs mirpy in editable mode (it builds the bundled C++ extensions).
Pass --no-conda to install into the already-active environment instead, or
--docs / --test for docs deps and the test suite. The base env is lean;
region-annotation tooling is the optional [arda] extra (pip install -e ".[arda]").
Prefer pip install mirpy-lib for project usage.
Use the cloned repo setup when developing or running docs/notebooks locally.
Module overview
| Package | Responsibilities |
|---|---|
| mir.common | Clonotypes, repertoires, parsers, segment libraries |
| mir.distances | Aligners, Hamming/Levenshtein search, graph utilities, TCRdist |
| mir.basic | Sampling, segment usage, alphabet helpers, Pgen utilities, OLGA germline-retention/trimming profiles + fast PgenLite |
| mir.graph | Edit-distance graphs, neighbourhood enrichment, token graphs, single-cell pairing |
| mir.embedding | Prototype embeddings: TCREmp, PairedTCREmp |
| mir.comparative | Pairwise overlap metrics (Jaccard, D, F, Morisita-Horn), trie-accelerated approximate matching, VDJBet Pgen-matched null distributions |
| mir.biomarkers | ALICE enrichment, TCRNET, clonotype association scans, CDR3 sequence logos |
| mir.utils | Embedding diagnostics, shared memory, notebook asset helpers |
Quick start
Load a segment library
from mir.common.gene_library import GeneLibrary
lib = GeneLibrary.load_default(
loci={"TRA", "TRB"},
species={"human"},
source="imgt",
)
If a requested organism/locus pair is absent from the default local segment file, mirpy downloads the missing V and J segments from IMGT and appends them automatically.
Parse a clonotype table
from mir.common.parser import VDJtoolsParser
parser = VDJtoolsParser(sep="\t")
clonotypes = parser.parse("example.tsv")
Supported parsers: VDJtoolsParser, AIRRParser, AdaptiveParser,
VDJdbFullPairedParser, and others in mir.common.parser.
VDJdb reference: Shugay M et al. 2018, Nucleic Acids Res., PMID:28977646.
V/J gene allele notation
mirpy uses consistent allele semantics throughout all V/J matching and distance paths.
Matching semantics
| Input form | Behaviour | Matches |
|---|---|---|
| TRAV1 (bare) | wildcard | TRAV1, TRAV1*01, TRAV1*02, … |
| TRAV1*02 (specific) | exact | TRAV1*02 and bare TRAV1 only |
A bare gene (no * suffix) acts as a wildcard and matches any allele of the
same base gene. A specific allele matches only that exact allele, plus bare
genes (which, having no allele information, cannot exclude any allele).
This applies to all V/J-restricted search paths:
- edit-distance graph construction (v_call_match)
- neighborhood enrichment stats (match_v_call, match_j_call)
- metaclonotype clustering
- association scans (match_mode="v"/"j"/"vj")
- TCRdist find_metaclonotypes
Library resolution chain for distances
When a V/J gene is looked up in a pre-computed library (e.g., TCRdist germline distances, TCREmp embeddings), mirpy tries in order:
- Exact allele: TRBV5-1*07 as-is
- Major allele (*01): TRBV5-1*01
- Bare gene: TRBV5-1 (for libraries without allele resolution)
- Not found: returns NaN, which propagates to the overall distance
Unknown genes are never silently replaced with a max-distance sentinel.
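As a rough sketch of this lookup order, assuming the pre-computed library behaves like a dictionary keyed by gene name (the function resolve_gene and the toy library below are hypothetical, not mirpy API):

```python
import math


def resolve_gene(call: str, library: dict[str, float]) -> float:
    """Look up a gene call: exact allele, then *01, then bare gene, else NaN."""
    gene = call.partition("*")[0]
    for key in (call, f"{gene}*01", gene):
        if key in library:
            return library[key]
    # No sentinel value: a missing gene yields NaN.
    return math.nan


# Toy library: one allele-resolved entry and one bare-gene entry.
toy_library = {"TRBV5-1*01": 1.0, "TRBV6-1": 2.0}

v_component = resolve_gene("TRBV9*01", toy_library)
total = v_component + 12.0  # NaN propagates through arithmetic
print(math.isnan(total))
```

Returning NaN rather than a large constant keeps unknown genes visible downstream, because any sum that includes the missing component is also NaN.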
Work with repertoires
from mir.common.repertoire import LocusRepertoire
repertoire = LocusRepertoire(clonotypes=clonotypes, locus="TRB")
print(repertoire.duplicate_count) # total read count
print(repertoire.clonotype_count) # unique clonotypes
# Functional / canonical filtering using IMGT annotations
from mir.common.filter import filter_functional, filter_canonical
from mir.common.gene_library import GeneLibrary
imgt_lib = GeneLibrary.load_default(loci={"TRB"}, species={"human"}, source="imgt")
functional_rep = filter_functional(repertoire, gene_library=imgt_lib)
canonical_rep = filter_canonical(repertoire, gene_library=imgt_lib)
Pool repertoires across samples
from mir.common.pool import pool_samples
# Pool by amino-acid CDR3 + V/J; retain contributing sample IDs
pooled = pool_samples(dataset, rule="aavj", include_sample_ids=True)
Supported pooling rules: ntvj, nt, aavj, aa.
For each rule the representative clonotype is selected by frequency
(duplicate_count when weighted=True); duplicate_count is reassigned to
the total sum, and incidence / occurrences metadata are added.
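The pooling logic can be illustrated without mirpy, using plain dictionaries with AIRR-style field names. This is a sketch under stated assumptions: the key fields chosen for each rule, the definitions of incidence (number of distinct samples) and occurrences (number of pooled rows), and the function name pool are all illustrative guesses rather than the library's documented behaviour.

```python
from collections import defaultdict

# Assumed mapping from pooling rule to the fields that define identity.
POOLING_KEYS = {
    "ntvj": ("junction", "v_call", "j_call"),
    "nt": ("junction",),
    "aavj": ("junction_aa", "v_call", "j_call"),
    "aa": ("junction_aa",),
}


def pool(samples: dict[str, list[dict]], rule: str = "aavj") -> list[dict]:
    groups = defaultdict(list)
    for sample_id, clonotypes in samples.items():
        for c in clonotypes:
            key = tuple(c[field] for field in POOLING_KEYS[rule])
            groups[key].append((sample_id, c))
    pooled = []
    for members in groups.values():
        # Representative: the member with the largest duplicate_count.
        _, rep = max(members, key=lambda m: m[1]["duplicate_count"])
        sample_ids = sorted({sid for sid, _ in members})
        pooled.append({
            **rep,
            "duplicate_count": sum(c["duplicate_count"] for _, c in members),
            "incidence": len(sample_ids),
            "occurrences": len(members),
            "sample_ids": sample_ids,
        })
    return pooled


samples = {
    "s1": [{"junction": "TGTGCCAGCAGCTTC", "junction_aa": "CASSF",
            "v_call": "TRBV19", "j_call": "TRBJ2-7", "duplicate_count": 5}],
    "s2": [{"junction": "TGTGCCAGCAGTTTC", "junction_aa": "CASSF",
            "v_call": "TRBV19", "j_call": "TRBJ2-7", "duplicate_count": 2}],
}
pooled_aavj = pool(samples, rule="aavj")  # one pooled clonotype
pooled_ntvj = pool(samples, rule="ntvj")  # two: nucleotide sequences differ
```

The two toy rearrangements encode the same amino-acid CDR3, so they merge under aavj but stay separate under ntvj.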
Diversity metrics
mir.common.diversity implements VDJtools-style summary indices
(Shugay et al. 2015, PMID:26606115) and iNEXT-style Hill diversity profiles
and rarefaction/extrapolation curves (Hsieh et al. 2016).
Summary statistics
from mir.common.diversity import summarize_counts
counts = [c.duplicate_count for c in repertoire.clonotypes]
div = summarize_counts(counts)
print(div.abundance) # total read count
print(div.diversity) # observed richness
print(div.chao1) # bias-corrected Chao1 species richness estimator
print(div.shannon) # Shannon entropy H′
print(div.gini_simpson) # Gini-Simpson index (1 − Σp²)
print(div.singletons) # clones seen exactly once
print(div.doubletons) # clones seen exactly twice
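For orientation, the indices printed above can be written out directly from a list of clone counts. The sketch below is independent of mirpy; it uses natural-log Shannon entropy and one common bias-corrected form of Chao1, S_obs + f1(f1 − 1) / (2(f2 + 1)), which may differ in detail from what summarize_counts computes. The function name summarize is hypothetical.

```python
import math
from collections import Counter


def summarize(counts: list[int]) -> dict[str, float]:
    """Basic diversity indices from positive clone counts (illustrative)."""
    n = sum(counts)
    s_obs = len(counts)
    freqs = [c / n for c in counts]
    count_of_counts = Counter(counts)
    f1, f2 = count_of_counts[1], count_of_counts[2]
    return {
        "abundance": n,
        "diversity": s_obs,
        "singletons": f1,
        "doubletons": f2,
        # Bias-corrected Chao1 (assumed form, see lead-in).
        "chao1": s_obs + f1 * (f1 - 1) / (2 * (f2 + 1)),
        "shannon": -sum(p * math.log(p) for p in freqs),
        "gini_simpson": 1.0 - sum(p * p for p in freqs),
    }


example = summarize([1, 1, 2, 4])
print(example["chao1"], example["gini_simpson"])
```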
Hill diversity profile
from mir.common.diversity import hill_curve
# Returns a Polars DataFrame with columns q, D_q
profile = hill_curve(counts)
# q=0 → species richness; q=1 → exp(Shannon); q=2 → inverse Simpson
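The three special cases in the comment follow from the general Hill number formula D_q = (Σ p_i^q)^(1/(1−q)), with the q → 1 limit equal to exp(Shannon). A minimal plug-in version, written without mirpy (the function name hill_number is hypothetical, and no bias correction or extrapolation is attempted):

```python
import math


def hill_number(counts: list[int], q: float) -> float:
    """Plug-in Hill number of order q from clone counts."""
    n = sum(counts)
    freqs = [c / n for c in counts if c > 0]
    if abs(q - 1.0) < 1e-9:
        # Limit q -> 1: exponential of Shannon entropy.
        return math.exp(-sum(p * math.log(p) for p in freqs))
    return sum(p ** q for p in freqs) ** (1.0 / (1.0 - q))


counts_demo = [1, 1, 2, 4]
profile_demo = {q: hill_number(counts_demo, q) for q in (0, 1, 2)}
print(profile_demo)
```

For a perfectly even sample every order returns the number of clones, which is a convenient sanity check.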
Rarefaction / extrapolation curve
from mir.common.diversity import rarefaction_curve
curve = rarefaction_curve(counts)
# Polars DataFrame with m, s_obs, s_est, s_lwr, s_upr, sample coverage C
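The interpolation half of such a curve has a closed form: when m reads are drawn without replacement from n, the expected number of distinct clones is Σ_i [1 − C(n − n_i, m) / C(n, m)]. The sketch below implements only that classical rarefaction formula; extrapolation beyond n, confidence bounds, and sample coverage are not covered, and the name rarefied_richness is hypothetical.

```python
from math import comb


def rarefied_richness(counts: list[int], m: int) -> float:
    """Expected number of distinct clones in a subsample of m reads."""
    n = sum(counts)
    if not 0 <= m <= n:
        raise ValueError("m must be between 0 and the total read count")
    total = comb(n, m)
    # comb(n - c, m) is 0 when fewer than m reads lie outside clone c,
    # meaning that clone is certain to be sampled.
    return sum(1.0 - comb(n - c, m) / total for c in counts)


counts_demo = [1, 1, 2, 4]
curve_demo = [rarefied_richness(counts_demo, m) for m in range(0, 9)]
print(curve_demo)
```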
See notebooks/diversity_analysis.ipynb for a full donor-level workflow
including rarefaction curves, Hill profiles, and Healthy vs MS cohort
comparisons.
Metaclonotypes
A metaclonotype is a lightweight cluster layer over an existing
LocusRepertoire. It stores cluster membership as a Polars DataFrame
(mapping cluster_id → clonotype_id) without rebuilding repertoire objects.
This supports any clustering backend: DBSCAN, ALICE/TCRNET enriched clusters,
TCRdist radius clusters, or pre-computed connected components.
Unified clustering interface
MetaclonotypeClusterConfig + cluster_metaclonotypes dispatch to any
supported backend (ALICE, TCRNET, TCRdist, edit-distance graph, TCREmp, GLIPH):
from mir.biomarkers.metaclonotype_cluster import (
MetaclonotypeClusterConfig,
cluster_metaclonotypes,
cluster_paired_metaclonotypes,
)
# Edit-distance graph, Leiden communities
cfg = MetaclonotypeClusterConfig(method="edit_distance", graph_algo="leiden")
meta = cluster_metaclonotypes(rep, cfg)
# TCRdist radius clusters
cfg_dist = MetaclonotypeClusterConfig(method="tcrdist", locus="TRB", max_distance=24.5)
meta_dist = cluster_metaclonotypes(rep, cfg_dist)
Paired-chain metaclonotypes via single-chain-combine (works for all methods):
# Computes per-chain edit-distance clusters, combines IDs as "TRA_cluster.TRB_cluster"
cfg = MetaclonotypeClusterConfig(method="edit_distance", min_cluster_size=1)
meta_paired = cluster_paired_metaclonotypes(paired_locus_rep, cfg)
For TCREmp, cluster_paired_metaclonotypes uses the built-in PairedTCREmp
joint embedding by default. See notebooks/metaclonotype_method_compare.ipynb
for a comparison of methods including concordance analysis.
Build metaclonotypes from cluster labels
from mir.common.metaclonotype import metaclonotypes_from_labels
# labels is a list of ints; -1 denotes noise/singleton (excluded by default)
meta = metaclonotypes_from_labels(clonotype_ids, labels)
print(meta.n_clusters) # number of clusters
print(meta.cluster_ids[:5]) # sorted cluster IDs
Build from pre-computed connected components
from mir.common.metaclonotype import metaclonotypes_from_components
# components: list of lists of clonotype IDs
meta = metaclonotypes_from_components(components)
Summarise cluster abundance
from mir.common.metaclonotype import summarize_metaclonotypes
# Returns a Polars DataFrame with cluster_id and aggregated duplicate_count
summary = summarize_metaclonotypes(repertoire, meta)
Functional diversity of the metaclonotype layer
from mir.common.metaclonotype import functional_diversity
# One-call wrapper: summarize → DiversitySummary
div = functional_diversity(repertoire, meta)
print(div.shannon) # Shannon entropy at the cluster level
print(div.chao1) # Chao1 estimator for cluster richness
print(div.gini_simpson) # Gini-Simpson index
Cross-repertoire functional overlap
from mir.common.metaclonotype import functional_overlap_1
# Fraction of metaclonotypes in rep_a that share a CDR3 identity with rep_b
overlap = functional_overlap_1(meta_a, meta_b, repertoire_a, repertoire_b)
TCRdist
TcrDist (mir.distances.tcrdist) computes the weighted V-gene + CDR3
alignment distance between TCR clonotypes, following the TCRdist3 metric
(Dash et al. 2017). All V-gene pairwise distances are pre-computed once
from full germline sequences; CDR3 alignment uses BLOSUM62 with a fixed-gap
C extension that releases the GIL for thread parallelism.
Gene input robustness: allele suffixes are optional for V/J genes in distance
calls. For example, TRBV19 and TRBJ2-7 are automatically interpreted as
TRBV19*01 and TRBJ2-7*01 before allele-indexed matrix lookup.
from mir.distances.tcrdist import TcrDist
from mir.common.clonotype import Clonotype
# Build once — loads OLGA library and pre-computes V-gene distances (~3–10 s)
td = TcrDist.from_defaults(
"TRB", "human",
w_v=1.0, w_j=0.0, w_cdr3=3.0,
fixed_gaps=(3, 4, -4, -3), # C-accelerated (default)
# fixed_gaps="Mid" → midpoint gap per pair (Python, ~330× slower)
# fixed_gaps=None → full BioPython DP (~780× slower)
)
cln1 = Clonotype(v_call="TRBV19*01", j_call="TRBJ2-7*01", junction_aa="CASSIRSSYEQYF")
cln2 = Clonotype(v_call="TRBV19*01", j_call="TRBJ2-7*01", junction_aa="CASSIRASYEQYF")
d = td.dist(cln1, cln2) # single pair
row = td.dist_one_to_many(cln1, refs) # (K,) array
mat = td.dist_matrix(queries, refs, n_jobs=4) # (N, K) matrix
Radius and metaclonotype discovery
from mir.basic.pgen import OlgaModel
model = OlgaModel(locus="TRB", species="human")
bg_seqs, _ = model.generate_sequences_counted(10_000, n_jobs=4, seed=42)
bg_clns = [Clonotype(junction_aa=s, locus="TRB") for s in bg_seqs]
# Median background distance for each query clonotype
radii = td.compute_radius(hits, bg_clns, percentile=50, n_jobs=4)
# Use the 25th percentile of the per-query radii as the clustering distance threshold
import numpy as np
threshold = float(np.percentile(radii, 25))
meta = td.find_metaclonotypes(rep, max_distance=threshold, n_jobs=4)
Performance (Apple M3, TRB, fixed_gaps=(3,4,-4,-3)):
28 M pairs/s at 1K–5K scale with n_jobs=1; ~76 M pairs/s with n_jobs=8.
See notebooks/tcrdist_analysis.ipynb for an influenza GILGFVFTL worked example.
CDR3 Motif Logos
mir.biomarkers.motif_logo builds IC and selection sequence logos for
CDR3 motifs, following Pogorelyy et al. 2019 (PMID:31194732). The key
idea is to subtract an OLGA-derived background for the same V-gene / J-gene /
CDR3-length bin, which collapses the germline signal and reveals only the
antigen-driven component.
import matplotlib.pyplot as plt
from mir.biomarkers.motif_logo import (
compute_pwm, compute_logo, get_vj_background,
build_terminal_anchored_pwm, load_motif_pwms, plot_logo,
)
motif_pwms = load_motif_pwms("motif_pwms.txt.gz") # OLGA backgrounds
seqs = ["CASSGRSYEQYF", "CASSGRTNEQYF", ...] # CDR3 sequences
bg = get_vj_background(
motif_pwms, v_call="TRBV19*01", j_call="TRBJ2-7*01",
length=13, species="HomoSapiens", gene="TRB",
)
pwm = compute_pwm(seqs)
logo = compute_logo(pwm, background=bg) # adds ic_height + bg_height columns
fig, ax = plt.subplots()
plot_logo(logo, ax, height_col="bg_height") # selection logo
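To make the two logo types concrete, here is a small standalone sketch of the usual definitions: information-content heights p · (log2 20 − H) per position, and background-relative heights p · log2(p / q) where q is the background frequency. Whether mirpy's ic_height and bg_height columns use exactly these formulas is an assumption; the function names below are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def position_frequencies(seqs: list[str]) -> list[dict[str, float]]:
    """Per-position amino-acid frequencies for equal-length sequences."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same length")
    columns = []
    for i in range(length):
        counts = Counter(s[i] for s in seqs)
        columns.append({aa: counts[aa] / len(seqs) for aa in AMINO_ACIDS})
    return columns


def ic_heights(column: dict[str, float]) -> dict[str, float]:
    """Letter heights scaled by the information content of the column."""
    entropy = -sum(p * math.log2(p) for p in column.values() if p > 0)
    ic = math.log2(len(AMINO_ACIDS)) - entropy
    return {aa: p * ic for aa, p in column.items()}


def selection_heights(column, background):
    """Per-letter relative-entropy terms; background must be > 0 where p > 0."""
    return {
        aa: (p * math.log2(p / background[aa]) if p > 0 else 0.0)
        for aa, p in column.items()
    }


columns = position_frequencies(["CAS", "CAT"])
uniform = {aa: 1 / len(AMINO_ACIDS) for aa in AMINO_ACIDS}
conserved = ic_heights(columns[0])["C"]
variable = selection_heights(columns[2], uniform)
```

With a uniform background the two definitions give the same total column height; a germline-matched background lowers the height of residues that are already expected from V(D)J recombination.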
For CDR3s of mixed lengths (different J-genes), use the terminal-anchored logo which anchors V-side and J-side blocks independently:
ta_pwm = build_terminal_anchored_pwm(seqs, n_term=8, c_term=7)
ta_logo = compute_logo(ta_pwm, background=bg)
For automated per-VJ-len logos from ALICE/TCRNET hit DataFrames use
build_motif_logos_vj. Background data (motif_pwms.txt.gz) is fetched
automatically by the notebook bootstrap helpers in mir.utils.notebook_assets.
See notebooks/motif_logos.ipynb for GILGFVFTL (Influenza A) and HLA-B27 AS
worked examples.
Clonotype Association Scan
mir.biomarkers.associations provides direct clonotype-to-metadata association
analysis for sample cohorts, including binary/multiclass labels, paired
clonotype support, and co-occurrence tests.
from mir.biomarkers.associations import (
AssociationParams,
associate_clonotype_metadata,
build_public_clonotype_panel,
)
# build candidate targets from public clonotypes in the cohort
targets = build_public_clonotype_panel(samples, locus="TRB", min_sample_fraction=0.03)
# sample-level Fisher test (auto-selects Fisher for binary labels)
res = associate_clonotype_metadata(
samples,
targets,
metadata_field="COVID_status",
metadata_value=["COVID", "healthy"],
params=AssociationParams(count_mode="sample", test="auto"),
)
# depth-aware mode (GLM with depth covariate) for uneven sequencing depth
res_depth = associate_clonotype_metadata(
samples,
targets,
metadata_field="COVID_status",
metadata_value=["COVID", "healthy"],
params=AssociationParams(count_mode="rearrangement", test="depth_glm"),
)
Returned table fields include per-group detected/background counts, p-values, BH-adjusted q-values, and odds-ratio estimates. Depth-aware mode reports a depth-adjusted group effect and falls back to table tests if GLM assumptions are not satisfied.
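The table-test part of this output can be reproduced by hand: a one-sided Fisher exact test on the 2×2 table of donors (group × clonotype detected), followed by Benjamini-Hochberg adjustment across clonotypes. The sketch below uses only the standard library and is not the mirpy code path; both function names are hypothetical.

```python
from math import comb


def fisher_one_sided(a: int, b: int, c: int, d: int) -> float:
    """P(X >= a) for the table [[a, b], [c, d]] under the hypergeometric null.

    a, b: group-1 donors with / without the clonotype;
    c, d: group-2 donors with / without the clonotype.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    favourable = sum(
        comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1)
    )
    return favourable / comb(n, col1)


def benjamini_hochberg(p_values: list[float]) -> list[float]:
    """BH-adjusted q-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values


p_demo = fisher_one_sided(3, 0, 0, 3)
q_demo = benjamini_hochberg([0.5, 0.01])
print(p_demo, q_demo)
```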
For an end-to-end COVID biomarker workflow (functional filtering, first-pass
batch correction, re-normalization, Fisher + depth-aware scans, and reference
concordance checks), see notebooks/covid19_biomarkers.ipynb and
tests/test_associations_covid19_benchmark.py.
ALICE and TCRNET
Both modules detect antigen-driven CDR3 clusters, but differ in how they model the background:
| Feature | ALICE | TCRNET |
|---|---|---|
| Background model | OLGA Pgen (analytical or MC pool) | Any MC control — real or synthetic |
| Pgen calls | OLGA 1mm Pgen (10M pool + fallback) | None |
| V/J restriction | match_mode="vj" (default) | match_mode="none" (default) |
| Statistics | Poisson | Binomial / Beta-Binomial |
| Selection correction | q_factor | q_factor (needed for synthetic controls) |
ALICE (mir.biomarkers.alice) implements the Pogorelyy et al. 2019
(PMID:31194732) Poisson enrichment test. Default is match_mode="vj" with
OLGA gene-usage conditioning: N and pgen are scaled by P_OLGA(V,J) so
that λ = N_total × pgen regardless of gene restriction, while the observed
count k is V/J-filtered. Uses a 10 M-sequence MC pool by default (the
paper uses 100 M, which requires ~17 GB; set mc_n_pool=100_000_000 if memory
allows) with fallback to OLGA analytical 1mm Pgen for rare sequences.
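The core of the test is a Poisson tail probability: given an expected neighbour count λ, how surprising is an observed count k? A minimal sketch follows, assuming that the selection factor multiplies λ (the text above does not spell out where q_factor enters, so that placement is an assumption) and using hypothetical function names.

```python
import math


def poisson_sf(k: int, lam: float) -> float:
    """P(X >= k) for X ~ Poisson(lam)."""
    if k <= 0:
        return 1.0
    term = math.exp(-lam)  # P(X = 0)
    cdf = term
    for i in range(1, k):
        term *= lam / i    # P(X = i) from P(X = i - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)


def enrichment_p_value(k_neighbours, n_total, pgen_1mm, q_factor=1.0):
    """Poisson enrichment p-value with lambda = N_total * pgen * q_factor."""
    lam = n_total * pgen_1mm * q_factor  # q_factor placement is an assumption
    return poisson_sf(k_neighbours, lam)


p_one = enrichment_p_value(1, 1_000_000, 1e-7)
p_three = enrichment_p_value(3, 1_000_000, 1e-7)
print(p_one, p_three)
```

Computing the tail as 1 − CDF loses precision for very small p-values; a production implementation would more likely use a dedicated survival function such as the one in SciPy.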
V/J-restricted neighbour counting uses a grouped-trie strategy: one small
trie per (V, J) gene group. This makes match_mode="vj" 1.5–2× faster
than match_mode="none" on natural repertoires (benchmark: 300 K sequences,
8 workers — unrestricted 9.9 s, V+J restricted 5.5 s).
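The benefit of grouping can be seen in a simplified stand-in that uses hash buckets instead of tries and counts Hamming-distance-1 neighbours only. Each CDR3 is indexed under every "one position removed" key within its (V, J) group, so sequences from different gene groups never meet. This is an illustration of the grouping idea, not mirpy's trie code; the function name is hypothetical.

```python
from collections import defaultdict


def count_neighbours_by_vj(clonotypes):
    """Count Hamming-1 neighbours within each (V, J) group.

    clonotypes: iterable of (v_call, j_call, junction_aa) tuples;
    exact duplicates are collapsed before counting.
    """
    unique = set(clonotypes)
    buckets = defaultdict(int)

    def keys(v, j, cdr3):
        # One key per masked position; prefix and suffix fix the length.
        for i in range(len(cdr3)):
            yield (v, j, i, cdr3[:i], cdr3[i + 1:])

    for clonotype in unique:
        for key in keys(*clonotype):
            buckets[key] += 1
    # Two distinct sequences at Hamming distance 1 share exactly one key.
    return {c: sum(buckets[k] - 1 for k in keys(*c)) for c in unique}


demo = [
    ("TRBV19", "TRBJ2-7", "CASSIRSSYEQYF"),
    ("TRBV19", "TRBJ2-7", "CASSIRASYEQYF"),
    ("TRBV20-1", "TRBJ2-7", "CASSIRASYEQYF"),
]
neighbour_counts = count_neighbours_by_vj(demo)
print(neighbour_counts)
```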
TCRNET (mir.biomarkers.tcrnet) is a purely MC-control algorithm
(Lupyr et al. 2025, Brief. Bioinform., PMID:40996146).
When used with a real control it captures V/J bias automatically. Pass
q_factor ≈ 3–5 when using a synthetic OLGA pool to correct for the
pre-thymic selection deficit. TCRNET with a 100 M synthetic pool,
match_mode="vj", and q_factor=Q is statistically equivalent to the
test described in the original ALICE paper.
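A control-based test of this kind can be sketched as a binomial tail: the neighbour probability is estimated from the control, and the observed neighbour count in the sample is compared against it. The estimator used below (control neighbours divided by control size, multiplied by q_factor and capped at 1) is an assumption made for illustration, and the function names are hypothetical.

```python
import math


def binomial_sf(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p), with terms computed in log space."""
    if k <= 0:
        return 1.0
    if p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0 if k <= n else 0.0
    log_p, log_q = math.log(p), math.log1p(-p)
    cdf = 0.0
    for i in range(min(k, n + 1)):
        log_term = (
            math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
            + i * log_p + (n - i) * log_q
        )
        cdf += math.exp(log_term)
    return max(0.0, 1.0 - cdf)


def control_enrichment_p(k_sample, n_sample, k_control, n_control, q_factor=1.0):
    """Binomial enrichment p-value against a control (assumed estimator)."""
    p_neighbour = min(1.0, q_factor * k_control / n_control)
    return binomial_sf(k_sample, n_sample, p_neighbour)


p_enriched = control_enrichment_p(5, 10_000, 20, 1_000_000)
p_corrected = control_enrichment_p(5, 10_000, 20, 1_000_000, q_factor=4.0)
print(p_enriched, p_corrected)
```

Raising q_factor increases the expected neighbour count, so the same observation becomes less significant.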
Prototype-based embeddings with TCREmp
TCREmp embeds immune receptor clonotypes as distance vectors to a fixed set
of prototype clonotypes, enabling rapid downstream analysis, dimensionality
reduction, and machine learning
(Kremlyakova et al. 2025, J. Mol. Biol., PMID:40368275).
Gene input robustness: allele suffixes are optional for v_call / j_call in
embedding input. Missing suffixes are normalized to *01 before matrix lookup
so TRBV5-1 behaves like TRBV5-1*01.
from mir.embedding.tcremp import TCREmp
from mir.common.clonotype import Clonotype
model = TCREmp.from_defaults("human", "TRB", n_prototypes=1000, junction_method="fixed_gap")
clonotypes = [
Clonotype(v_call="TRBV10-3*01", j_call="TRBJ2-7*01", junction_aa="CASSIRSSYEQYF"),
Clonotype(v_call="TRBV20-1*01", j_call="TRBJ1-1*01", junction_aa="CSARDSSYEQYF"),
]
X = model.embed(clonotypes) # shape (2, 3000) — float32 array
Column layout: [v_1, j_1, junc_1, v_2, j_2, junc_2, …, v_K, j_K, junc_K]
where each distance uses d(a, b) = s(a,a) + s(b,b) − 2·s(a,b).
For full DP alignment use junction_method="biopython" (~383× slower).
For custom prototypes use TCREmp.from_file("prototypes.tsv", ...).
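The score-to-distance conversion and the interleaved column layout can be shown with a toy similarity function. In the sketch below, s is a simple match/mismatch score over equal-length strings, standing in for the substitution-matrix alignment used by the library; the V and J components are represented by short placeholder strings, and all function names are hypothetical.

```python
def score(a: str, b: str, match: int = 4, mismatch: int = -1) -> int:
    """Toy similarity s(a, b) for equal-length strings."""
    if len(a) != len(b):
        raise ValueError("toy score requires equal-length strings")
    return sum(match if x == y else mismatch for x, y in zip(a, b))


def distance(a: str, b: str) -> int:
    """d(a, b) = s(a, a) + s(b, b) - 2 * s(a, b)."""
    return score(a, a) + score(b, b) - 2 * score(a, b)


def embed_one(clonotype: dict, prototypes: list[dict]) -> list[int]:
    """Row layout: [v_1, j_1, junc_1, v_2, j_2, junc_2, ...]."""
    row = []
    for proto in prototypes:
        for field in ("v", "j", "junction"):
            row.append(distance(clonotype[field], proto[field]))
    return row


# Placeholder strings, not real germline sequences.
query = {"v": "ACDE", "j": "FG", "junction": "CASSF"}
prototypes = [
    {"v": "ACDE", "j": "FG", "junction": "CATSF"},
    {"v": "ACDQ", "j": "FA", "junction": "CASSF"},
]
row = embed_one(query, prototypes)
print(row)
```

The conversion guarantees d(a, a) = 0 and symmetry, so each embedding coordinate behaves like a distance to one prototype component.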
Paired-chain embedding concatenates TRA and TRB embeddings per PairedClonotype:
from mir.embedding.tcremp import PairedTCREmp
paired_model = PairedTCREmp.from_defaults("human", "TRA_TRB", n_prototypes=500)
X_pair = paired_model.embed(paired_clonotypes)
n_jobs behaviour:
- n_jobs=None (default): auto-select based on len(clonotypes) × n_prototypes.
- n_jobs=1: force serial.
- n_jobs>1: force explicit worker count.
Mask and match sequences
from mir.basic.alphabets import (
aa_to_reduced, mask, matches, matches_aa_reduced, NT_MASK, AA_MASK,
)
nt_masked = mask("ATCGAT", (2, 5), NT_MASK)
assert nt_masked == b"ATNNNT"
aa = "CASTIV"
reduced = aa_to_reduced(aa)
# Matching ignores mask symbols: N (nucleotide) or X (amino acid)
assert matches(mask(aa, 0, AA_MASK), aa, AA_MASK)
assert matches_aa_reduced(aa, mask(reduced, 3, AA_MASK))
COVID-19 TCR Biomarker Notebooks
Three companion notebooks replicate and extend the findings of Vlasova et al. (2026) Genome Med. DOI:10.1186/s13073-025-01589-4 using 1 137 paired AIRR donors (761 COVID-19 / 376 healthy).
covid19_biomarkers.ipynb — Global Fisher scan and SVM classifier
- Public CDR3 panel: 4 093 TRB + 4 TRA candidates (≥ 5 % prevalence).
- 39 TRB + 4 TRA CDR3s reach FDR < 0.05 (BH, one-sided Fisher); all TRB hits are healthy-enriched (public clonotype dilution by SARS-specific expansion), all TRA hits are COVID-enriched.
- SVM classifier (RBF kernel, log-frequency features, 5-fold CV): AUC ≈ 0.70, replicating the paper's reported performance.
covid19_hla_biomarkers.ipynb — HLA × TCR stratification
- HLA allele–stratified sub-cohort Fisher tests (DRB1*16: n = 76, DQB1*05: n = 352).
- Focused TRBV12-3/CASS replication test (1 297 pre-specified candidates,
BH FDR within this set):
- CASSRTGTGSSYNSPLHF (TRBV12-3) — 26 COVID / 0 healthy DRB1*16 donors, log₂FE = 4.38, FDR = 0.035.
- 8 additional TRBV12-3 CDR3s with nc ≥ 5 / nh = 0.
- Global HLA × CDR3 scan (83 alleles × 43 significant CDR3s): CAGQLYGGSQGNLIF depleted in HLA-DPB1*02:01 donors (log₂FE = −1.51, q = 0.003).
covid19_pairing_biomarkers.ipynb — TRA × TRB co-occurrence and VDJdb overlap
- 156 TRA × TRB biomarker pairs tested (4 COVID-enriched TRA × 39 healthy-enriched TRB) across all-donor, COVID-only, and healthy-only strata.
- One significant negative co-occurrence in all donors: CALSEETSGSRLTF × CASSLGGGDTQYF (q = 0.027).
- Healthy-only positive co-occurrence: CAGQNYGGSQGNLIF with CASSLGETQYF (q = 0.001) and CASSPSTDTQYF (q = 0.013).
- VDJdb 2025-12 cross-validation (Hamming ≤ 1, V-gene fixed): 3 / 4 TRA CDR3s confirmed as SARS-CoV-2-specific; 2 (CAGQNYGGSQGNLIF, CAGQLYGGSQGNLIF) target the Spike epitope NCTFEYVSQPFLMDL via TRAV35*01 under HLA-DRB1*04:05 (class II); 15 / 39 TRB CDR3s match VDJdb records.
Copilot Agent Workflow
This repository ships a dedicated Copilot custom agent and companion prompt:
- Agent: .github/agents/mirpy-analysis.agent.md
- Companion prompt: .github/prompts/mirpy-analysis.prompt.md
Use /mirpy-analysis from chat to supply input data paths, optional metadata
schema, and workflow definition. The agent creates dedicated notebooks,
installs/validates dependencies, executes cells sequentially, and reports
outcomes. For large datasets it benchmarks small chunks first and asks before
any run expected to exceed ~10–15 min on 4–8 cores or ~12–16 GB RAM.
Resources
- Example notebooks: notebooks/
- API reference: https://antigenomics.github.io/mirpy/modules.html
- Notebook gallery: https://antigenomics.github.io/mirpy/examples.html
- Docs source: docs/
- Agent skill guide (Claude, GitHub Copilot): skills/mirpy/SKILL.md
- Benchmark baselines: benchmarks.md
Skill packaging note: skills/mirpy/ is the single source of truth for agent skills.
The mirpy install skills CLI command installs from this directory.
Editable installs (git clone + ./setup.sh) are required for mirpy install skills to work.
References
If you use mirpy in your work, please cite the relevant methods:
| Method / data | Citation |
|---|---|
| VDJtools diversity metrics | Shugay M et al. (2015) PLoS Comput. Biol. PMID:26606115 |
| ALICE enrichment | Pogorelyy MV et al. (2019) PLoS Biol. PMID:31194732 |
| TCRNET neighbourhood enrichment | Lupyr KR et al. (2025) Brief. Bioinform. PMID:40996146 |
| TCREmp prototype embeddings | Kremlyakova Y et al. (2025) J. Mol. Biol. PMID:40368275 |
| VDJdb antigen-specific TCR database | Shugay M et al. (2018) Nucleic Acids Res. PMID:28977646 |
| VDJdb SARS-CoV-2 update | Goncharov M et al. (2022) Nat. Methods PMID:35970936 |
| Antigen-specificity annotation framework | Pogorelyy MV & Shugay M (2019) Front. Immunol. PMID:31616409 |
| TCRdist (V-gene + CDR3 distance) | Dash P et al. (2017) Nature PMID:28636592 |
| CDR3 motif logos / selection logos | Pogorelyy MV et al. (2019) PLoS Biol. PMID:31194732 |
| T cell repertoire aging dynamics | Britanova OV et al. (2016) J. Immunol. PMID:27183615 |
| Pre-immune antigen-specific landscape | Pogorelyy MV et al. (2018) Genome Med. PMID:30144804 |
| COVID-19 TCR biomarker SVM classifier | Vlasova EK et al. (2026) Genome Med. DOI:10.1186/s13073-025-01589-4 |
Project status
The library is actively evolving. Some modules are more mature than others, and parts of the public API may still change.