BEELINE: Benchmarking gEnE reguLatory network Inference from siNgle-cEll transcriptomic data

March 10, 2026

Overview of BEELINE

BEELINE is a benchmarking framework for evaluating gene regulatory network (GRN) inference algorithms on single-cell RNA-seq data. It runs algorithms via Docker containers, evaluates their output against a ground truth network, and produces summary plots.

Full documentation: https://murali-group.github.io/Beeline/

Setup

1. Install the conda environment

bash utils/setupAnacondaVENV.sh

2. Pull algorithm Docker images

utils/initialize.sh manages Docker images for all supported BEELINE algorithms. By default it pulls pre-built images from the grnbeeline DockerHub organisation. Pass --build to build images locally from source in Algorithms/ instead.

bash utils/initialize.sh [OPTIONS]
| Flag | Description |
| --- | --- |
| -b / --build | Build images locally from source instead of pulling from DockerHub. |
| -v / --verbose | Enable verbose Docker output. |
| --remove-local-images | Remove locally built BEELINE images. If combined with --build, images are removed and then rebuilt. |
| --remove-grnbeeline-images | Remove pulled DockerHub (grnbeeline) images. If combined with --build, images are removed and then rebuilt. |
| -h / --help | Display usage information and exit. |

3. Activate the environment

source ~/miniconda3/etc/profile.d/conda.sh
conda activate BEELINE

Usage

All three pipeline stages take a YAML configuration file via -c/--config.

1. Run algorithms — BLRunner.py

Runs one or more GRN inference algorithms on the specified datasets.

python BLRunner.py -c config-files/Curated/VSC.yaml

Each algorithm's output is written to outputs/<dataset_id>/<run_id>/<algorithm_id>/rankedEdges.csv.
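For downstream analysis it is often handy to load a rankedEdges.csv file directly. A minimal Python sketch, assuming the file is tab-separated with Gene1, Gene2, and EdgeWeight columns (check your algorithm's actual output format; the function name here is illustrative):

```python
import csv

def read_ranked_edges(path):
    """Read a rankedEdges.csv file into a list of (source, target, weight)
    tuples, sorted by descending edge weight."""
    edges = []
    with open(path, newline="") as fh:
        # Assumption: tab-separated with a Gene1/Gene2/EdgeWeight header row.
        reader = csv.DictReader(fh, delimiter="\t")
        for row in reader:
            edges.append((row["Gene1"], row["Gene2"], float(row["EdgeWeight"])))
    edges.sort(key=lambda e: -e[2])
    return edges
```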

2. Evaluate results — BLEvaluator.py

Computes evaluation metrics by comparing each algorithm's ranked edge list to the ground truth network.

python BLEvaluator.py -c config-files/Curated/VSC.yaml [flags]
| Flag | Metric |
| --- | --- |
| -a / --auc | AUPRC and AUROC |
| -e / --epr | Early precision ratio |
| -s / --sepr | Signed early precision (activation / inhibition) |
| -r / --spearman | Spearman correlation of predicted edge ranks |
| -j / --jaccard | Jaccard index of top-k predicted edges |
| -t / --time | Algorithm runtime |
| -m / --motifs | Network motif counts in top-k predicted networks |
| -p / --paths | Path length statistics on top-k predicted networks |
| -b / --borda | Borda-count edge aggregation across algorithms |
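As a rough illustration of the -e/--epr metric: the early precision ratio compares the precision of the top-k predicted edges (k = number of ground truth edges) against the precision a random predictor would achieve. A simplified sketch, with illustrative names (the BLEval implementation is authoritative):

```python
def early_precision_ratio(ranked_edges, true_edges, n_genes):
    """Early precision ratio: precision of the top-k predicted edges
    (k = |ground truth|) divided by the precision expected from a
    random predictor (the ground-truth network density)."""
    k = len(true_edges)
    top_k = set(ranked_edges[:k])
    early_precision = len(top_k & true_edges) / k
    # Random expectation: density of the ground truth over all possible
    # directed edges, excluding self-loops.
    random_precision = k / (n_genes * (n_genes - 1))
    return early_precision / random_precision
```

A ratio above 1 means the algorithm ranks true edges near the top more often than chance.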

3. Plot results — BLPlotter.py

Generates publication-style figures from evaluation output.

python BLPlotter.py -c config-files/Curated/VSC.yaml -o ./plots [flags]
| Flag | Output | Description |
| --- | --- | --- |
| -a / --auprc | AUPRC/<dataset>-AUPRC.{pdf,png} | Per-dataset AUPRC plots. One run: precision-recall curve. Multiple runs: box plots. |
| -r / --auroc | AUROC/<dataset>-AUROC.{pdf,png} | Per-dataset AUROC plots. One run: ROC curve. Multiple runs: box plots. |
| -e / --epr | EPR/<dataset>-EPR.{pdf,png} | Per-dataset box plot of early precision values per algorithm. |
| --summary | Summary.{pdf,png} | Heatmap of median AUPRC ratio and Spearman stability. |
| --epr-summary | EPRSummary.{pdf,png} | Heatmap of AUPRC ratio, EPR ratio, and signed EPR ratios. |
| --all | all of the above | Run all plots. |

Configuration

Config files are YAML and follow this structure:

input_settings:
    input_dir: "inputs/Curated"
    datasets:
        - dataset_id: "mHSC"
          nickname: "mHSC-E"      # optional: overrides dataset_id in plot labels
          groundTruthNetwork: "GroundTruthNetwork.csv"
          runs:
            - run_id: "mHSC-500-1"
            - run_id: "mHSC-500-2"

    algorithms:
        - algorithm_id: "GENIE3"
          image: "grnbeeline/arboreto:base"
          should_run: True
          params: {}

        - algorithm_id: "PPCOR"
          image: "grnbeeline/ppcor:base"
          should_run: True
          params:
              pVal: 0.01

output_settings:
    output_dir: "outputs"

input_settings

| Field | Required | Description |
| --- | --- | --- |
| input_dir | Yes | Base directory containing all input datasets. Can be absolute or relative to the working directory. |
| datasets | Yes | List of dataset groups. See Dataset fields below. |
| algorithms | Yes | List of algorithms to run. See Algorithm fields below. |

Dataset fields

| Field | Required | Default | Description |
| --- | --- | --- | --- |
| dataset_id | Yes | — | Name of the dataset group. Used as a subdirectory under input_dir. |
| should_run | No | [True] | Set to [False] to skip this dataset entirely. |
| groundTruthNetwork | No | GroundTruthNetwork.csv | Filename of the ground truth edge list CSV, located in the dataset group directory (shared across all runs). |
| nickname | No | dataset_id | Short display label used by the plotter for plot titles and heatmap column headers. Does not affect any file paths. |
| scan_run_subdirectories | No | false | When true, runs are discovered automatically by scanning all subdirectories of input_dir/dataset_id/. Mutually exclusive with runs; an error is raised if no subdirectories are found. |
| runs | No | — | List of individual run variants. Required unless scan_run_subdirectories is set. See Run fields below. |
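The scan_run_subdirectories behaviour can be approximated in a few lines. This is an illustrative sketch, not BEELINE's internal code:

```python
from pathlib import Path

def discover_runs(input_dir, dataset_id):
    """Mimic scan_run_subdirectories: each subdirectory of
    input_dir/dataset_id/ becomes one run_id."""
    dataset_dir = Path(input_dir) / dataset_id
    run_ids = sorted(p.name for p in dataset_dir.iterdir() if p.is_dir())
    if not run_ids:
        # BEELINE raises an error when no run subdirectories are found.
        raise ValueError(f"no run subdirectories found in {dataset_dir}")
    return run_ids
```

Note that non-directory entries (such as the shared ground truth CSV) are ignored by the scan.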

Run fields

Each entry under runs represents one replicate or condition variant. Input files are expected at input_dir/dataset_id/run_id/.

| Field | Required | Default | Description |
| --- | --- | --- | --- |
| run_id | Yes | — | Identifier for this run. Used as the subdirectory name within the dataset group directory. |
| exprData | No | ExpressionData.csv | Expression data filename, located in the run directory. |
| pseudoTimeData | No | PseudoTime.csv | Pseudotime data filename, located in the run directory. |

Algorithm fields

| Field | Required | Description |
| --- | --- | --- |
| algorithm_id | Yes | Algorithm name. Must match one of the supported identifiers (see Supported Algorithms). |
| image | Yes | Docker image name to run for this algorithm (e.g., "grnbeeline/genie3:base"). Use "local" for algorithms that run directly in the conda environment without Docker. See the Supported Algorithms table for default image names. |
| should_run | Yes | Set to True to run this algorithm, or False to skip it. |
| params | No | Dict of algorithm-specific parameters. Values are typically wrapped in a single-element list (e.g., pVal: [0.01]); the runner unwraps them automatically. |
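The single-element-list convention for params can be illustrated with a small sketch (illustrative only, not the runner's actual code):

```python
def unwrap_params(params):
    """Unwrap values like pVal: [0.01] to pVal: 0.01, leaving scalar
    values and multi-element lists untouched."""
    return {
        key: value[0] if isinstance(value, list) and len(value) == 1 else value
        for key, value in params.items()
    }
```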

output_settings

| Field | Required | Default | Description |
| --- | --- | --- | --- |
| output_dir | Yes | — | Base directory for all output files. Can be absolute or relative to the working directory. |
| experiment_id | No | — | When set, inserts an extra path segment between output_dir and the dataset path. Useful for keeping outputs from separate experiment runs (e.g., different parameter sweeps) in the same base directory without overwriting each other. |

Output files are written to:

output_dir/[experiment_id/]dataset_id/run_id/algorithm_id/rankedEdges.csv
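Building these paths programmatically is straightforward; a hypothetical helper that mirrors the layout above:

```python
from pathlib import Path

def ranked_edges_path(output_dir, dataset_id, run_id, algorithm_id,
                      experiment_id=None):
    """Build the path where an algorithm's ranked edge list is written,
    following output_dir/[experiment_id/]dataset_id/run_id/algorithm_id/."""
    parts = [output_dir]
    if experiment_id:
        # experiment_id is an optional extra segment directly under output_dir.
        parts.append(experiment_id)
    parts += [dataset_id, run_id, algorithm_id, "rankedEdges.csv"]
    return Path(*parts)
```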

Preparing Inputs — generateExpInputs.py

generateExpInputs.py is a preprocessing utility for filtering real scRNA-seq expression data down to a biologically meaningful gene subset before running the BEELINE pipeline. It reads a full expression matrix and a gene-ordering file (containing per-gene p-values and optionally variance), retains only genes that pass a significance threshold, and writes a filtered expression matrix and (optionally) a filtered ground truth network.

Basic usage

python generateExpInputs.py \
    -e ExpressionData.csv \
    -g GeneOrdering.csv \
    -f STRING-network.csv \
    -i human-tfs.csv \
    -p 0.01 \
    -n 500 \
    -o my-dataset

This produces my-dataset-ExpressionData.csv and my-dataset-network.csv in the working directory.

Arguments

| Flag | Default | Description |
| --- | --- | --- |
| -e / --expFile | ExpressionData.csv | Full expression matrix (genes × cells). Rows are genes (index column), columns are cells. |
| -g / --geneOrderingFile | GeneOrdering.csv | Gene ordering file indexed by gene name. First column must be a p-value; second column (optional) is per-gene variance used when --sort-variance is active. |
| -f / --netFile | (omit to skip) | Ground truth network CSV with Gene1 and Gene2 columns. When provided, the network is filtered to the retained gene set, self-loops and duplicate edges are removed, and the result is written alongside the expression output. |
| -i / --TFFile | human-tfs.csv | Single-column CSV of transcription factor names. Used to force-include significantly varying TFs regardless of the non-TF gene count limit. |
| -p / --pVal | 0.01 | Nominal p-value cutoff. Genes with a p-value at or above this threshold are excluded. Set to 0 to disable p-value filtering entirely. |
| -n / --numGenes | 500 | Number of non-TF genes to include after TFs have been separated out. Set to 0 to include TFs only. |
| -o / --outPrefix | BL- | Prefix for output filenames. Outputs are written as <prefix>-ExpressionData.csv and <prefix>-network.csv. |
| -c / --BFcorr | enabled | Apply Bonferroni correction to the p-value cutoff (divides -p by the number of tested genes). Disable with --no-BFcorr. |
| -t / --TFs | enabled | Force-include all TFs that pass the p-value cutoff, regardless of the -n gene count limit. Disable with --no-TFs. |
| -s / --sort-variance | enabled | Select the top -n non-TF genes by variance (highest first). Disable with --no-sort-variance to select by p-value rank instead. |

Gene selection logic

  1. Genes in the ordering file that are absent from the expression matrix are dropped, with a warning.
  2. The ordering file is sorted by p-value (ascending) and filtered to genes below the cutoff (Bonferroni-corrected if enabled).
  3. If --TFs is set, TFs that pass the cutoff are separated from the non-TF pool and kept unconditionally.
  4. Up to -n non-TF genes are selected — by variance (default) or by p-value rank.
  5. The final gene set is the union of the selected non-TF genes and the retained TFs, sorted alphabetically.
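The steps above can be sketched as a small function. This is a simplified reimplementation under the documented defaults, with illustrative names; generateExpInputs.py remains the reference:

```python
def select_genes(gene_stats, tfs, expr_genes, p_cutoff=0.01,
                 num_genes=500, bonferroni=True, sort_variance=True):
    """gene_stats: dict gene -> (p_value, variance). Returns the final,
    alphabetically sorted gene set following the documented selection logic."""
    # 1. Drop genes absent from the expression matrix.
    stats = {g: s for g, s in gene_stats.items() if g in expr_genes}
    # 2. Bonferroni-correct the cutoff and keep genes below it.
    cutoff = p_cutoff / len(stats) if bonferroni else p_cutoff
    passing = {g: s for g, s in stats.items() if s[0] < cutoff}
    # 3. Separate TFs, which are kept unconditionally.
    kept_tfs = {g for g in passing if g in tfs}
    non_tfs = [g for g in passing if g not in tfs]
    # 4. Select up to num_genes non-TF genes, by variance (descending)
    #    or by p-value rank (ascending).
    key = (lambda g: -passing[g][1]) if sort_variance else (lambda g: passing[g][0])
    selected = sorted(non_tfs, key=key)[:num_genes]
    # 5. Union of selected non-TFs and retained TFs, sorted alphabetically.
    return sorted(kept_tfs | set(selected))
```

With num_genes=0 the non-TF selection is empty, so only the retained TFs survive, matching the documented behaviour of -n 0.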

Supported Algorithms

| Algorithm | Default image | Runner file |
| --- | --- | --- |
| GENIE3 | grnbeeline/arboreto:base | BLRun/genie3Runner.py |
| GRISLI | grnbeeline/grisli:base | BLRun/grisliRunner.py |
| GRNBOOST2 | grnbeeline/arboreto:base | BLRun/grnboost2Runner.py |
| GRNVBEM | grnbeeline/grnvbem:base | BLRun/grnvbemRunner.py |
| JUMP3 | jump3:base | BLRun/jump3Runner.py |
| LEAP | grnbeeline/leap:base | BLRun/leapRunner.py |
| PEARSON | local | BLRun/pearsonRunner.py |
| PIDC | grnbeeline/pidc:base | BLRun/pidcRunner.py |
| PPCOR | grnbeeline/ppcor:base | BLRun/ppcorRunner.py |
| SCODE | grnbeeline/scode:base | BLRun/scodeRunner.py |
| SCRIBE | grnbeeline/scribe:base | BLRun/scribeRunner.py |
| SCSGL | scsgl:base | BLRun/scsglRunner.py |
| SINCERITIES | grnbeeline/sincerities:base | BLRun/sinceritiesRunner.py |
| SINGE | grnbeeline/singe:0.4.1 | BLRun/singeRunner.py |

Citation

If you use BEELINE in your research, please cite:

Pratapa, A., Jalihal, A.P., Law, J.N., Bharadwaj, A., Murali, T.M. (2020) "Benchmarking algorithms for gene regulatory network inference from single-cell transcriptomic data." Nature Methods, 17, 147–154.


Directory Structure

.
├── BLRunner.py             # Entry point: run algorithms
├── BLEvaluator.py          # Entry point: evaluate results
├── BLPlotter.py            # Entry point: generate plots
├── BLRun/                  # Algorithm runner classes
├── BLEval/                 # Evaluation metric implementations
├── BLPlot/                 # Plot generation implementations
├── config-files/           # YAML configuration files
├── inputs/                 # Input datasets
├── outputs/                # Algorithm outputs (mirrors inputs/ structure)
└── utils/
    ├── generateExpInputs.py    # Utility: filter expression data and network for a gene subset
    ├── initialize.sh           # Pull or build Docker images
    ├── setupAnacondaVENV.sh    # Create/update BEELINE conda environment
    └── environment.yml         # Conda environment specification

Use of Generative AI

For BEELINE v1.1, we prepared portions of this codebase and documentation with assistance from Claude Sonnet 4.6, an AI assistant developed by Anthropic. The authors have reviewed and approved all content and take full responsibility for its accuracy.