GraphPCA

May 7, 2026

GraphPCA is a novel graph-constrained, interpretable, and quasi-linear dimension-reduction method tailored for spatial transcriptomic data. It combines graph regularization with Principal Component Analysis (PCA) to extract low-dimensional embeddings of spatial transcriptomes that integrate location information in quasi-linear time.
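To make the idea concrete, here is a minimal NumPy sketch of graph-regularized PCA in general. It is not GraphPCA's actual code. It assumes the objective `||X - Z W^T||_F^2 + lam * tr(Z^T L Z)` with orthonormal loadings `W`, where `L` is the Laplacian of a spatial neighbor graph. The function name, the `lam` parameter, and the dense adjacency input are illustrative choices, and the dense solve is used for readability rather than speed.

```python
import numpy as np

def graph_pca_sketch(X, A, n_components=2, lam=0.5):
    """Graph-regularized PCA via its closed-form solution (illustrative only).

    X: (n_spots, n_genes) expression matrix.
    A: (n_spots, n_spots) symmetric adjacency matrix of the spatial graph.
    """
    X = X - X.mean(axis=0)                    # center each gene
    L = np.diag(A.sum(axis=1)) - A            # unnormalized graph Laplacian
    M = np.eye(X.shape[0]) + lam * L          # symmetric positive definite
    S = np.linalg.solve(M, X)                 # S = (I + lam*L)^{-1} X
    G = X.T @ S                               # symmetric (genes x genes)
    G = (G + G.T) / 2                         # remove round-off asymmetry
    _, evecs = np.linalg.eigh(G)              # eigenvalues in ascending order
    W = evecs[:, ::-1][:, :n_components]      # top eigenvectors are the loadings
    Z = S @ W                                 # spatially smoothed embedding
    return Z, W

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 4))
# Chain graph: spot i is a neighbor of spot i+1.
A = np.diag(np.ones(5), 1) + np.diag(np.ones(5), -1)
Z, W = graph_pca_sketch(X, A, n_components=2, lam=0.5)
print(Z.shape, W.shape)   # (6, 2) (4, 2)
```

With `lam=0` the sketch reduces to ordinary PCA on the centered matrix. Larger values pull the embeddings of neighboring spots closer together.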


🚀 Version 1.0.0: Multi-Engine Architecture

In the v1.0.0 milestone, GraphPCA introduces a comprehensive multi-engine architecture. You can now seamlessly switch between analytical exact solutions and iterative optimization solvers based on your dataset scale.

🛠️ Three Computational Modes

| Mode | Engine | Best For | Description |
| --- | --- | --- | --- |
| exact | NumPy/SciPy | Small/Standard Data | Default. Direct matrix inversion. Provides the exact analytical solution with zero iteration error. |
| iterative | Python PCG | Medium/Large Data | Preconditioned Conjugate Gradient (PCG) in pure Python. Balances memory efficiency and scalability. |
| accelerated | C++ Backend | Ultra-Large Data | High-performance C++ implementation (Eigen3). Optimized for million-level datasets with 5x-20x speedup. |
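The difference between a direct solve and a PCG solve can be sketched in a few lines of NumPy. This is a generic illustration, not the package's solver. The README only names "Preconditioned Conjugate Gradient", so the Jacobi (diagonal) preconditioner, the system matrix `I + lam * L`, and the function name `pcg` are all assumptions made for this example.

```python
import numpy as np

def pcg(A, b, tol=1e-10, max_iter=1000):
    """Jacobi-preconditioned conjugate gradient for a symmetric positive definite A."""
    m_inv = 1.0 / np.diag(A)              # Jacobi preconditioner: inverse of the diagonal
    x = np.zeros_like(b, dtype=float)
    r = b - A @ x                         # initial residual
    z = m_inv * r                         # preconditioned residual
    p = z.copy()                          # first search direction
    rz = r @ z
    b_norm = np.linalg.norm(b)
    for _ in range(max_iter):
        if np.linalg.norm(r) <= tol * b_norm:
            break                         # relative residual is small enough
        Ap = A @ p
        alpha = rz / (p @ Ap)             # step length along p
        x += alpha * p
        r -= alpha * Ap
        z = m_inv * r
        rz_new = r @ z
        p = z + (rz_new / rz) * p         # next conjugate direction
        rz = rz_new
    return x

# Assumed test system: identity plus a scaled chain-graph Laplacian.
n, lam = 50, 0.5
adj = np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1)
lap = np.diag(adj.sum(axis=1)) - adj
M = np.eye(n) + lam * lap
b = np.random.default_rng(0).normal(size=n)

x_exact = np.linalg.solve(M, b)           # direct solve, as in an "exact" style approach
x_iter = pcg(M, b)                        # iterative solve, only needs matrix-vector products
print(np.max(np.abs(x_exact - x_iter)))
```

The iterative version only touches the matrix through products `A @ p`, which is why this family of solvers can work with sparse graphs without forming an inverse.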

✨ Technical Highlights

  • Precision Control: The exact mode ensures the most rigorous results for traditional ST platforms.
  • Extreme Scalability: The accelerated mode uses zero-copy data transfer to the C++ backend, so the heavy computation runs in native code instead of the Python interpreter.
  • Graceful Degradation: The system automatically detects your environment. If the C++ dependencies are missing, it falls back to the Python modes so that installation does not fail.
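The graceful-degradation idea can be sketched with a guarded import. This is an assumption-laden illustration: the module name `hypothetical_gpca_cpp_backend` is made up, and the choice of `iterative` as the fallback target is a guess, since the README only says the package falls back to "Python modes".

```python
import importlib

def resolve_mode(requested, backend_module="hypothetical_gpca_cpp_backend",
                 fallback="iterative"):
    """Return the mode to run, downgrading if the compiled backend is missing."""
    if requested != "accelerated":
        return requested                  # pure-Python modes need no extra check
    try:
        importlib.import_module(backend_module)
    except ImportError:
        return fallback                   # compiled extension not available
    return "accelerated"

print(resolve_mode("exact"))         # exact
print(resolve_mode("accelerated"))   # iterative (the hypothetical module is not installed)
```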

📖 Tutorials

Interactive tutorials and documentation can be found here: https://graphpca-analyses.readthedocs.io/en/latest/index.html

⚠️ API Change Note: Since v0.2.1, Run_GPCA returns Z, W by default for memory efficiency. If your legacy code expects Z, W, ZW_log, please set return_log=True.
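If the same script has to run against both the old and the new return shape, one option is a small adapter. The helper below is hypothetical and not part of GraphPCA. It only assumes that the result is a sequence of two or three items, as described in the note above.

```python
def unpack_gpca_result(result):
    """Return (Z, W, ZW_log), with ZW_log set to None when it was not returned."""
    if len(result) == 3:
        return tuple(result)
    if len(result) == 2:
        Z, W = result
        return Z, W, None
    raise ValueError(f"expected 2 or 3 return values, got {len(result)}")

# Intended use (not executed here, since it needs real data):
# Z, W, ZW_log = unpack_gpca_result(Run_GPCA(adata, location, return_log=True))

# Stand-in values to show the behavior:
print(unpack_gpca_result(("Z", "W")))            # ('Z', 'W', None)
print(unpack_gpca_result(("Z", "W", "ZW_log")))  # ('Z', 'W', 'ZW_log')
```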


📦 Installation

1. Standard Installation (Pure Python)

Supports exact and iterative modes out of the box:

pip install st-graphpca

2. Accelerated Installation (C++ Backend)

Required for accelerated mode. Please install Eigen3 before installing GraphPCA:

# Recommended: Install Eigen via Conda
conda install -c conda-forge eigen

# Build GraphPCA with C++ extension
pip install --no-build-isolation st-graphpca
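Before building, it can help to confirm that the Eigen headers are visible. The sketch below is a hypothetical helper, not part of GraphPCA, and it says nothing about how the package's own build locates Eigen. It assumes the common layout in which the headers live under `include/eigen3` inside the Conda environment or a system prefix. On Windows, Conda typically uses a different directory, so adjust the prefixes accordingly.

```python
import os
from pathlib import Path

def find_eigen_include(prefixes):
    """Return the first directory that contains Eigen/Dense, or None."""
    for prefix in prefixes:
        for sub in ("include/eigen3", "include"):
            candidate = Path(prefix) / sub
            if (candidate / "Eigen" / "Dense").is_file():
                return candidate
    return None

# Search the active Conda environment first, then common system prefixes.
prefixes = [p for p in (os.environ.get("CONDA_PREFIX"), "/usr", "/usr/local") if p]
print(find_eigen_include(prefixes))
```

A result of `None` suggests the `conda install` step above has not been run in the active environment.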

💻 Usage Example

from GraphPCA import Run_GPCA

# Choose the mode that fits your data scale:
# 1. Exact solution (Default, for small datasets)
Z, W = Run_GPCA(adata, location, mode="exact")

# 2. Iterative solver (For large datasets, memory efficient)
Z, W = Run_GPCA(adata, location, mode="iterative")

# 3. C++ Accelerated solver (For million-level datasets)
Z, W = Run_GPCA(adata, location, mode="accelerated")
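To show how spot coordinates such as `location` can be turned into a graph constraint, here is a small NumPy sketch that builds a symmetric k-nearest-neighbor adjacency matrix. The README does not state how GraphPCA builds its spatial graph, so the k-nearest-neighbor construction, the function name `knn_adjacency`, and the default `k=6` are assumptions. The dense distance matrix is only suitable for small examples.

```python
import numpy as np

def knn_adjacency(coords, k=6):
    """Symmetric k-nearest-neighbor adjacency from (n_spots, n_dims) coordinates."""
    coords = np.asarray(coords, dtype=float)
    n = len(coords)
    k = min(k, max(n - 1, 0))                      # cannot have more neighbors than other spots
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))       # pairwise Euclidean distances
    np.fill_diagonal(dist, np.inf)                 # a spot is not its own neighbor
    nearest = np.argsort(dist, axis=1)[:, :k]      # indices of the k closest spots
    A = np.zeros((n, n))
    rows = np.repeat(np.arange(n), k)
    A[rows, nearest.ravel()] = 1.0                 # directed neighbor edges
    return np.maximum(A, A.T)                      # make the graph undirected

# Four spots on a line at x = 0, 1, 3, 7.
pts = np.array([[0.0, 0.0], [1.0, 0.0], [3.0, 0.0], [7.0, 0.0]])
print(knn_adjacency(pts, k=1))
```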

🛠️ Software Dependencies

  • numpy, pandas, scipy
  • matplotlib, scikit-learn
  • networkx, scanpy, squidpy
  • pybind11 (for C++ acceleration)

📅 Recent Changes

  • v1.0.0: Official Stable Release. Introduced the Three-Mode Engine architecture (exact, iterative, accelerated) for full-scale data compatibility.
  • v0.2.1: Improved build compatibility and enhanced graceful fallback for C++ modules.
  • v0.2.0: Major performance release featuring the C++ core and PCG solver.