Python-related notes

December 31, 2025

License: MIT · PRs Welcome

Python learning and data analysis resources. Please contribute and get in touch! See the MDmisc notes for other programming and genomics-related notes.

Table of contents

General

  • An Effective Python Environment: Making Yourself at Home

  • Advanced Jupyter Notebooks: A Tutorial - detailed and illustrated guide

  • NumPy - history and principles of array programming for the Python language. Vectorized calculations on arrays, including arithmetic, statistics, and trigonometry. Broadcasting (arrays with differing shapes are expanded to a common shape). Coupling with SciPy and Matplotlib.

    Paper Harris, Charles R., K. Jarrod Millman, Stéfan J. Van Der Walt, Ralf Gommers, Pauli Virtanen, David Cournapeau, Eric Wieser et al. "Array programming with NumPy." Nature 585, no. 7825 (2020): 357-362. https://doi.org/10.1038/s41586-020-2649-2

  • SciPy - scientific Python library: development, history, and algorithms (signal/image processing, plotting, integrals, ODE solvers, optimization, genetic algorithms, splines, parallel programming, and many more). Started in 2001, over 100K dependent repositories. Includes 16 data packages (Box 2). Documentation.

    Paper SciPy 1.0 Contributors, Pauli Virtanen, Ralf Gommers, Travis E. Oliphant, Matt Haberland, Tyler Reddy, David Cournapeau, et al. “SciPy 1.0: Fundamental Algorithms for Scientific Computing in Python.” Nature Methods, February 3, 2020. https://doi.org/10.1038/s41592-019-0686-2.
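
The vectorization and broadcasting ideas from the NumPy entry, and the way SciPy builds on NumPy arrays, can be illustrated with a minimal sketch. The variable names and the toy quadratic function are assumptions chosen for illustration and are not taken from either paper.

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Vectorized trigonometry and statistics: no explicit Python loops.
angles = np.linspace(0.0, np.pi, 5)   # 0, pi/4, pi/2, 3*pi/4, pi
sines = np.sin(angles)                # element-wise sine
mean_sine = sines.mean()              # statistic over the whole array

# Broadcasting: shapes (3, 1) and (4,) are expanded to a common (3, 4) shape.
column = np.arange(3).reshape(3, 1)   # [[0], [1], [2]]
row = np.arange(4)                    # [0, 1, 2, 3]
table = column * 10 + row             # shape (3, 4)

# SciPy optimization: minimize a hypothetical one-dimensional quadratic.
result = minimize_scalar(lambda x: (x - 2.0) ** 2)

print(table)
print(mean_sine, result.x)
```

Here `table` holds every pairwise combination of `column * 10` and `row` without either array being tiled by hand, and `result.x` should land close to 2, the minimum of the toy function.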

Cheatsheets

Courses

Data Analysis

Visualization

  • Cosmograph - large graph visualization, machine learning embeddings. Examples, Docs, Python package

  • Clust - Python script for gene clustering that does not require every gene to be assigned to a cluster. Also clusters across multiple datasets to find similar patterns. Time-course clustering. Outperforms seven clustering techniques (cross-clustering, k-means, SOM, MCL, HC, Click, WGCNA) on seven metrics (Davies-Bouldin, BIC, silhouette, Calinski-Harabasz, Ball-Hall, Xu, within-between indices).

    Paper Abu-Jamous, Basel, and Steven Kelly. “Clust: Automatic Extraction of Optimal Co-Expressed Gene Clusters from Gene Expression Data.” Genome Biology 19, no. 1 (December 2018). https://doi.org/10.1186/s13059-018-1536-8

Projects

Python Misc