Accelerated Python Tutorial

November 16, 2025

This modular tutorial covers all things related to accelerated Python.

Brev Launchables of this tutorial should use:

  • L40S, L4, or T4 instances (for non-distributed notebooks).
  • 4xL4 or 2xL4 instances (for distributed notebooks).
  • Crusoe or any other provider with Flexible Ports.

Syllabi

Notebooks

Fundamentals

#   Exercise                                  Link  Solution
01  NumPy Intro: ndarray Basics
02  NumPy Linear Algebra: SVD Reconstruction
03  NumPy to CuPy: ndarray Basics
04  NumPy to CuPy: SVD Reconstruction
05  Memory Spaces: Power Iteration
06  Asynchrony: Power Iteration
07  CUDA Core: Devices, Streams and Memory

Libraries

#   Exercise                            Link  Solution
20  cuDF: NYC Parking Violations
21  cudf.pandas: NYC Parking Violations
22  cuML
23  CUDA CCCL: Customizing Algorithms
24  nvmath-python: Interop
25  nvmath-python: Kernel Fusion
26  nvmath-python: Stateful APIs
27  nvmath-python: Scaling
28  PyNVML

Kernels

#   Exercise                           Link  Solution
40  Kernel Authoring: Copy
41  Kernel Authoring: Book Histogram
42  Kernel Authoring: Gaussian Blur
43  Kernel Authoring: Black and White

Distributed

#   Exercise  Link  Solution
60  mpi4py
61  Dask