how-to-learn-deep-learning-framework

May 20, 2026 · View on GitHub

Learning notes for deep learning framework internals.

This repository collects resources and notes about PyTorch, OneFlow, TorchScript, distributed training, autograd, memory management, operator development, and framework-level performance optimization.

Focus Areas

PyTorch internals: autograd, CUDA extension, data loading, memory management, AMP, TorchScript, Dynamo, AOTAutograd, and performance tuning.
OneFlow internals: execution model, operators, distributed tensors, runtime, VM, and CUDA kernels.
ML systems engineering: framework architecture, operator implementation, and training/runtime optimization.

Related Repositories

CUDA and GPU optimization: https://github.com/BBuf/how-to-optim-algorithm-in-cuda
Deep learning compiler notes: https://github.com/BBuf/tvm_mlir_learn

Status

This repository is a legacy learning archive. It remains public for reference, and public-facing documentation is written in English going forward.
