Minitron Pruning

April 30, 2026

End-to-end tutorials for Minitron structured pruning followed by knowledge distillation, quantization, evaluation, and vLLM deployment.

Each subdirectory covers a specific source model and target size, including the full data blend, pruning config, distillation hyperparameters, evaluation results, and throughput benchmarks.