Minitron Pruning
April 30, 2026
End-to-end tutorials for Minitron structured pruning followed by knowledge distillation, quantization, evaluation, and vLLM deployment.
Each subdirectory covers a specific source model and target size, including the full data blend, pruning config, distillation hyperparameters, evaluation results, and throughput benchmarks.