Post-Training Quantization of SSD PyTorch Model

October 7, 2025

This example demonstrates how to use the Post-Training Quantization API from the Neural Network Compression Framework (NNCF) to quantize PyTorch models, using SSD300_VGG16 from the torchvision library as the example model.

The example includes the following steps:

  • Loading the COCO128 dataset (~7 MB).
  • Loading SSD300_VGG16 from torchvision pretrained on the full COCO dataset.
  • Patching some internal methods with the no_nncf_trace context so that NNCF traces the model graph properly.
  • Quantizing the model using the NNCF Post-Training Quantization algorithm.
  • Reporting the following characteristics of the quantized model:
    • Accuracy drop of the quantized model (INT8) over the pre-trained model (FP32).
    • Compression rate of the quantized model file size relative to the pre-trained model file size.
    • Performance speedup of the quantized model (INT8).
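
The three reported characteristics are simple differences and ratios. The sketch below shows one plausible way to compute them; the function names and the sample numbers are hypothetical and are not taken from main.py, which may define or measure these values differently.

```python
def accuracy_drop(fp32_metric: float, int8_metric: float) -> float:
    """Accuracy lost by quantization: FP32 metric minus INT8 metric."""
    return fp32_metric - int8_metric


def compression_rate(fp32_size_bytes: float, int8_size_bytes: float) -> float:
    """How many times smaller the INT8 model file is than the FP32 file."""
    return fp32_size_bytes / int8_size_bytes


def speed_up(fp32_fps: float, int8_fps: float) -> float:
    """How many times faster the INT8 model runs, based on throughput."""
    return int8_fps / fp32_fps


if __name__ == "__main__":
    # Made-up illustrative values, NOT measured results from this example.
    print(f"Accuracy drop:    {accuracy_drop(0.52, 0.51):.3f}")
    print(f"Compression rate: {compression_rate(136.0, 34.0):.2f}x")
    print(f"Speedup:          {speed_up(10.0, 25.0):.2f}x")
```

A compression rate or speedup above 1.0 means the quantized model is smaller or faster, and a small positive accuracy drop is the usual price paid for it.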

Prerequisites

Before running this example, make sure Python 3.10 or later is installed, then set up your environment:

1. Create and activate a virtual environment

python3 -m venv nncf_env
source nncf_env/bin/activate  # On Windows: nncf_env\Scripts\activate.bat

2. Install NNCF and other dependencies

python3 -m pip install ../../../../ -r requirements.txt
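
To confirm the environment is ready before running the example, a small check along these lines may help. It is only a sketch: the helper names are hypothetical, and the package list (nncf, torch, torchvision) is an assumption based on the libraries named in this README, not the actual contents of requirements.txt.

```python
import importlib.util
import sys

REQUIRED_PYTHON = (3, 10)
# Assumed package list; requirements.txt may pull in more than this.
REQUIRED_PACKAGES = ("nncf", "torch", "torchvision")


def python_is_supported(version_info=sys.version_info, minimum=REQUIRED_PYTHON):
    """Return True if the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


def missing_packages(names=REQUIRED_PACKAGES):
    """Return the names of packages that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Python version OK:", python_is_supported())
    print("Missing packages:", missing_packages() or "none")
```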

Run Example

The example requires no additional preparation. Just run:

python main.py
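
For orientation, the core quantization step of a script like this can be sketched with the public nncf.Dataset and nncf.quantize calls. This is a minimal sketch under assumptions, not the contents of main.py: the names transform_fn and quantize_ssd, the shape of the data items, and the subset_size value are all hypothetical, and the method patching described above is omitted.

```python
def transform_fn(data_item):
    """Drop the targets from an (images, targets) pair.

    Calibration only needs model inputs, not labels. The (images, targets)
    layout is an assumption about the data loader.
    """
    images, _ = data_item
    return images


def quantize_ssd(calibration_loader, subset_size=128):
    """Quantize a pretrained SSD300_VGG16 with NNCF post-training quantization.

    calibration_loader is assumed to be an iterable of (images, targets) pairs.
    """
    # Imported lazily so transform_fn can be used without torch/NNCF installed.
    import nncf
    import torchvision

    model = torchvision.models.detection.ssd300_vgg16(weights="DEFAULT")
    model.eval()

    # Wrap the loader so NNCF receives only the model inputs.
    calibration_dataset = nncf.Dataset(calibration_loader, transform_fn)

    # Post-training quantization; subset_size=128 is an illustrative choice.
    return nncf.quantize(model, calibration_dataset, subset_size=subset_size)
```

Keeping the transform as a separate function means the same data loader can serve both calibration and validation, with labels discarded only where they are not needed.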