TinyNN Quantization Support
March 12, 2025
Unsupported operators in PyTorch for static quantization
The following operators have no native quantized implementation in PyTorch (and possibly none in TFLite either). Some of them can still be translated to quantized TFLite through extra configuration.
| Operator | Minimum Supported PyTorch Version |
|---|---|
| abs | / |
| atan | / |
| atan2 | / |
| bmm | / |
| clamp_max | / |
| clamp_min | / |
| cos | / |
| elu | / |
| exp | / |
| glu | / |
| group_norm | / |
| hardsigmoid | / |
| instance_norm | / |
| layer_norm | / |
| log | / |
| log_softmax | / |
| matmul | / |
| mm | / |
| norm | / |
| pad | 1.7.0 |
| pow | / |
| prelu | / |
| reciprocal | / |
| rsqrt | / |
| silu | / |
| sin | / |
| softmax | / |
| sqrt | / |
| std | / |
| sum | / |
| torch.nn.ConstantPad1d | 1.7.0 |
| torch.nn.ConstantPad2d | 1.7.0 |
| torch.nn.ConstantPad3d | 1.7.0 |
| torch.nn.ConvTranspose2d | 1.7.0 |
| torch.nn.GLU | / |
| torch.nn.GRU | 1.13.0 |
| torch.nn.GroupNorm | / |
| torch.nn.Hardsigmoid | / |
| torch.nn.InstanceNorm1d | / |
| torch.nn.InstanceNorm2d | / |
| torch.nn.LSTM | 1.13.0 |
| torch.nn.LayerNorm | / |
| torch.nn.LogSoftmax | / |
| torch.nn.PReLU | / |
| torch.nn.RMSNorm | / |
| torch.nn.RNN | / |
| torch.nn.SiLU | / |
| torch.nn.Softmax | / |
| torch.nn.ZeroPad2d | 1.7.0 |
| truediv | / |
| var | / |
Extra flags for translating the above ops to quantized TFLite
| Operators | Notes |
|---|---|
| abs | For TFLiteConverter, set rewrite_quantizable=True |
| bmm | For TFLiteConverter, set rewrite_quantizable=True |
| clamp_max | For TFLiteConverter, set rewrite_quantizable=True |
| clamp_min | For TFLiteConverter, set rewrite_quantizable=True |
| elu | No action needed |
| glu | No action needed |
| log_softmax | For QATQuantizer/PostQuantizer, set config={"set_quantizable_op_stats": True}. For TFLiteConverter, set rewrite_quantizable=True |
| matmul | For TFLiteConverter, set rewrite_quantizable=True |
| prelu | No action needed |
| silu | No action needed |
| softmax | For QATQuantizer/PostQuantizer, set config={"set_quantizable_op_stats": True}. For TFLiteConverter, set rewrite_quantizable=True |
| sum | For TFLiteConverter, set rewrite_quantizable=True |
| torch.nn.GLU | No action needed |
| torch.nn.Hardsigmoid | No action needed |
| torch.nn.LayerNorm | No action needed |
| torch.nn.LogSoftmax | For QATQuantizer/PostQuantizer, set config={"set_quantizable_op_stats": True}. For TFLiteConverter, set rewrite_quantizable=True |
| torch.nn.PReLU | No action needed |
| torch.nn.RMSNorm | No action needed |
| torch.nn.SiLU | No action needed |
| torch.nn.Softmax | For QATQuantizer/PostQuantizer, set config={"set_quantizable_op_stats": True}. For TFLiteConverter, set rewrite_quantizable=True |
| truediv | For TFLiteConverter, set rewrite_quantizable=True |
| {sqrt, reciprocal} | For TFLiteConverter, set rewrite_quantizable=True |
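
The notes in this table can be encoded as a small lookup, for example to work out which flags a model needs before it is quantized and converted. The following is a minimal sketch and not part of TinyNN: the helper `flags_for_ops` and the three set names are hypothetical, and only the flag names `set_quantizable_op_stats` and `rewrite_quantizable` come from the table. The `{sqrt, reciprocal}` pattern is left out for brevity because it describes a pair of operators rather than a single one.

```python
# Operators whose row says: for TFLiteConverter, set rewrite_quantizable=True.
NEEDS_REWRITE_QUANTIZABLE = {
    "abs", "bmm", "clamp_max", "clamp_min", "log_softmax", "matmul",
    "softmax", "sum", "truediv", "torch.nn.LogSoftmax", "torch.nn.Softmax",
}

# Operators whose row additionally says: for QATQuantizer/PostQuantizer,
# set config={"set_quantizable_op_stats": True}.
NEEDS_OP_STATS = {
    "log_softmax", "softmax", "torch.nn.LogSoftmax", "torch.nn.Softmax",
}

# Operators whose row says "No action needed".
NO_ACTION = {
    "elu", "glu", "prelu", "silu", "torch.nn.GLU", "torch.nn.Hardsigmoid",
    "torch.nn.LayerNorm", "torch.nn.PReLU", "torch.nn.RMSNorm", "torch.nn.SiLU",
}


def flags_for_ops(ops):
    """Return (quantizer_config, converter_kwargs) for the given operator names.

    Hypothetical helper: raises KeyError for names that are not in the table.
    """
    used = set(ops)
    unknown = sorted(used - (NEEDS_REWRITE_QUANTIZABLE | NO_ACTION))
    if unknown:
        raise KeyError(f"not listed in the table: {unknown}")

    quantizer_config = {}
    converter_kwargs = {}
    if used & NEEDS_OP_STATS:
        quantizer_config["set_quantizable_op_stats"] = True
    if used & NEEDS_REWRITE_QUANTIZABLE:
        converter_kwargs["rewrite_quantizable"] = True
    return quantizer_config, converter_kwargs


config, kwargs = flags_for_ops(["softmax", "silu"])
print(config)
print(kwargs)
```

Following the table, the first dictionary would be passed as `config=` to `QATQuantizer` or `PostQuantizer`, and the second as keyword arguments to `TFLiteConverter`; the other constructor arguments are not shown here.
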
Supported fusion rules for static quantization
| Operators | Notes |
|---|---|
| {add, clamp} | |
| {add, relu6} | |
| {add, torch.nn.ReLU6} | |
| {torch.nn.BatchNorm2d, clamp} | |
| {torch.nn.BatchNorm2d, torch.nn.Conv2d} | PTQ only. |
| {torch.nn.BatchNorm2d, torch.nn.Conv2d, torch.nn.ReLU} | PTQ only. |
| {torch.nn.BatchNorm2d, torch.nn.ReLU} | |
| {torch.nn.BatchNorm2d, torch.nn.ReLU6} | |
| {torch.nn.BatchNorm3d, torch.nn.ReLU} | |
| {torch.nn.BatchNorm3d, torch.nn.ReLU6} | |
| {torch.nn.Conv1d, torch.nn.BatchNorm1d} | |
| {torch.nn.Conv1d, torch.nn.BatchNorm1d, torch.nn.ReLU} | |
| {torch.nn.Conv1d, torch.nn.BatchNorm1d, torch.nn.ReLU6} | |
| {torch.nn.Conv1d, torch.nn.ReLU} | |
| {torch.nn.Conv1d, torch.nn.ReLU6} | |
| {torch.nn.Conv2d, clamp} | |
| {torch.nn.Conv2d, torch.nn.BatchNorm2d} | |
| {torch.nn.Conv2d, torch.nn.BatchNorm2d, clamp} | |
| {torch.nn.Conv2d, torch.nn.BatchNorm2d, torch.nn.ReLU} | |
| {torch.nn.Conv2d, torch.nn.BatchNorm2d, torch.nn.ReLU6} | |
| {torch.nn.Conv2d, torch.nn.ReLU} | |
| {torch.nn.Conv2d, torch.nn.ReLU6} | |
| {torch.nn.Conv3d, torch.nn.BatchNorm3d} | |
| {torch.nn.Conv3d, torch.nn.BatchNorm3d, torch.nn.ReLU} | |
| {torch.nn.Conv3d, torch.nn.BatchNorm3d, torch.nn.ReLU6} | |
| {torch.nn.Conv3d, torch.nn.ReLU} | |
| {torch.nn.Conv3d, torch.nn.ReLU6} | |
| {torch.nn.ConvTranspose1d, torch.nn.BatchNorm1d} | PTQ only. Only PyTorch 1.11.0+ is supported. |
| {torch.nn.ConvTranspose2d, clamp} | |
| {torch.nn.ConvTranspose2d, torch.nn.BatchNorm2d} | |
| {torch.nn.ConvTranspose2d, torch.nn.BatchNorm2d, clamp} | |
| {torch.nn.ConvTranspose2d, torch.nn.BatchNorm2d, torch.nn.ReLU} | |
| {torch.nn.ConvTranspose2d, torch.nn.BatchNorm2d, torch.nn.ReLU6} | |
| {torch.nn.ConvTranspose2d, torch.nn.ReLU} | |
| {torch.nn.ConvTranspose2d, torch.nn.ReLU6} | |
| {torch.nn.ConvTranspose3d, torch.nn.BatchNorm3d} | PTQ only. Only PyTorch 1.11.0+ is supported. |
| {torch.nn.Linear, clamp} | |
| {torch.nn.Linear, torch.nn.BatchNorm1d} | For PTQ, only PyTorch 1.8.0+ is supported. |
| {torch.nn.Linear, torch.nn.BatchNorm1d, clamp} | |
| {torch.nn.Linear, torch.nn.BatchNorm1d, torch.nn.ReLU6} | |
| {torch.nn.Linear, torch.nn.ReLU} | |
| {torch.nn.Linear, torch.nn.ReLU6} | |
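
To check by hand which parts of a model fall under these rules, the table can be treated as a set of operator sequences. The sketch below is only an illustration under stated assumptions: it assumes each rule lists its operators in execution order (the table contains both `{torch.nn.Conv2d, torch.nn.BatchNorm2d}` and `{torch.nn.BatchNorm2d, torch.nn.Conv2d}`, which suggests that order matters), it copies only a handful of rows, and the greedy longest-match scan in the hypothetical `find_fusable` helper is not claimed to be how TinyNN itself applies fusion.

```python
# A subset of the fusion rules from the table, written as ordered tuples.
# Each value is the text of the Notes column ("" when the row has no note).
FUSION_RULES = {
    ("torch.nn.Conv2d", "torch.nn.BatchNorm2d"): "",
    ("torch.nn.Conv2d", "torch.nn.BatchNorm2d", "torch.nn.ReLU"): "",
    ("torch.nn.Conv2d", "torch.nn.ReLU"): "",
    ("torch.nn.BatchNorm2d", "torch.nn.Conv2d"): "PTQ only.",
    ("torch.nn.Linear", "torch.nn.ReLU"): "",
    ("add", "relu6"): "",
}


def find_fusable(ops):
    """Scan `ops` left to right, taking the longest listed rule at each position.

    Returns a list of (start_index, matched_rule, note) tuples.
    """
    max_len = max(len(rule) for rule in FUSION_RULES)
    matches = []
    i = 0
    while i < len(ops):
        # Try the longest candidate first, down to a pair of operators.
        for n in range(min(max_len, len(ops) - i), 1, -1):
            candidate = tuple(ops[i:i + n])
            if candidate in FUSION_RULES:
                matches.append((i, candidate, FUSION_RULES[candidate]))
                i += n
                break
        else:
            # No rule starts at this position; move on by one operator.
            i += 1
    return matches


model_ops = [
    "torch.nn.Conv2d", "torch.nn.BatchNorm2d", "torch.nn.ReLU",
    "torch.nn.Linear", "torch.nn.ReLU",
]
for start, rule, note in find_fusable(model_ops):
    print(start, rule, note)
```

Taking the longest match first means a Conv2d, BatchNorm2d, ReLU run is reported as one three-operator group rather than as a Conv2d and BatchNorm2d pair followed by a stray ReLU.
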