NKD and USKD

October 10, 2023

ICCV 2023 Paper: From Knowledge Distillation to Self-Knowledge Distillation: A Unified Approach with Normalized Loss and Customized Soft Labels

[Figure: architecture]

Train

# single GPU
python tools/train.py configs/distillers/imagenet/res18_sd_img.py

# multi GPU
bash tools/dist_train.sh configs/distillers/imagenet/res34_distill_res18_img.py 8

Transfer

# Transfer the distillation model into an mmcls model
python pth_transfer.py --dis_path $dis_ckpt --output_path $new_mmcls_ckpt
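
The transfer step turns a distillation checkpoint into one that the plain mmcls configs can load. Below is a minimal sketch of the kind of key remapping such a script might perform. It assumes, without confirmation from this README, that the distiller stores the student's weights under a `student.` prefix. The function name `extract_student_state` and the prefix are hypothetical, and the actual `pth_transfer.py` may work differently.

```python
from collections import OrderedDict


def extract_student_state(state_dict, prefix="student."):
    """Keep only entries under `prefix` and strip the prefix from their keys.

    Assumption: the distiller checkpoint namespaces student weights with
    `prefix`; teacher weights and any other entries are dropped.
    """
    student = OrderedDict()
    for name, value in state_dict.items():
        if name.startswith(prefix):
            student[name[len(prefix):]] = value
    return student


# Toy stand-in for a checkpoint's state dict (real values would be tensors).
toy = OrderedDict([
    ("student.backbone.conv1.weight", 1),
    ("student.head.fc.bias", 2),
    ("teacher.backbone.conv1.weight", 3),
])
converted = extract_student_state(toy)
print(list(converted))
```

In a real run the dictionary would presumably come from the loaded `$dis_ckpt` file, and the result would be saved to `$new_mmcls_ckpt`.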

Test

# single GPU
python tools/test.py configs/resnet/resnet18_8xb32_in1k.py $new_mmcls_ckpt --metrics accuracy

# multi GPU
bash tools/dist_test.sh configs/resnet/resnet18_8xb32_in1k.py $new_mmcls_ckpt 8 --metrics accuracy
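
The three steps above (train, transfer, test) run in sequence, so they can be scripted. The sketch below only assembles the multi-GPU commands from this README as argument lists. The helper name `build_commands` and the checkpoint file names are hypothetical, and where the training run writes its checkpoint is not specified here, so that path must be supplied by the caller.

```python
import shlex


def build_commands(dis_ckpt, new_mmcls_ckpt, gpus=8):
    """Return the README's multi-GPU train, transfer, and test commands as argv lists."""
    train = ["bash", "tools/dist_train.sh",
             "configs/distillers/imagenet/res34_distill_res18_img.py", str(gpus)]
    transfer = ["python", "pth_transfer.py",
                "--dis_path", dis_ckpt, "--output_path", new_mmcls_ckpt]
    test = ["bash", "tools/dist_test.sh",
            "configs/resnet/resnet18_8xb32_in1k.py", new_mmcls_ckpt, str(gpus),
            "--metrics", "accuracy"]
    return [train, transfer, test]


# Hypothetical checkpoint paths, for illustration only.
commands = build_commands("distill.pth", "resnet18_nkd.pth")
for argv in commands:
    # Printing keeps this a dry run; subprocess.run(argv, check=True) would execute it.
    print(shlex.join(argv))
```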

Results

NKD

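| Model | Teacher | Baseline (Top-1 Acc) | +NKD (Top-1 Acc) | dis_config | weight |
| --- | --- | --- | --- | --- | --- |
| ResNet18 | ResNet34 | 69.90 | 71.96 (+2.06) | config | baidu / one drive |
| MobileNet | ResNet50 | 69.21 | 72.58 (+3.37) | config | baidu / one drive |
| DeiT-Tiny | DeiT III-Small | 74.42 | 76.68 (+2.26) | config | |
| DeiT-Base | DeiT III-Large | 81.76 | 84.96 (+3.20) | config | |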

USKD

| Model | Baseline (Top-1 Acc) | +tf-NKD (Top-1 Acc) | dis_config |
| --- | --- | --- | --- |
| MobileNet | 69.21 | 70.38 (+1.17) | config |
| MobileNetV2 | 71.86 | 72.41 (+0.55) | config |
| ShuffleNetV2 | 69.55 | 70.30 (+0.75) | config |
| ResNet18 | 69.90 | 70.79 (+0.89) | config |
| ResNet50 | 76.55 | 77.07 (+0.52) | config |
| ResNet101 | 77.97 | 78.54 (+0.57) | config |
| RegNetX-1.6GF | 76.84 | 77.30 (+0.46) | config |
| Swin-Tiny | 81.18 | 81.49 (+0.31) | config |
| DeiT-Tiny | 74.42 | 74.97 (+0.55) | config |
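
The gain in parentheses in both results tables is the distilled Top-1 accuracy minus the baseline. As a quick consistency check, the sketch below recomputes each gain from the numbers copied out of the NKD and USKD tables and lists any row whose reported gain does not match. The row labels and the helper name `gain` are only illustrative.

```python
# (label, baseline top-1, distilled top-1, reported gain), copied from the tables.
rows = [
    ("NKD ResNet18", 69.90, 71.96, 2.06),
    ("NKD MobileNet", 69.21, 72.58, 3.37),
    ("NKD DeiT-Tiny", 74.42, 76.68, 2.26),
    ("NKD DeiT-Base", 81.76, 84.96, 3.20),
    ("USKD MobileNet", 69.21, 70.38, 1.17),
    ("USKD MobileNetV2", 71.86, 72.41, 0.55),
    ("USKD ShuffleNetV2", 69.55, 70.30, 0.75),
    ("USKD ResNet18", 69.90, 70.79, 0.89),
    ("USKD ResNet50", 76.55, 77.07, 0.52),
    ("USKD ResNet101", 77.97, 78.54, 0.57),
    ("USKD RegNetX-1.6GF", 76.84, 77.30, 0.46),
    ("USKD Swin-Tiny", 81.18, 81.49, 0.31),
    ("USKD DeiT-Tiny", 74.42, 74.97, 0.55),
]


def gain(baseline, distilled):
    """Top-1 improvement in percentage points, rounded as in the tables."""
    return round(distilled - baseline, 2)


# Rows where the recomputed gain differs from the reported one.
mismatches = [(name, gain(b, d), g) for name, b, d, g in rows if gain(b, d) != g]
print(mismatches)
```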

Citation

@inproceedings{yang2023knowledge,
  title={From Knowledge Distillation to Self-Knowledge Distillation: A Unified Approach with Normalized Loss and Customized Soft Labels},
  author={Yang, Zhendong and Zeng, Ailing and Yuan, Chun and Li, Yu},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={17185--17194},
  year={2023}
}