ImageNet Pretrained Model Zoo

January 15, 2025


1. Model Zoo Overview

Based on the ImageNet1k classification dataset, PaddleClas supports 37 families of classification network architectures and 217 corresponding image classification pretrained models. Training tricks, a brief introduction to each architecture family, and its performance evaluation are presented in the corresponding sections. All speed metrics below were measured in the following environments:

  • Arm CPU results were measured on a Snapdragon 855 (SD855).
  • Intel CPU results were measured on an Intel(R) Xeon(R) Gold 6148.
  • GPU results were measured on a V100 machine with FP32 + TensorRT 8.0.3.4, averaged over 2100 runs (excluding the first 100 warmup runs).
  • FLOPs and Params were computed with paddle.flops() (PaddlePaddle 2.2).
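As a rough illustration of how a FLOPs counter such as `paddle.flops()` arrives at these numbers, the sketch below hand-counts multiply-accumulate FLOPs and parameters for a single convolution layer, the dominant cost in the CNNs listed here. The layer shapes are hypothetical; the real utility walks the whole network and handles many more layer types.

```python
# Hand-count FLOPs and params for one Conv2D layer (hypothetical shapes).
# paddle.flops() performs an analogous per-layer count over the whole model.

def conv2d_cost(in_c, out_c, k, out_h, out_w, bias=True):
    """Return (flops, params) for a k x k conv producing out_h x out_w feature maps."""
    params = out_c * in_c * k * k + (out_c if bias else 0)
    # One multiply-add per kernel element, per input channel, per output position.
    flops = out_h * out_w * out_c * in_c * k * k
    return flops, params

# A ResNet-style stem: 3 -> 64 channels, 7x7 kernel, 112x112 output.
flops, params = conv2d_cost(in_c=3, out_c=64, k=7, out_h=112, out_w=112)
print(f"FLOPs: {flops / 1e9:.3f} G, Params: {params / 1e6:.4f} M")
```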

The accuracy-versus-latency curves of common server-side CNN models are shown below. Accuracy is Top-1 Acc on the ImageNet1k dataset; latency was measured on GPU with batch size 1 and FP32 precision.

The accuracy-versus-latency curves of common mobile CNN models are shown below. Accuracy is Top-1 Acc on the ImageNet1k dataset; latency was measured on Arm with batch size 1 and FP32 precision.

The accuracy-versus-latency curves of selected Vision Transformer models are shown below. Accuracy is Top-1 Acc on the ImageNet1k dataset; latency was measured on GPU with batch size 1 and FP32 precision.
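The latency protocol used for the curves and tables (many timed iterations, with the initial warmup runs discarded) can be sketched as below. `run_inference` is a hypothetical stand-in for the actual TensorRT-backed predictor call:

```python
import time

def benchmark(run_inference, total_runs=2100, warmup_runs=100):
    """Average per-run latency in ms, discarding the first warmup_runs iterations."""
    timings = []
    for _ in range(total_runs):
        start = time.perf_counter()
        run_inference()
        timings.append((time.perf_counter() - start) * 1000.0)
    # Drop warmup iterations, during which kernels and caches are still initializing.
    steady = timings[warmup_runs:]
    return sum(steady) / len(steady)

# Usage with a dummy workload standing in for predictor.run(batch):
latency_ms = benchmark(lambda: sum(range(1000)), total_runs=50, warmup_runs=5)
print(f"avg latency: {latency_ms:.4f} ms")
```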

2. SSLD Knowledge Distillation Pretrained Models

The pretrained models obtained with SSLD knowledge distillation are listed below. For more details on the SSLD distillation scheme, see the SSLD knowledge distillation documentation.
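SSLD trains the student network on a teacher's soft predictions rather than only the hard ground-truth labels. As a minimal sketch of the general soft-label idea (not the exact SSLD objective, which is described in the linked documentation), a NumPy soft-label cross-entropy could look like:

```python
import numpy as np

def softmax(logits, temperature=1.0):
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def soft_label_loss(student_logits, teacher_logits, temperature=1.0):
    """Cross-entropy of student predictions against the teacher's soft labels."""
    p_teacher = softmax(teacher_logits, temperature)
    log_p_student = np.log(softmax(student_logits, temperature) + 1e-12)
    return float(-(p_teacher * log_p_student).sum(axis=-1).mean())

# A confident teacher, one student that agrees with it and one that does not:
teacher = np.array([[6.0, 1.0, 0.0]])
agreeing = np.array([[5.0, 1.0, 0.5]])
disagreeing = np.array([[0.0, 5.0, 1.0]])
print(soft_label_loss(agreeing, teacher), soft_label_loss(disagreeing, teacher))
```

The agreeing student incurs a much smaller loss, which is what drives the student toward the teacher's output distribution.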

2.1 Server-side distilled models

| Model | Top-1 Acc | Reference Top-1 Acc | Acc gain | time(ms) bs=1 | time(ms) bs=4 | time(ms) bs=8 | FLOPs(G) | Params(M) | Pretrained model | Inference model |
|---|---|---|---|---|---|---|---|---|---|---|
| ResNet34_vd_ssld | 0.797 | 0.760 | 0.037 | 2.00 | 3.28 | 5.84 | 3.93 | 21.84 | Download | Download |
| ResNet50_vd_ssld | 0.830 | 0.792 | 0.039 | 2.60 | 4.86 | 7.63 | 4.35 | 25.63 | Download | Download |
| ResNet101_vd_ssld | 0.837 | 0.802 | 0.035 | 4.43 | 8.25 | 12.60 | 8.08 | 44.67 | Download | Download |
| Res2Net50_vd_26w_4s_ssld | 0.831 | 0.798 | 0.033 | 3.59 | 6.35 | 9.50 | 4.28 | 25.76 | Download | Download |
| Res2Net101_vd_26w_4s_ssld | 0.839 | 0.806 | 0.033 | 6.34 | 11.02 | 16.13 | 8.35 | 45.35 | Download | Download |
| Res2Net200_vd_26w_4s_ssld | 0.851 | 0.812 | 0.049 | 11.45 | 19.77 | 28.81 | 15.77 | 76.44 | Download | Download |
| HRNet_W18_C_ssld | 0.812 | 0.769 | 0.043 | 6.66 | 8.94 | 11.95 | 4.32 | 21.35 | Download | Download |
| HRNet_W48_C_ssld | 0.836 | 0.790 | 0.046 | 11.07 | 17.06 | 27.28 | 17.34 | 77.57 | Download | Download |
| SE_HRNet_W64_C_ssld | 0.848 | - | - | 17.11 | 26.87 | 43.24 | 29.00 | 129.12 | Download | Download |
| PPHGNet_tiny_ssld | 0.8195 | 0.7983 | 0.021 | 1.77 | - | - | 4.54 | 14.75 | Download | Download |
| PPHGNet_small_ssld | 0.8382 | 0.8151 | 0.023 | 2.52 | - | - | 8.53 | 24.38 | Download | Download |

2.2 Mobile distilled models

| Model | Top-1 Acc | Reference Top-1 Acc | Acc gain | SD855 time(ms) bs=1, thread=1 | SD855 time(ms) bs=1, thread=2 | SD855 time(ms) bs=1, thread=4 | FLOPs(M) | Params(M) | Model size(M) | Pretrained model | Inference model |
|---|---|---|---|---|---|---|---|---|---|---|---|
| MobileNetV1_ssld | 0.779 | 0.710 | 0.069 | 30.24 | 17.86 | 10.30 | 578.88 | 4.25 | 16 | Download | Download |
| MobileNetV2_ssld | 0.767 | 0.722 | 0.045 | 20.74 | 12.71 | 8.10 | 327.84 | 3.54 | 14 | Download | Download |
| MobileNetV3_small_x0_35_ssld | 0.556 | 0.530 | 0.026 | 2.23 | 1.66 | 1.43 | 14.56 | 1.67 | 6.9 | Download | Download |
| MobileNetV3_large_x1_0_ssld | 0.790 | 0.753 | 0.036 | 16.55 | 10.09 | 6.84 | 229.66 | 5.50 | 21 | Download | Download |
| MobileNetV3_small_x1_0_ssld | 0.713 | 0.682 | 0.031 | 5.63 | 3.65 | 2.60 | 63.67 | 2.95 | 12 | Download | Download |
| GhostNet_x1_3_ssld | 0.794 | 0.757 | 0.037 | 19.16 | 12.25 | 9.40 | 236.89 | 7.38 | 29 | Download | Download |

2.3 Intel CPU distilled models

| Model | Top-1 Acc | Reference Top-1 Acc | Acc gain | Intel-Xeon-Gold-6148 time(ms) bs=1 | FLOPs(M) | Params(M) | Pretrained model | Inference model |
|---|---|---|---|---|---|---|---|---|
| PPLCNet_x0_5_ssld | 0.661 | 0.631 | 0.030 | 2.05 | 47.28 | 1.89 | Download | Download |
| PPLCNet_x1_0_ssld | 0.744 | 0.713 | 0.033 | 2.46 | 160.81 | 2.96 | Download | Download |
| PPLCNet_x2_5_ssld | 0.808 | 0.766 | 0.042 | 5.39 | 906.49 | 9.04 | Download | Download |

  • Note: Reference Top-1 Acc is the accuracy of the corresponding pretrained model trained by PaddleClas on the ImageNet1k dataset.

3. CNN Models

3.1 Server-side models

PP-HGNet series

The accuracy and speed metrics of the PP-HGNet series models are shown in the table below. For more details about this series, see the PP-HGNet series documentation.

| Model | Top-1 Acc | Top-5 Acc | time(ms) bs=1 | time(ms) bs=4 | time(ms) bs=8 | FLOPs(G) | Params(M) | Pretrained model | Inference model |
|---|---|---|---|---|---|---|---|---|---|
| PPHGNet_tiny | 0.7983 | 0.9504 | 1.77 | - | - | 4.54 | 14.75 | Download | Download |
| PPHGNet_tiny_ssld | 0.8195 | 0.9612 | 1.77 | - | - | 4.54 | 14.75 | Download | Download |
| PPHGNet_small | 0.8151 | 0.9582 | 2.52 | - | - | 8.53 | 24.38 | Download | Download |
| PPHGNet_small_ssld | 0.8382 | 0.9681 | 2.52 | - | - | 8.53 | 24.38 | Download | Download |
| PPHGNet_base_ssld | 0.8500 | 0.9735 | 5.97 | - | - | 25.14 | 71.62 | Download | Download |

ResNet series [1]

The accuracy and speed metrics of the ResNet models and their Vd variants are shown in the table below. For more details about this series, see the ResNet series documentation.

| Model | Top-1 Acc | Top-5 Acc | time(ms) bs=1 | time(ms) bs=4 | time(ms) bs=8 | FLOPs(G) | Params(M) | Pretrained model | Inference model |
|---|---|---|---|---|---|---|---|---|---|
| ResNet18 | 0.7098 | 0.8992 | 1.22 | 2.19 | 3.63 | 1.83 | 11.70 | Download | Download |
| ResNet18_vd | 0.7226 | 0.9080 | 1.26 | 2.28 | 3.89 | 2.07 | 11.72 | Download | Download |
| ResNet34 | 0.7457 | 0.9214 | 1.97 | 3.25 | 5.70 | 3.68 | 21.81 | Download | Download |
| ResNet34_vd | 0.7598 | 0.9298 | 2.00 | 3.28 | 5.84 | 3.93 | 21.84 | Download | Download |
| ResNet34_vd_ssld | 0.7972 | 0.9490 | 2.00 | 3.28 | 5.84 | 3.93 | 21.84 | Download | Download |
| ResNet50 | 0.7650 | 0.9300 | 2.54 | 4.79 | 7.40 | 4.11 | 25.61 | Download | Download |
| ResNet50_vc | 0.7835 | 0.9403 | 2.57 | 4.83 | 7.52 | 4.35 | 25.63 | Download | Download |
| ResNet50_vd | 0.7912 | 0.9444 | 2.60 | 4.86 | 7.63 | 4.35 | 25.63 | Download | Download |
| ResNet101 | 0.7756 | 0.9364 | 4.37 | 8.18 | 12.38 | 7.83 | 44.65 | Download | Download |
| ResNet101_vd | 0.8017 | 0.9497 | 4.43 | 8.25 | 12.60 | 8.08 | 44.67 | Download | Download |
| ResNet152 | 0.7826 | 0.9396 | 6.05 | 11.41 | 17.33 | 11.56 | 60.34 | Download | Download |
| ResNet152_vd | 0.8059 | 0.9530 | 6.11 | 11.51 | 17.59 | 11.80 | 60.36 | Download | Download |
| ResNet200_vd | 0.8093 | 0.9533 | 7.70 | 14.57 | 22.16 | 15.30 | 74.93 | Download | Download |
| ResNet50_vd_ssld | 0.8300 | 0.9640 | 2.60 | 4.86 | 7.63 | 4.35 | 25.63 | Download | Download |
| ResNet101_vd_ssld | 0.8373 | 0.9669 | 4.43 | 8.25 | 12.60 | 8.08 | 44.67 | Download | Download |

ResNeXt series [7]

The accuracy and speed metrics of the ResNeXt series models are shown in the table below. For more details about this series, see the ResNeXt series documentation.

| Model | Top-1 Acc | Top-5 Acc | time(ms) bs=1 | time(ms) bs=4 | time(ms) bs=8 | FLOPs(G) | Params(M) | Pretrained model | Inference model |
|---|---|---|---|---|---|---|---|---|---|
| ResNeXt50_32x4d | 0.7775 | 0.9382 | 5.07 | 8.49 | 12.02 | 4.26 | 25.10 | Download | Download |
| ResNeXt50_vd_32x4d | 0.7956 | 0.9462 | 5.29 | 8.68 | 12.33 | 4.50 | 25.12 | Download | Download |
| ResNeXt50_64x4d | 0.7843 | 0.9413 | 9.39 | 13.97 | 20.56 | 8.02 | 45.29 | Download | Download |
| ResNeXt50_vd_64x4d | 0.8012 | 0.9486 | 9.75 | 14.14 | 20.84 | 8.26 | 45.31 | Download | Download |
| ResNeXt101_32x4d | 0.7865 | 0.9419 | 11.34 | 16.78 | 22.80 | 8.01 | 44.32 | Download | Download |
| ResNeXt101_vd_32x4d | 0.8033 | 0.9512 | 11.36 | 17.01 | 23.07 | 8.25 | 44.33 | Download | Download |
| ResNeXt101_64x4d | 0.7835 | 0.9452 | 21.57 | 28.08 | 39.49 | 15.52 | 83.66 | Download | Download |
| ResNeXt101_vd_64x4d | 0.8078 | 0.9520 | 21.57 | 28.22 | 39.70 | 15.76 | 83.68 | Download | Download |
| ResNeXt152_32x4d | 0.7898 | 0.9433 | 17.14 | 25.11 | 33.79 | 11.76 | 60.15 | Download | Download |
| ResNeXt152_vd_32x4d | 0.8072 | 0.9520 | 16.99 | 25.29 | 33.85 | 12.01 | 60.17 | Download | Download |
| ResNeXt152_64x4d | 0.7951 | 0.9471 | 33.07 | 42.05 | 59.13 | 23.03 | 115.27 | Download | Download |
| ResNeXt152_vd_64x4d | 0.8108 | 0.9534 | 33.30 | 42.41 | 59.42 | 23.27 | 115.29 | Download | Download |

Res2Net series [9]

The accuracy and speed metrics of the Res2Net series models are shown in the table below. For more details about this series, see the Res2Net series documentation.

| Model | Top-1 Acc | Top-5 Acc | time(ms) bs=1 | time(ms) bs=4 | time(ms) bs=8 | FLOPs(G) | Params(M) | Pretrained model | Inference model |
|---|---|---|---|---|---|---|---|---|---|
| Res2Net50_26w_4s | 0.7933 | 0.9457 | 3.52 | 6.23 | 9.30 | 4.28 | 25.76 | Download | Download |
| Res2Net50_vd_26w_4s | 0.7975 | 0.9491 | 3.59 | 6.35 | 9.50 | 4.52 | 25.78 | Download | Download |
| Res2Net50_14w_8s | 0.7946 | 0.9470 | 4.39 | 7.21 | 10.38 | 4.20 | 25.12 | Download | Download |
| Res2Net101_vd_26w_4s | 0.8064 | 0.9522 | 6.34 | 11.02 | 16.13 | 8.35 | 45.35 | Download | Download |
| Res2Net200_vd_26w_4s | 0.8121 | 0.9571 | 11.45 | 19.77 | 28.81 | 15.77 | 76.44 | Download | Download |
| Res2Net200_vd_26w_4s_ssld | 0.8513 | 0.9742 | 11.45 | 19.77 | 28.81 | 15.77 | 76.44 | Download | Download |

SENet series [8]

The accuracy and speed metrics of the SENet series models are shown in the table below. For more details about this series, see the SENet series documentation.

| Model | Top-1 Acc | Top-5 Acc | time(ms) bs=1 | time(ms) bs=4 | time(ms) bs=8 | FLOPs(G) | Params(M) | Pretrained model | Inference model |
|---|---|---|---|---|---|---|---|---|---|
| SE_ResNet18_vd | 0.7333 | 0.9138 | 1.48 | 2.70 | 4.32 | 2.07 | 11.81 | Download | Download |
| SE_ResNet34_vd | 0.7651 | 0.9320 | 2.42 | 3.69 | 6.29 | 3.93 | 22.00 | Download | Download |
| SE_ResNet50_vd | 0.7952 | 0.9475 | 3.11 | 5.99 | 9.34 | 4.36 | 28.16 | Download | Download |
| SE_ResNeXt50_32x4d | 0.7844 | 0.9396 | 6.39 | 11.01 | 14.94 | 4.27 | 27.63 | Download | Download |
| SE_ResNeXt50_vd_32x4d | 0.8024 | 0.9489 | 7.04 | 11.57 | 16.01 | 5.64 | 27.76 | Download | Download |
| SE_ResNeXt101_32x4d | 0.7939 | 0.9443 | 13.31 | 21.85 | 28.77 | 8.03 | 49.09 | Download | Download |
| SENet154_vd | 0.8140 | 0.9548 | 34.83 | 51.22 | 69.74 | 24.45 | 122.03 | Download | Download |

DPN series [14]

The accuracy and speed metrics of the DPN series models are shown in the table below. For more details about this series, see the DPN series documentation.

| Model | Top-1 Acc | Top-5 Acc | time(ms) bs=1 | time(ms) bs=4 | time(ms) bs=8 | FLOPs(G) | Params(M) | Pretrained model | Inference model |
|---|---|---|---|---|---|---|---|---|---|
| DPN68 | 0.7678 | 0.9343 | 8.18 | 11.40 | 14.82 | 2.35 | 12.68 | Download | Download |
| DPN92 | 0.7985 | 0.9480 | 12.48 | 20.04 | 25.10 | 6.54 | 37.79 | Download | Download |
| DPN98 | 0.8059 | 0.9510 | 14.70 | 25.55 | 35.12 | 11.728 | 61.74 | Download | Download |
| DPN107 | 0.8089 | 0.9532 | 19.46 | 35.62 | 50.22 | 18.38 | 87.13 | Download | Download |
| DPN131 | 0.8070 | 0.9514 | 19.64 | 34.60 | 47.42 | 16.09 | 79.48 | Download | Download |

DenseNet series [15]

The accuracy and speed metrics of the DenseNet series models are shown in the table below. For more details about this series, see the DenseNet series documentation.

| Model | Top-1 Acc | Top-5 Acc | time(ms) bs=1 | time(ms) bs=4 | time(ms) bs=8 | FLOPs(G) | Params(M) | Pretrained model | Inference model |
|---|---|---|---|---|---|---|---|---|---|
| DenseNet121 | 0.7566 | 0.9258 | 3.40 | 6.94 | 9.17 | 2.87 | 8.06 | Download | Download |
| DenseNet161 | 0.7857 | 0.9414 | 7.06 | 14.37 | 19.55 | 7.79 | 28.90 | Download | Download |
| DenseNet169 | 0.7681 | 0.9331 | 5.00 | 10.29 | 12.84 | 3.40 | 14.31 | Download | Download |
| DenseNet201 | 0.7763 | 0.9366 | 6.38 | 13.72 | 17.17 | 4.34 | 20.24 | Download | Download |
| DenseNet264 | 0.7796 | 0.9385 | 9.34 | 20.95 | 25.41 | 5.82 | 33.74 | Download | Download |

HRNet series [13]

The accuracy and speed metrics of the HRNet series models are shown in the table below. For more details about this series, see the HRNet series documentation.

| Model | Top-1 Acc | Top-5 Acc | time(ms) bs=1 | time(ms) bs=4 | time(ms) bs=8 | FLOPs(G) | Params(M) | Pretrained model | Inference model |
|---|---|---|---|---|---|---|---|---|---|
| HRNet_W18_C | 0.7692 | 0.9339 | 6.66 | 8.94 | 11.95 | 4.32 | 21.35 | Download | Download |
| HRNet_W18_C_ssld | 0.81162 | 0.95804 | 6.66 | 8.94 | 11.95 | 4.32 | 21.35 | Download | Download |
| HRNet_W30_C | 0.7804 | 0.9402 | 8.61 | 11.40 | 15.23 | 8.15 | 37.78 | Download | Download |
| HRNet_W32_C | 0.7828 | 0.9424 | 8.54 | 11.58 | 15.57 | 8.97 | 41.30 | Download | Download |
| HRNet_W40_C | 0.7877 | 0.9447 | 9.83 | 15.02 | 20.92 | 12.74 | 57.64 | Download | Download |
| HRNet_W44_C | 0.7900 | 0.9451 | 10.62 | 16.18 | 25.92 | 14.94 | 67.16 | Download | Download |
| HRNet_W48_C | 0.7895 | 0.9442 | 11.07 | 17.06 | 27.28 | 17.34 | 77.57 | Download | Download |
| HRNet_W48_C_ssld | 0.8363 | 0.9682 | 11.07 | 17.06 | 27.28 | 17.34 | 77.57 | Download | Download |
| HRNet_W64_C | 0.7930 | 0.9461 | 13.82 | 21.15 | 35.51 | 28.97 | 128.18 | Download | Download |
| SE_HRNet_W64_C_ssld | 0.8475 | 0.9726 | 17.11 | 26.87 | 43.24 | 29.00 | 129.12 | Download | Download |

Inception series [10][11][12][26]

The accuracy and speed metrics of the Inception series models are shown in the table below. For more details about this series, see the Inception series documentation.

| Model | Top-1 Acc | Top-5 Acc | time(ms) bs=1 | time(ms) bs=4 | time(ms) bs=8 | FLOPs(G) | Params(M) | Pretrained model | Inference model |
|---|---|---|---|---|---|---|---|---|---|
| GoogLeNet | 0.7070 | 0.8966 | 1.41 | 3.25 | 5.00 | 1.44 | 11.54 | Download | Download |
| Xception41 | 0.7930 | 0.9453 | 3.58 | 8.76 | 16.61 | 8.57 | 23.02 | Download | Download |
| Xception41_deeplab | 0.7955 | 0.9438 | 3.81 | 9.16 | 17.20 | 9.28 | 27.08 | Download | Download |
| Xception65 | 0.8100 | 0.9549 | 5.45 | 12.78 | 24.53 | 13.25 | 36.04 | Download | Download |
| Xception65_deeplab | 0.8032 | 0.9449 | 5.65 | 13.08 | 24.61 | 13.96 | 40.10 | Download | Download |
| Xception71 | 0.8111 | 0.9545 | 6.19 | 15.34 | 29.21 | 16.21 | 37.86 | Download | Download |
| InceptionV3 | 0.7914 | 0.9459 | 4.78 | 8.53 | 12.28 | 5.73 | 23.87 | Download | Download |
| InceptionV4 | 0.8077 | 0.9526 | 8.93 | 15.17 | 21.56 | 12.29 | 42.74 | Download | Download |

EfficientNet series [16]

The accuracy and speed metrics of the EfficientNet series models are shown in the table below. For more details about this series, see the EfficientNet series documentation.

| Model | Top-1 Acc | Top-5 Acc | time(ms) bs=1 | time(ms) bs=4 | time(ms) bs=8 | FLOPs(G) | Params(M) | Pretrained model | Inference model |
|---|---|---|---|---|---|---|---|---|---|
| EfficientNetB0 | 0.7738 | 0.9331 | 1.96 | 3.71 | 5.56 | 0.40 | 5.33 | Download | Download |
| EfficientNetB1 | 0.7915 | 0.9441 | 2.88 | 5.40 | 7.63 | 0.71 | 7.86 | Download | Download |
| EfficientNetB2 | 0.7985 | 0.9474 | 3.26 | 6.20 | 9.17 | 1.02 | 9.18 | Download | Download |
| EfficientNetB3 | 0.8115 | 0.9541 | 4.52 | 8.85 | 13.54 | 1.88 | 12.324 | Download | Download |
| EfficientNetB4 | 0.8285 | 0.9623 | 6.78 | 15.47 | 24.95 | 4.51 | 19.47 | Download | Download |
| EfficientNetB5 | 0.8362 | 0.9672 | 10.97 | 27.24 | 45.93 | 10.51 | 30.56 | Download | Download |
| EfficientNetB6 | 0.8400 | 0.9688 | 17.09 | 43.32 | 76.90 | 19.47 | 43.27 | Download | Download |
| EfficientNetB7 | 0.8430 | 0.9689 | 25.91 | 71.23 | 128.20 | 38.45 | 66.66 | Download | Download |
| EfficientNetB0_small | 0.7580 | 0.9258 | 1.24 | 2.59 | 3.92 | 0.40 | 4.69 | Download | Download |

ResNeXt101_wsl series [17]

The accuracy and speed metrics of the ResNeXt101_wsl series models are shown in the table below. For more details about this series, see the ResNeXt101_wsl series documentation.

| Model | Top-1 Acc | Top-5 Acc | time(ms) bs=1 | time(ms) bs=4 | time(ms) bs=8 | FLOPs(G) | Params(M) | Pretrained model | Inference model |
|---|---|---|---|---|---|---|---|---|---|
| ResNeXt101_32x8d_wsl | 0.8255 | 0.9674 | 13.55 | 23.39 | 36.18 | 16.48 | 88.99 | Download | Download |
| ResNeXt101_32x16d_wsl | 0.8424 | 0.9726 | 21.96 | 38.35 | 63.29 | 36.26 | 194.36 | Download | Download |
| ResNeXt101_32x32d_wsl | 0.8497 | 0.9759 | 37.28 | 76.50 | 121.56 | 87.28 | 469.12 | Download | Download |
| ResNeXt101_32x48d_wsl | 0.8537 | 0.9769 | 55.07 | 124.39 | 205.01 | 153.57 | 829.26 | Download | Download |
| Fix_ResNeXt101_32x48d_wsl | 0.8626 | 0.9797 | 55.01 | 122.63 | 204.66 | 313.41 | 829.26 | Download | Download |

ResNeSt series [24]

The accuracy and speed metrics of the ResNeSt series models are shown in the table below. For more details about this series, see the ResNeSt series documentation.

| Model | Top-1 Acc | Top-5 Acc | time(ms) bs=1 | time(ms) bs=4 | time(ms) bs=8 | FLOPs(G) | Params(M) | Pretrained model | Inference model |
|---|---|---|---|---|---|---|---|---|---|
| ResNeSt50_fast_1s1x64d | 0.8035 | 0.9528 | 2.73 | 5.33 | 8.24 | 4.36 | 26.27 | Download | Download |
| ResNeSt50 | 0.8083 | 0.9542 | 7.36 | 10.23 | 13.84 | 5.40 | 27.54 | Download | Download |

RegNet series [25]

The accuracy and speed metrics of the RegNet series models are shown in the table below. For more details about this series, see the RegNet series documentation.

| Model | Top-1 Acc | Top-5 Acc | time(ms) bs=1 | time(ms) bs=4 | time(ms) bs=8 | FLOPs(G) | Params(M) | Pretrained model | Inference model |
|---|---|---|---|---|---|---|---|---|---|
| RegNetX_4GF | 0.785 | 0.9416 | 6.46 | 8.48 | 11.45 | 4.00 | 22.23 | Download | Download |

RepVGG series [36]

The accuracy and speed metrics of the RepVGG series models are shown in the table below. For more details, see the RepVGG series documentation.

| Model | Top-1 Acc | Top-5 Acc | time(ms) bs=1 | time(ms) bs=4 | time(ms) bs=8 | FLOPs(G) | Params(M) | Pretrained model | Inference model |
|---|---|---|---|---|---|---|---|---|---|
| RepVGG_A0 | 0.7131 | 0.9016 | - | - | - | 1.36 | 8.31 | Download | Download |
| RepVGG_A1 | 0.7380 | 0.9146 | - | - | - | 2.37 | 12.79 | Download | Download |
| RepVGG_A2 | 0.7571 | 0.9264 | - | - | - | 5.12 | 25.50 | Download | Download |
| RepVGG_B0 | 0.7450 | 0.9213 | - | - | - | 3.06 | 14.34 | Download | Download |
| RepVGG_B1 | 0.7773 | 0.9385 | - | - | - | 11.82 | 51.83 | Download | Download |
| RepVGG_B2 | 0.7813 | 0.9410 | - | - | - | 18.38 | 80.32 | Download | Download |
| RepVGG_B1g2 | 0.7732 | 0.9359 | - | - | - | 8.82 | 41.36 | Download | Download |
| RepVGG_B1g4 | 0.7675 | 0.9335 | - | - | - | 7.31 | 36.13 | Download | Download |
| RepVGG_B2g4 | 0.7881 | 0.9448 | - | - | - | 11.34 | 55.78 | Download | Download |
| RepVGG_B3g4 | 0.7965 | 0.9485 | - | - | - | 16.07 | 75.63 | Download | Download |

MixNet series [29]

The accuracy and speed metrics of the MixNet series models are shown in the table below. For more details, see the MixNet series documentation.

| Model | Top-1 Acc | Top-5 Acc | time(ms) bs=1 | time(ms) bs=4 | time(ms) bs=8 | FLOPs(M) | Params(M) | Pretrained model | Inference model |
|---|---|---|---|---|---|---|---|---|---|
| MixNet_S | 0.7628 | 0.9299 | 2.31 | 3.63 | 5.20 | 252.977 | 4.167 | Download | Download |
| MixNet_M | 0.7767 | 0.9364 | 2.84 | 4.60 | 6.62 | 357.119 | 5.065 | Download | Download |
| MixNet_L | 0.7860 | 0.9437 | 3.16 | 5.55 | 8.03 | 579.017 | 7.384 | Download | Download |

ReXNet series [30]

The accuracy and speed metrics of the ReXNet series models are shown in the table below. For more details, see the ReXNet series documentation.

| Model | Top-1 Acc | Top-5 Acc | time(ms) bs=1 | time(ms) bs=4 | time(ms) bs=8 | FLOPs(G) | Params(M) | Pretrained model | Inference model |
|---|---|---|---|---|---|---|---|---|---|
| ReXNet_1_0 | 0.7746 | 0.9370 | 3.08 | 4.15 | 5.49 | 0.415 | 4.84 | Download | Download |
| ReXNet_1_3 | 0.7913 | 0.9464 | 3.54 | 4.87 | 6.54 | 0.68 | 7.61 | Download | Download |
| ReXNet_1_5 | 0.8006 | 0.9512 | 3.68 | 5.31 | 7.38 | 0.90 | 9.79 | Download | Download |
| ReXNet_2_0 | 0.8122 | 0.9536 | 4.30 | 6.54 | 9.19 | 1.56 | 16.45 | Download | Download |
| ReXNet_3_0 | 0.8209 | 0.9612 | 5.74 | 9.49 | 13.62 | 3.44 | 34.83 | Download | Download |

HarDNet series [37]

The accuracy and speed metrics of the HarDNet series models are shown in the table below. For more details, see the HarDNet series documentation.

| Model | Top-1 Acc | Top-5 Acc | time(ms) bs=1 | time(ms) bs=4 | time(ms) bs=8 | FLOPs(G) | Params(M) | Pretrained model | Inference model |
|---|---|---|---|---|---|---|---|---|---|
| HarDNet39_ds | 0.7133 | 0.8998 | 1.40 | 2.30 | 3.33 | 0.44 | 3.51 | Download | Download |
| HarDNet68_ds | 0.7362 | 0.9152 | 2.26 | 3.34 | 5.06 | 0.79 | 4.20 | Download | Download |
| HarDNet68 | 0.7546 | 0.9265 | 3.58 | 8.53 | 11.58 | 4.26 | 17.58 | Download | Download |
| HarDNet85 | 0.7744 | 0.9355 | 6.24 | 14.85 | 20.57 | 9.09 | 36.69 | Download | Download |

DLA series [38]

The accuracy and speed metrics of the DLA series models are shown in the table below. For more details, see the DLA series documentation.

| Model | Top-1 Acc | Top-5 Acc | time(ms) bs=1 | time(ms) bs=4 | time(ms) bs=8 | FLOPs(G) | Params(M) | Pretrained model | Inference model |
|---|---|---|---|---|---|---|---|---|---|
| DLA102 | 0.7893 | 0.9452 | 4.95 | 8.08 | 12.40 | 7.19 | 33.34 | Download | Download |
| DLA102x2 | 0.7885 | 0.9445 | 19.58 | 23.97 | 31.37 | 9.34 | 41.42 | Download | Download |
| DLA102x | 0.781 | 0.9400 | 11.12 | 15.60 | 20.37 | 5.89 | 26.40 | Download | Download |
| DLA169 | 0.7809 | 0.9409 | 7.70 | 12.25 | 18.90 | 11.59 | 53.50 | Download | Download |
| DLA34 | 0.7603 | 0.9298 | 1.83 | 3.37 | 5.98 | 3.07 | 15.76 | Download | Download |
| DLA46_c | 0.6321 | 0.853 | 1.06 | 2.08 | 3.23 | 0.54 | 1.31 | Download | Download |
| DLA60 | 0.7610 | 0.9292 | 2.78 | 5.36 | 8.29 | 4.26 | 22.08 | Download | Download |
| DLA60x_c | 0.6645 | 0.8754 | 1.79 | 3.68 | 5.19 | 0.59 | 1.33 | Download | Download |
| DLA60x | 0.7753 | 0.9378 | 5.98 | 9.24 | 12.52 | 3.54 | 17.41 | Download | Download |

RedNet series [39]

The accuracy and speed metrics of the RedNet series models are shown in the table below. For more details, see the RedNet series documentation.

| Model | Top-1 Acc | Top-5 Acc | time(ms) bs=1 | time(ms) bs=4 | time(ms) bs=8 | FLOPs(G) | Params(M) | Pretrained model | Inference model |
|---|---|---|---|---|---|---|---|---|---|
| RedNet26 | 0.7595 | 0.9319 | 4.45 | 15.16 | 29.03 | 1.69 | 9.26 | Download | Download |
| RedNet38 | 0.7747 | 0.9356 | 6.24 | 21.39 | 41.26 | 2.14 | 12.43 | Download | Download |
| RedNet50 | 0.7833 | 0.9417 | 8.04 | 27.71 | 53.73 | 2.61 | 15.60 | Download | Download |
| RedNet101 | 0.7894 | 0.9436 | 13.07 | 44.12 | 83.28 | 4.59 | 25.76 | Download | Download |
| RedNet152 | 0.7917 | 0.9440 | 18.66 | 63.27 | 119.48 | 6.57 | 34.14 | Download | Download |

ConvNeXt series [43]

The accuracy and speed metrics of the ConvNeXt series models are shown in the table below. For more details, see the ConvNeXt series documentation.

| Model | Top-1 Acc | Top-5 Acc | time(ms) bs=1 | time(ms) bs=4 | time(ms) bs=8 | FLOPs(G) | Params(M) | Pretrained model | Inference model |
|---|---|---|---|---|---|---|---|---|---|
| ConvNeXt_tiny | 0.8203 | 0.9590 | - | - | - | 4.458 | 28.583 | Download | Download |

VAN series [44]

The accuracy and speed metrics of the VAN series models are shown in the table below. For more details, see the VAN series documentation.

| Model | Top-1 Acc | Top-5 Acc | time(ms) bs=1 | time(ms) bs=4 | time(ms) bs=8 | FLOPs(G) | Params(M) | Pretrained model | Inference model |
|---|---|---|---|---|---|---|---|---|---|
| VAN_B0 | 0.7535 | 0.9299 | - | - | - | 0.880 | 4.110 | Download | Download |

PeleeNet series [45]

The accuracy and speed metrics of the PeleeNet series models are shown in the table below. For more details, see the PeleeNet series documentation.

| Model | Top-1 Acc | Top-5 Acc | time(ms) bs=1 | time(ms) bs=4 | time(ms) bs=8 | FLOPs(G) | Params(M) | Pretrained model | Inference model |
|---|---|---|---|---|---|---|---|---|---|
| PeleeNet | 0.7153 | 0.9040 | - | - | - | 0.514 | 2.812 | Download | Download |

CSPNet series [46]

The accuracy and speed metrics of the CSPNet series models are shown in the table below. For more details, see the CSPNet series documentation.

| Model | Top-1 Acc | Top-5 Acc | time(ms) bs=1 | time(ms) bs=4 | time(ms) bs=8 | FLOPs(G) | Params(M) | Pretrained model | Inference model |
|---|---|---|---|---|---|---|---|---|---|
| CSPDarkNet53 | 0.7725 | 0.9355 | - | - | - | 5.041 | 27.678 | Download | Download |

VGG series [20]

The accuracy and speed metrics of the VGG series models are shown in the table below. For more details, see the VGG series documentation.

| Model | Top-1 Acc | Top-5 Acc | time(ms) bs=1 | time(ms) bs=4 | time(ms) bs=8 | FLOPs(G) | Params(M) | Pretrained model | Inference model |
|---|---|---|---|---|---|---|---|---|---|
| VGG11 | 0.693 | 0.891 | 1.72 | 4.15 | 7.24 | 7.61 | 132.86 | Download | Download |
| VGG13 | 0.700 | 0.894 | 2.02 | 5.28 | 9.54 | 11.31 | 133.05 | Download | Download |
| VGG16 | 0.720 | 0.907 | 2.48 | 6.79 | 12.33 | 15.470 | 138.35 | Download | Download |
| VGG19 | 0.726 | 0.909 | 2.93 | 8.28 | 15.21 | 19.63 | 143.66 | Download | Download |

Other models

The accuracy and speed metrics of AlexNet [18], the SqueezeNet series [19], DarkNet53 [21], and other models are shown in the table below. For more details, see the other models documentation.

| Model | Top-1 Acc | Top-5 Acc | time(ms) bs=1 | time(ms) bs=4 | time(ms) bs=8 | FLOPs(G) | Params(M) | Pretrained model | Inference model |
|---|---|---|---|---|---|---|---|---|---|
| AlexNet | 0.567 | 0.792 | 0.81 | 1.50 | 2.33 | 0.71 | 61.10 | Download | Download |
| SqueezeNet1_0 | 0.596 | 0.817 | 0.68 | 1.64 | 2.62 | 0.78 | 1.25 | Download | Download |
| SqueezeNet1_1 | 0.601 | 0.819 | 0.62 | 1.30 | 2.09 | 0.35 | 1.24 | Download | Download |
| DarkNet53 | 0.780 | 0.941 | 2.79 | 6.42 | 10.89 | 9.31 | 41.65 | Download | Download |

3.2 Lightweight models

Mobile series [3][4][5][6][23]

The accuracy and speed metrics of the mobile series models are shown in the table below. For more details about these series, see the MobileNetV1, MobileNetV2, MobileNetV3, ShuffleNetV2, GhostNet, and ESNet series documentation.

| Model | Top-1 Acc | Top-5 Acc | SD855 time(ms) bs=1, thread=1 | SD855 time(ms) bs=1, thread=2 | SD855 time(ms) bs=1, thread=4 | FLOPs(M) | Params(M) | Model size(M) | Pretrained model | Inference model |
|---|---|---|---|---|---|---|---|---|---|---|
| MobileNetV1_x0_25 | 0.5143 | 0.7546 | 2.88 | 1.82 | 1.26 | 43.56 | 0.48 | 1.9 | Download | Download |
| MobileNetV1_x0_5 | 0.6352 | 0.8473 | 8.74 | 5.26 | 3.09 | 154.57 | 1.34 | 5.2 | Download | Download |
| MobileNetV1_x0_75 | 0.6881 | 0.8823 | 17.84 | 10.61 | 6.21 | 333.00 | 2.60 | 10 | Download | Download |
| MobileNetV1 | 0.7099 | 0.8968 | 30.24 | 17.86 | 10.30 | 578.88 | 4.25 | 16 | Download | Download |
| MobileNetV1_ssld | 0.7789 | 0.9394 | 30.24 | 17.86 | 10.30 | 578.88 | 4.25 | 16 | Download | Download |
| MobileNetV2_x0_25 | 0.5321 | 0.7652 | 3.46 | 2.51 | 2.03 | 34.18 | 1.53 | 6.1 | Download | Download |
| MobileNetV2_x0_5 | 0.6503 | 0.8572 | 7.69 | 4.92 | 3.57 | 99.48 | 1.98 | 7.8 | Download | Download |
| MobileNetV2_x0_75 | 0.6983 | 0.8901 | 13.69 | 8.60 | 5.82 | 197.37 | 2.65 | 10 | Download | Download |
| MobileNetV2 | 0.7215 | 0.9065 | 20.74 | 12.71 | 8.10 | 327.84 | 3.54 | 14 | Download | Download |
| MobileNetV2_x1_5 | 0.7412 | 0.9167 | 40.79 | 24.49 | 15.50 | 702.35 | 6.90 | 26 | Download | Download |
| MobileNetV2_x2_0 | 0.7523 | 0.9258 | 67.50 | 40.03 | 25.55 | 1217.25 | 11.33 | 43 | Download | Download |
| MobileNetV2_ssld | 0.7674 | 0.9339 | 20.74 | 12.71 | 8.10 | 327.84 | 3.54 | 14 | Download | Download |
| MobileNetV3_large_x1_25 | 0.7641 | 0.9295 | 24.52 | 14.76 | 9.89 | 362.70 | 7.47 | 29 | Download | Download |
| MobileNetV3_large_x1_0 | 0.7532 | 0.9231 | 16.55 | 10.09 | 6.84 | 229.66 | 5.50 | 21 | Download | Download |
| MobileNetV3_large_x0_75 | 0.7314 | 0.9108 | 11.53 | 7.06 | 4.94 | 151.70 | 3.93 | 16 | Download | Download |
| MobileNetV3_large_x0_5 | 0.6924 | 0.8852 | 6.50 | 4.22 | 3.15 | 71.83 | 2.69 | 11 | Download | Download |
| MobileNetV3_large_x0_35 | 0.6432 | 0.8546 | 4.43 | 3.11 | 2.41 | 40.90 | 2.11 | 8.6 | Download | Download |
| MobileNetV3_small_x1_25 | 0.7067 | 0.8951 | 7.88 | 4.91 | 3.45 | 100.07 | 3.64 | 14 | Download | Download |
| MobileNetV3_small_x1_0 | 0.6824 | 0.8806 | 5.63 | 3.65 | 2.60 | 63.67 | 2.95 | 12 | Download | Download |
| MobileNetV3_small_x0_75 | 0.6602 | 0.8633 | 4.50 | 2.96 | 2.19 | 46.02 | 2.38 | 9.6 | Download | Download |
| MobileNetV3_small_x0_5 | 0.5921 | 0.8152 | 2.89 | 2.04 | 1.62 | 22.60 | 1.91 | 7.8 | Download | Download |
| MobileNetV3_small_x0_35 | 0.5303 | 0.7637 | 2.23 | 1.66 | 1.43 | 14.56 | 1.67 | 6.9 | Download | Download |
| MobileNetV3_small_x0_35_ssld | 0.5555 | 0.7771 | 2.23 | 1.66 | 1.43 | 14.56 | 1.67 | 6.9 | Download | Download |
| MobileNetV3_large_x1_0_ssld | 0.7896 | 0.9448 | 16.55 | 10.09 | 6.84 | 229.66 | 5.50 | 21 | Download | Download |
| MobileNetV3_small_x1_0_ssld | 0.7129 | 0.9010 | 5.63 | 3.65 | 2.60 | 63.67 | 2.95 | 12 | Download | Download |
| ShuffleNetV2 | 0.6880 | 0.8845 | 9.72 | 5.97 | 4.13 | 148.86 | 2.29 | 9 | Download | Download |
| ShuffleNetV2_x0_25 | 0.4990 | 0.7379 | 1.94 | 1.53 | 1.43 | 18.95 | 0.61 | 2.7 | Download | Download |
| ShuffleNetV2_x0_33 | 0.5373 | 0.7705 | 2.23 | 1.70 | 1.79 | 24.04 | 0.65 | 2.8 | Download | Download |
| ShuffleNetV2_x0_5 | 0.6032 | 0.8226 | 3.67 | 2.63 | 2.06 | 42.58 | 1.37 | 5.6 | Download | Download |
| ShuffleNetV2_x1_5 | 0.7163 | 0.9015 | 17.21 | 10.56 | 6.81 | 301.35 | 3.53 | 14 | Download | Download |
| ShuffleNetV2_x2_0 | 0.7315 | 0.9120 | 31.21 | 18.98 | 11.65 | 571.70 | 7.40 | 28 | Download | Download |
| ShuffleNetV2_swish | 0.7003 | 0.8917 | 31.21 | 9.06 | 5.74 | 148.86 | 2.29 | 9.1 | Download | Download |
| GhostNet_x0_5 | 0.6688 | 0.8695 | 5.28 | 3.95 | 3.29 | 46.15 | 2.60 | 10 | Download | Download |
| GhostNet_x1_0 | 0.7402 | 0.9165 | 12.89 | 8.66 | 6.72 | 148.78 | 5.21 | 20 | Download | Download |
| GhostNet_x1_3 | 0.7579 | 0.9254 | 19.16 | 12.25 | 9.40 | 236.89 | 7.38 | 29 | Download | Download |
| GhostNet_x1_3_ssld | 0.7938 | 0.9449 | 19.16 | 12.25 | 9.40 | 236.89 | 7.38 | 29 | Download | Download |
| ESNet_x0_25 | 0.6248 | 0.8346 | 4.12 | 2.97 | 2.51 | 30.85 | 2.83 | 11 | Download | Download |
| ESNet_x0_5 | 0.6882 | 0.8804 | 6.45 | 4.42 | 3.35 | 67.31 | 3.25 | 13 | Download | Download |
| ESNet_x0_75 | 0.7224 | 0.9045 | 9.59 | 6.28 | 4.52 | 123.74 | 3.87 | 15 | Download | Download |
| ESNet_x1_0 | 0.7392 | 0.9140 | 13.67 | 8.71 | 5.97 | 197.33 | 4.64 | 18 | Download | Download |

PP-LCNet & PP-LCNetV2 series [28]

The accuracy and speed metrics of the PP-LCNet series models are shown in the tables below. For more details about these series, see the PP-LCNet series documentation and the PP-LCNetV2 series documentation.

| Model | Top-1 Acc | Top-5 Acc | time(ms)* bs=1 | FLOPs(M) | Params(M) | Pretrained model | Inference model |
|---|---|---|---|---|---|---|---|
| PPLCNet_x0_25 | 0.5186 | 0.7565 | 1.74 | 18.25 | 1.52 | Download | Download |
| PPLCNet_x0_35 | 0.5809 | 0.8083 | 1.92 | 29.46 | 1.65 | Download | Download |
| PPLCNet_x0_5 | 0.6314 | 0.8466 | 2.05 | 47.28 | 1.89 | Download | Download |
| PPLCNet_x0_75 | 0.6818 | 0.8830 | 2.29 | 98.82 | 2.37 | Download | Download |
| PPLCNet_x1_0 | 0.7132 | 0.9003 | 2.46 | 160.81 | 2.96 | Download | Download |
| PPLCNet_x1_5 | 0.7371 | 0.9153 | 3.19 | 341.86 | 4.52 | Download | Download |
| PPLCNet_x2_0 | 0.7518 | 0.9227 | 4.27 | 590 | 6.54 | Download | Download |
| PPLCNet_x2_5 | 0.7660 | 0.9300 | 5.39 | 906 | 9.04 | Download | Download |

| Model | Top-1 Acc | Top-5 Acc | time(ms)** bs=1 | FLOPs(M) | Params(M) | Pretrained model | Inference model |
|---|---|---|---|---|---|---|---|
| PPLCNetV2_base | 77.04 | 93.27 | 4.32 | 604 | 6.6 | Download | Download |

*: measured on an Intel-Xeon-Gold-6148 hardware platform with the PaddlePaddle inference engine.

**: measured on an Intel-Xeon-Gold-6271C hardware platform with the OpenVINO 2021.4.2 inference engine.

4. Transformer Models

4.1 Server-side models

ViT series [31]

The accuracy and speed metrics of the ViT (Vision Transformer) series models are shown in the table below. For more details about this series, see the ViT series documentation.

| Model | Top-1 Acc | Top-5 Acc | time(ms) bs=1 | time(ms) bs=4 | time(ms) bs=8 | FLOPs(G) | Params(M) | Pretrained model | Inference model |
|---|---|---|---|---|---|---|---|---|---|
| ViT_small_patch16_224 | 0.7553 | 0.9211 | 3.71 | 9.05 | 16.72 | 9.41 | 48.60 | Download | Download |
| ViT_base_patch16_224 | 0.8187 | 0.9618 | 6.12 | 14.84 | 28.51 | 16.85 | 86.42 | Download | Download |
| ViT_base_patch16_384 | 0.8414 | 0.9717 | 14.15 | 48.38 | 95.06 | 49.35 | 86.42 | Download | Download |
| ViT_base_patch32_384 | 0.8176 | 0.9613 | 4.94 | 13.43 | 24.08 | 12.66 | 88.19 | Download | Download |
| ViT_large_patch16_224 | 0.8303 | 0.9655 | 15.53 | 49.50 | 94.09 | 59.65 | 304.12 | Download | Download |
| ViT_large_patch16_384 | 0.8513 | 0.9736 | 39.51 | 152.46 | 304.06 | 174.70 | 304.12 | Download | Download |
| ViT_large_patch32_384 | 0.8153 | 0.9608 | 11.44 | 36.09 | 70.63 | 44.24 | 306.48 | Download | Download |

DeiT series [32]

The accuracy and speed metrics of the DeiT (Data-efficient Image Transformers) series models are shown in the table below. For more details about this series, see the DeiT series documentation.

| Model | Top-1 Acc | Top-5 Acc | time(ms) bs=1 | time(ms) bs=4 | time(ms) bs=8 | FLOPs(G) | Params(M) | Pretrained model | Inference model |
|---|---|---|---|---|---|---|---|---|---|
| DeiT_tiny_patch16_224 | 0.7208 | 0.9112 | 3.61 | 3.94 | 6.10 | 1.07 | 5.68 | Download | Download |
| DeiT_small_patch16_224 | 0.7982 | 0.9495 | 3.61 | 6.24 | 10.49 | 4.24 | 21.97 | Download | Download |
| DeiT_base_patch16_224 | 0.8180 | 0.9558 | 6.13 | 14.87 | 28.50 | 16.85 | 86.42 | Download | Download |
| DeiT_base_patch16_384 | 0.8289 | 0.9624 | 14.12 | 48.80 | 97.60 | 49.35 | 86.42 | Download | Download |
| DeiT_tiny_distilled_patch16_224 | 0.7449 | 0.9192 | 3.51 | 4.05 | 6.03 | 1.08 | 5.87 | Download | Download |
| DeiT_small_distilled_patch16_224 | 0.8117 | 0.9538 | 3.70 | 6.20 | 10.53 | 4.26 | 22.36 | Download | Download |
| DeiT_base_distilled_patch16_224 | 0.8330 | 0.9647 | 6.17 | 14.94 | 28.58 | 16.93 | 87.18 | Download | Download |
| DeiT_base_distilled_patch16_384 | 0.8520 | 0.9720 | 14.12 | 48.76 | 97.09 | 49.43 | 87.18 | Download | Download |

SwinTransformer series [27][47]

The accuracy and speed metrics of the SwinTransformer series models are shown in the table below. For more details, see the SwinTransformer series documentation.

| Model | Top-1 Acc | Top-5 Acc | time(ms) bs=1 | time(ms) bs=4 | time(ms) bs=8 | FLOPs(G) | Params(M) | Pretrained model | Inference model |
|---|---|---|---|---|---|---|---|---|---|
| SwinTransformer_tiny_patch4_window7_224 | 0.8110 | 0.9549 | 6.59 | 9.68 | 16.32 | 4.35 | 28.26 | Download | Download |
| SwinTransformer_small_patch4_window7_224 | 0.8321 | 0.9622 | 12.54 | 17.07 | 28.08 | 8.51 | 49.56 | Download | Download |
| SwinTransformer_base_patch4_window7_224 | 0.8337 | 0.9643 | 13.37 | 23.53 | 39.11 | 15.13 | 87.70 | Download | Download |
| SwinTransformer_base_patch4_window12_384 | 0.8417 | 0.9674 | 19.52 | 64.56 | 123.30 | 44.45 | 87.70 | Download | Download |
| SwinTransformer_base_patch4_window7_224 [1] | 0.8516 | 0.9748 | 13.53 | 23.46 | 39.13 | 15.13 | 87.70 | Download | Download |
| SwinTransformer_base_patch4_window12_384 [1] | 0.8634 | 0.9798 | 19.65 | 64.72 | 123.42 | 44.45 | 87.70 | Download | Download |
| SwinTransformer_large_patch4_window7_224 [1] | 0.8619 | 0.9788 | 15.74 | 38.57 | 71.49 | 34.02 | 196.43 | Download | Download |
| SwinTransformer_large_patch4_window12_384 [1] | 0.8706 | 0.9814 | 32.61 | 116.59 | 223.23 | 99.97 | 196.43 | Download | Download |

[1]: pretrained on the ImageNet22k dataset, then transferred to the ImageNet1k dataset via fine-tuning.

The accuracy and speed metrics of the SwinTransformerV2 series models are shown in the table below. For more details, see the SwinTransformerV2 series documentation.

| Model | Top-1 Acc | Top-5 Acc | time(ms) bs=1 | time(ms) bs=4 | time(ms) bs=8 | FLOPs(G) | Params(M) | Pretrained model | Inference model |
|---|---|---|---|---|---|---|---|---|---|
| SwinTransformerV2_tiny_patch4_window8_256 | 0.8177 | 0.9588 | - | - | - | 4.3 | 21.9 | Download | Download |
| SwinTransformerV2_tiny_patch4_window16_256 | 0.8283 | 0.9623 | - | - | - | 4.4 | 21.9 | Download | Download |
| SwinTransformerV2_small_patch4_window8_256 | 0.8373 | 0.9662 | - | - | - | 8.4 | 37.9 | Download | Download |
| SwinTransformerV2_small_patch4_window16_256 | 0.8414 | 0.9681 | - | - | - | 8.5 | 37.9 | Download | Download |
| SwinTransformerV2_base_patch4_window8_256 | 0.8419 | 0.9687 | - | - | - | 15.0 | 67.0 | Download | Download |
| SwinTransformerV2_base_patch4_window16_256 | 0.8458 | 0.9706 | - | - | - | 15.1 | 67.0 | Download | Download |
| SwinTransformerV2_base_patch4_window24_384 [1] | 0.8714 | 0.9824 | - | - | - | 34.0 | 67.0 | Download | Download |
| SwinTransformerV2_large_patch4_window16_256 [1] | 0.8689 | 0.9804 | - | - | - | 33.8 | 149.6 | Download | Download |
| SwinTransformerV2_large_patch4_window24_384 [1] | 0.8747 | 0.9827 | - | - | - | 76.1 | 149.6 | Download | Download |

[1]: pretrained on the ImageNet22k dataset, then transferred to the ImageNet1k dataset via fine-tuning.

Twins series [34]

The accuracy and speed metrics of the Twins series models are shown in the table below. For more details, see the Twins series documentation.

| Model | Top-1 Acc | Top-5 Acc | time(ms) bs=1 | time(ms) bs=4 | time(ms) bs=8 | FLOPs(G) | Params(M) | Pretrained model | Inference model |
|---|---|---|---|---|---|---|---|---|---|
| pcpvt_small | 0.8115 | 0.9567 | 7.32 | 10.51 | 15.27 | 3.67 | 24.06 | Download | Download |
| pcpvt_base | 0.8268 | 0.9627 | 12.20 | 16.22 | 23.16 | 6.44 | 43.83 | Download | Download |
| pcpvt_large | 0.8306 | 0.9659 | 16.47 | 22.90 | 32.73 | 9.50 | 60.99 | Download | Download |
| alt_gvt_small | 0.8177 | 0.9557 | 6.94 | 9.01 | 12.27 | 2.81 | 24.06 | Download | Download |
| alt_gvt_base | 0.8315 | 0.9629 | 9.37 | 15.02 | 24.54 | 8.34 | 56.07 | Download | Download |
| alt_gvt_large | 0.8364 | 0.9651 | 11.76 | 22.08 | 35.12 | 14.81 | 99.27 | Download | Download |

Note: the accuracy differences from the Reference values stem from differences in data preprocessing.

CSWinTransformer series [40]

The accuracy and speed metrics of the CSWinTransformer series models are shown in the table below. For more details, see the CSWinTransformer series documentation.

| Model | Top-1 Acc | Top-5 Acc | time(ms) bs=1 | time(ms) bs=4 | time(ms) bs=8 | FLOPs(G) | Params(M) | Pretrained model | Inference model |
|---|---|---|---|---|---|---|---|---|---|
| CSWinTransformer_tiny_224 | 0.8281 | 0.9628 | - | - | - | 4.1 | 22 | Download | Download |
| CSWinTransformer_small_224 | 0.8358 | 0.9658 | - | - | - | 6.4 | 35 | Download | Download |
| CSWinTransformer_base_224 | 0.8420 | 0.9692 | - | - | - | 14.3 | 77 | Download | Download |
| CSWinTransformer_large_224 | 0.8643 | 0.9799 | - | - | - | 32.2 | 173.3 | Download | Download |
| CSWinTransformer_base_384 | 0.8550 | 0.9749 | - | - | - | 42.2 | 77 | Download | Download |
| CSWinTransformer_large_384 | 0.8748 | 0.9833 | - | - | - | 94.7 | 173.3 | Download | Download |

PVTV2 series [41]

The accuracy and speed metrics of the PVTV2 series models are shown in the table below. For more details, see the PVTV2 series documentation.

| Model | Top-1 Acc | Top-5 Acc | time(ms) bs=1 | time(ms) bs=4 | time(ms) bs=8 | FLOPs(G) | Params(M) | Pretrained model | Inference model |
|---|---|---|---|---|---|---|---|---|---|
| PVT_V2_B0 | 0.7052 | 0.9016 | - | - | - | 0.53 | 3.7 | Download | Download |
| PVT_V2_B1 | 0.7869 | 0.9450 | - | - | - | 2.0 | 14.0 | Download | Download |
| PVT_V2_B2 | 0.8206 | 0.9599 | - | - | - | 3.9 | 25.4 | Download | Download |
| PVT_V2_B2_Linear | 0.8205 | 0.9605 | - | - | - | 3.8 | 22.6 | Download | Download |
| PVT_V2_B3 | 0.8310 | 0.9648 | - | - | - | 6.7 | 45.2 | Download | Download |
| PVT_V2_B4 | 0.8361 | 0.9666 | - | - | - | 9.8 | 62.6 | Download | Download |
| PVT_V2_B5 | 0.8374 | 0.9662 | - | - | - | 11.4 | 82.0 | Download | Download |

LeViT series [33]

The accuracy and speed metrics of the LeViT series models are shown in the table below. For more details, see the LeViT series documentation.

| Model | Top-1 Acc | Top-5 Acc | time(ms) bs=1 | time(ms) bs=4 | time(ms) bs=8 | FLOPs(M) | Params(M) | Pretrained model | Inference model |
|---|---|---|---|---|---|---|---|---|---|
| LeViT_128S | 0.7598 | 0.9269 | - | - | - | 281 | 7.42 | Download | Download |
| LeViT_128 | 0.7810 | 0.9372 | - | - | - | 365 | 8.87 | Download | Download |
| LeViT_192 | 0.7934 | 0.9446 | - | - | - | 597 | 10.61 | Download | Download |
| LeViT_256 | 0.8085 | 0.9497 | - | - | - | 1049 | 18.45 | Download | Download |
| LeViT_384 | 0.8191 | 0.9551 | - | - | - | 2234 | 38.45 | Download | Download |

Note: the accuracy differences from the Reference values stem from differences in data preprocessing and from not using the distillation head as output.

TNT series [35]

The accuracy and speed metrics of the TNT series models are shown in the table below. For more details, see the TNT series documentation.

| Model | Top-1 Acc | Top-5 Acc | time(ms) bs=1 | time(ms) bs=4 | FLOPs(G) | Params(M) | Pretrained model | Inference model |
|---|---|---|---|---|---|---|---|---|
| TNT_small | 0.8121 | 0.9563 | - | - | 4.83 | 23.68 | Download | Download |

Note: in the data preprocessing of the TNT models, both the mean and the std of NormalizeImage are 0.5.

4.2 Lightweight models

MobileViT series [42]

The accuracy and speed metrics of the MobileViT series models are shown in the table below. For more details, see the MobileViT series documentation.

| Model | Top-1 Acc | Top-5 Acc | time(ms) bs=1 | time(ms) bs=4 | time(ms) bs=8 | FLOPs(M) | Params(M) | Pretrained model | Inference model |
|---|---|---|---|---|---|---|---|---|---|
| MobileViT_XXS | 0.6867 | 0.8878 | - | - | - | 337.24 | 1.28 | Download | Download |
| MobileViT_XS | 0.7454 | 0.9227 | - | - | - | 930.75 | 2.33 | Download | Download |
| MobileViT_S | 0.7814 | 0.9413 | - | - | - | 1849.35 | 5.59 | Download | Download |

5. References

[1] He K, Zhang X, Ren S, et al. Deep residual learning for image recognition[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2016: 770-778.

[2] He T, Zhang Z, Zhang H, et al. Bag of tricks for image classification with convolutional neural networks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2019: 558-567.

[3] Howard A, Sandler M, Chu G, et al. Searching for mobilenetv3[C]//Proceedings of the IEEE International Conference on Computer Vision. 2019: 1314-1324.

[4] Sandler M, Howard A, Zhu M, et al. Mobilenetv2: Inverted residuals and linear bottlenecks[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2018: 4510-4520.

[5] Howard A G, Zhu M, Chen B, et al. Mobilenets: Efficient convolutional neural networks for mobile vision applications[J]. arXiv preprint arXiv:1704.04861, 2017.

[6] Ma N, Zhang X, Zheng H T, et al. Shufflenet v2: Practical guidelines for efficient cnn architecture design[C]//Proceedings of the European Conference on Computer Vision (ECCV). 2018: 116-131.

[7] Xie S, Girshick R, Dollár P, et al. Aggregated residual transformations for deep neural networks[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2017: 1492-1500.

[8] Hu J, Shen L, Sun G. Squeeze-and-excitation networks[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2018: 7132-7141.

[9] Gao S, Cheng M M, Zhao K, et al. Res2net: A new multi-scale backbone architecture[J]. IEEE transactions on pattern analysis and machine intelligence, 2019.

[10] Szegedy C, Liu W, Jia Y, et al. Going deeper with convolutions[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2015: 1-9.

[11] Szegedy C, Ioffe S, Vanhoucke V, et al. Inception-v4, inception-resnet and the impact of residual connections on learning[C]//Thirty-first AAAI conference on artificial intelligence. 2017.

[12] Chollet F. Xception: Deep learning with depthwise separable convolutions[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2017: 1251-1258.

[13] Wang J, Sun K, Cheng T, et al. Deep high-resolution representation learning for visual recognition[J]. arXiv preprint arXiv:1908.07919, 2019.

[14] Chen Y, Li J, Xiao H, et al. Dual path networks[C]//Advances in neural information processing systems. 2017: 4467-4475.

[15] Huang G, Liu Z, Van Der Maaten L, et al. Densely connected convolutional networks[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2017: 4700-4708.

[16] Tan M, Le Q V. Efficientnet: Rethinking model scaling for convolutional neural networks[J]. arXiv preprint arXiv:1905.11946, 2019.

[17] Mahajan D, Girshick R, Ramanathan V, et al. Exploring the limits of weakly supervised pretraining[C]//Proceedings of the European Conference on Computer Vision (ECCV). 2018: 181-196.

[18] Krizhevsky A, Sutskever I, Hinton G E. Imagenet classification with deep convolutional neural networks[C]//Advances in neural information processing systems. 2012: 1097-1105.

[19] Iandola F N, Han S, Moskewicz M W, et al. SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and< 0.5 MB model size[J]. arXiv preprint arXiv:1602.07360, 2016.

[20] Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition[J]. arXiv preprint arXiv:1409.1556, 2014.

[21] Redmon J, Divvala S, Girshick R, et al. You only look once: Unified, real-time object detection[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2016: 779-788.

[22] Ding X, Guo Y, Ding G, et al. Acnet: Strengthening the kernel skeletons for powerful cnn via asymmetric convolution blocks[C]//Proceedings of the IEEE International Conference on Computer Vision. 2019: 1911-1920.

[23] Han K, Wang Y, Tian Q, et al. GhostNet: More features from cheap operations[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020: 1580-1589.

[24] Zhang H, Wu C, Zhang Z, et al. Resnest: Split-attention networks[J]. arXiv preprint arXiv:2004.08955, 2020.

[25] Radosavovic I, Kosaraju R P, Girshick R, et al. Designing network design spaces[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020: 10428-10436.

[26] C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, and Z. Wojna. Rethinking the inception architecture for computer vision. arXiv preprint arXiv:1512.00567, 2015.

[27] Ze Liu, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin and Baining Guo. Swin Transformer: Hierarchical Vision Transformer using Shifted Windows.

[28] Cheng Cui, Tingquan Gao, Shengyu Wei, Yuning Du, Ruoyu Guo, Shuilong Dong, Bin Lu, Ying Zhou, Xueying Lv, Qiwen Liu, Xiaoguang Hu, Dianhai Yu, Yanjun Ma. PP-LCNet: A Lightweight CPU Convolutional Neural Network.

[29] Mingxing Tan, Quoc V. Le. MixConv: Mixed Depthwise Convolutional Kernels.

[30] Dongyoon Han, Sangdoo Yun, Byeongho Heo, YoungJoon Yoo. Rethinking Channel Dimensions for Efficient Model Design.

[31] Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, Neil Houlsby. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale.

[32] Hugo Touvron, Matthieu Cord, Matthijs Douze, Francisco Massa, Alexandre Sablayrolles, Herve Jegou. Training data-efficient image transformers & distillation through attention.

[33] Benjamin Graham, Alaaeldin El-Nouby, Hugo Touvron, Pierre Stock, Armand Joulin, Herve Jegou, Matthijs Douze. LeViT: a Vision Transformer in ConvNet’s Clothing for Faster Inference.

[34] Xiangxiang Chu, Zhi Tian, Yuqing Wang, Bo Zhang, Haibing Ren, Xiaolin Wei, Huaxia Xia, Chunhua Shen. Twins: Revisiting the Design of Spatial Attention in Vision Transformers.

[35] Kai Han, An Xiao, Enhua Wu, Jianyuan Guo, Chunjing Xu, Yunhe Wang. Transformer in Transformer.

[36] Xiaohan Ding, Xiangyu Zhang, Ningning Ma, Jungong Han, Guiguang Ding, Jian Sun. RepVGG: Making VGG-style ConvNets Great Again.

[37] Ping Chao, Chao-Yang Kao, Yu-Shan Ruan, Chien-Hsiang Huang, Youn-Long Lin. HarDNet: A Low Memory Traffic Network.

[38] Fisher Yu, Dequan Wang, Evan Shelhamer, Trevor Darrell. Deep Layer Aggregation.

[39] Duo Li, Jie Hu, Changhu Wang, Xiangtai Li, Qi She, Lei Zhu, Tong Zhang, Qifeng Chen. Involution: Inverting the Inherence of Convolution for Visual Recognition.

[40] Xiaoyi Dong, Jianmin Bao, Dongdong Chen, Weiming Zhang, Nenghai Yu, Lu Yuan, Dong Chen, Baining Guo. CSWin Transformer: A General Vision Transformer Backbone with Cross-Shaped Windows.

[41] Wenhai Wang, Enze Xie, Xiang Li, Deng-Ping Fan, Kaitao Song, Ding Liang, Tong Lu, Ping Luo, Ling Shao. PVTv2: Improved Baselines with Pyramid Vision Transformer.

[42] Sachin Mehta, Mohammad Rastegari. MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer.

[43] Zhuang Liu, Hanzi Mao, Chao-Yuan Wu, Christoph Feichtenhofer, Trevor Darrell, Saining Xie. A ConvNet for the 2020s.

[44] Meng-Hao Guo, Cheng-Ze Lu, Zheng-Ning Liu, Ming-Ming Cheng, Shi-Min Hu. Visual Attention Network.

[45] Robert J. Wang, Xiang Li, Charles X. Ling. Pelee: A Real-Time Object Detection System on Mobile Devices.

[46] Chien-Yao Wang, Hong-Yuan Mark Liao, I-Hau Yeh, Yueh-Hua Wu, Ping-Yang Chen, Jun-Wei Hsieh. CSPNet: A New Backbone that can Enhance Learning Capability of CNN.

[47] Ze Liu, Han Hu, Yutong Lin, Zhuliang Yao, Zhenda Xie, Yixuan Wei, Jia Ning, Yue Cao, Zheng Zhang, Li Dong, Furu Wei, Baining Guo. Swin Transformer V2: Scaling Up Capacity and Resolution.