Pre-trained Models & Evaluation & Fine-tuning

May 14, 2024

Here we provide the pre-trained models and the evaluation/fine-tuning instructions.

ImageNet-1K trained models

These models are also available at Tsinghua Cloud.

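| Model | #Param | #FLOPs | Acc@1 | Training Speedup | #Equivalent Epochs | Link |
| --- | --- | --- | --- | --- | --- | --- |
| ResNet-50 | 26M | 4.1G | 79.7% | ~1.5x | 200 | Google Drive |
| ConvNeXt-Tiny | 29M | 4.5G | 82.2% | ~1.5x | 200 | Google Drive |
| ConvNeXt-Small | 50M | 8.7G | 83.2% | ~1.5x | 200 | Google Drive |
| ConvNeXt-Base | 89M | 15.4G | 83.8% | ~1.5x | 200 | Google Drive |
| DeiT-Tiny | 5M | 1.3G | 72.5% | ~3.0x | 100 | Google Drive |
| | | | 73.4% | ~2.0x | 150 | Google Drive |
| | | | 73.8% | ~1.5x | 200 | Google Drive |
| | | | 74.4% | ~1.0x | 300 | Google Drive |
| DeiT-Small | 22M | 4.6G | 79.9% | ~3.0x | 100 | Google Drive |
| | | | 80.6% | ~2.0x | 150 | Google Drive |
| | | | 81.0% | ~1.5x | 200 | Google Drive |
| | | | 81.4% | ~1.0x | 300 | Google Drive |
| Swin-Tiny | 28M | 4.5G | 80.9% | ~3.0x | 100 | Google Drive |
| | | | 81.4% | ~2.0x | 150 | Google Drive |
| | | | 81.6% | ~1.5x | 200 | Google Drive |
| Swin-Small | 50M | 8.7G | 82.8% | ~3.0x | 100 | Google Drive |
| | | | 83.1% | ~2.0x | 150 | Google Drive |
| | | | 83.2% | ~1.5x | 200 | Google Drive |
| Swin-Base | 88M | 15.4G | 83.3% | ~3.0x | 100 | Google Drive |
| | | | 83.5% | ~2.0x | 150 | Google Drive |
| | | | 83.6% | ~1.5x | 200 | Google Drive |
| CSWin-Tiny | 23M | 4.3G | 82.9% | ~1.5x | 200 | Google Drive |
| CSWin-Small | 35M | 6.9G | 83.6% | ~1.5x | 200 | Google Drive |
| CSWin-Base | 78M | 15.0G | 84.3% | ~1.5x | 200 | Google Drive |
| CAFormer-S18 | 26M | 4.1G | 83.4% | ~1.5x | 200 | Google Drive |
| CAFormer-S36 | 39M | 8.0G | 84.3% | ~1.5x | 200 | Google Drive |
| CAFormer-M36 | 56M | 13.2G | 85.0% | ~1.5x | 200 | Google Drive |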
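
The Training Speedup and #Equivalent Epochs columns appear to be related by a simple ratio. The sketch below reproduces the four pairs listed in the table, assuming the speedup is measured against a 300-epoch baseline schedule (an inference from the "~1.0x / 300" rows, not something this page states). The name `training_speedup` is hypothetical.

```python
# Assumption: the baseline schedule is 300 epochs, as the "~1.0x / 300"
# rows in the table suggest. This is not stated explicitly on this page.
BASELINE_EPOCHS = 300


def training_speedup(equivalent_epochs, baseline_epochs=BASELINE_EPOCHS):
    """Return the approximate speedup for a given equivalent-epoch budget."""
    return baseline_epochs / equivalent_epochs


# The four (equivalent epochs, speedup) pairs that appear in the table.
for epochs in (100, 150, 200, 300):
    print(f"{epochs} equivalent epochs -> ~{training_speedup(epochs):.1f}x")
```

Under this assumption, a checkpoint listed at 200 equivalent epochs costs roughly two thirds of the baseline training budget.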

ImageNet-22K -> ImageNet-1K fine-tuned models

These models are also available at Tsinghua Cloud.

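| Model | #Param | #FLOPs | Acc@1 | Pre-training Speedup | Link |
| --- | --- | --- | --- | --- | --- |
| CSWin-Base-224 | 78M | 15.0G | 86.1% | ~3.0x | Google Drive |
| | | | 86.3% | ~2.0x | Google Drive |
| CSWin-Base-384 | 78M | 47.0G | 87.1% | ~3.0x | Google Drive |
| | | | 87.4% | ~2.0x | Google Drive |
| CSWin-Large-224 | 173M | 31.5G | 86.9% | ~3.0x | Google Drive |
| | | | 87.1% | ~2.0x | Google Drive |
| CSWin-Large-384 | 173M | 96.8G | 87.9% | ~3.0x | Google Drive |
| | | | 88.1% | ~2.0x | Google Drive |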

Evaluation

We give an example command for evaluating Swin-Tiny:

CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 \
    python -m torch.distributed.launch --use-env --nproc_per_node=8 --master_port=12345 main_buffer.py \
    --model swin_tiny --drop_path 0.0 \
    --eval true --batch_size 128 --input_size 224 \
    --data_path /path/to/imagenet-1k \
    --resume /path/to/checkpoint/ET_pp_200ep_swinT.pth

This should yield:

* Acc@1 81.626 Acc@5 95.694 loss 0.785
  • For other models, please change --model, --resume, and --input_size accordingly. You can get the pre-trained models from the tables above.
  • Setting a model-specific --drop_path is not required for evaluation: the DropPath module in timm acts as an identity operation in evaluation mode, so the drop rate has no effect on the result. It is required for training.
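
To illustrate why --drop_path does not matter at evaluation time, here is a minimal plain-Python sketch of per-sample stochastic depth. It is not timm's code: the function `drop_path` below is a hypothetical stand-in that works on lists of floats, and it only mirrors the general behavior (pass-through when not training, random drop with rescaling when training).

```python
import random


def drop_path(batch, drop_prob=0.0, training=False, rng=random):
    """Per-sample stochastic depth on a batch given as a list of lists of floats."""
    # In evaluation mode (or with a zero drop rate) the input passes through unchanged.
    if not training or drop_prob == 0.0:
        return batch
    keep_prob = 1.0 - drop_prob
    out = []
    for sample in batch:
        if rng.random() < keep_prob:
            # Kept samples are rescaled so the expected value is preserved.
            out.append([v / keep_prob for v in sample])
        else:
            # Dropped samples become zeros, i.e. the residual branch is skipped.
            out.append([0.0 for _ in sample])
    return out


batch = [[1.0, 2.0], [3.0, 4.0]]

# Evaluation: the output is the same for any drop rate.
assert drop_path(batch, drop_prob=0.0, training=False) == batch
assert drop_path(batch, drop_prob=0.2, training=False) == batch
```

In training mode the same call depends on both the drop rate and the random draw, which is why the rate has to be set per model when training or fine-tuning.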

ImageNet-22K pre-trained models

These models are also available at Tsinghua Cloud.

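| Model | #Param | #FLOPs | Pre-training Speedup | Link |
| --- | --- | --- | --- | --- |
| CSWin-Base-224 | 78M | 15.0G | ~3.0x | Google Drive |
| | | 15.0G | ~2.0x | Google Drive |
| CSWin-Large-224 | 173M | 31.5G | ~3.0x | Google Drive |
| | | 31.5G | ~2.0x | Google Drive |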

Fine-tuning ImageNet-22K pre-trained models

We give an example command for fine-tuning an ImageNet-22K pre-trained CSWin-Base-224 model on ImageNet-1K:

CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 \
    python -m torch.distributed.launch --use-env --nproc_per_node=8 --master_port=12345 main_buffer.py \
    --model CSWin_96_24322_base_224 --drop_path 0.2 --weight_decay 1e-8 \
    --batch_size 64 --lr 5e-5 --update_freq 1 \
    --warmup_epochs 0 --epochs 30 --end_epoch 30 \
    --cutmix 0 --mixup 0 --layer_decay 0.9 --input_size 224 \
    --use_amp true \
    --model_ema true --model_ema_eval true --model_ema_decay 0.9998 \
    --data_path /path/to/imagenet-1k \
    --output_dir /path/to/save/results \
    --finetune /path/to/checkpoint/ET_pp_in22k_pre_trained_speedup2x_cswinB.pth
  • For other models, please change --model, --finetune, and --input_size accordingly. You can get the pre-trained models from the table above.
  • For better performance, --drop_path, --layer_decay, and --model_ema_decay can be adjusted. In our paper, we tune these hyper-parameters on the baseline models and reuse the resulting configurations directly when fine-tuning our ImageNet-22K pre-trained models.
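
To give a feel for what --layer_decay and --model_ema_decay control, the sketch below uses a common formulation of layer-wise learning-rate decay and of an exponential moving average (EMA) of weights. Whether main_buffer.py implements them in exactly this way is an assumption. The helper names and the layer count of 4 are hypothetical and chosen only for illustration.

```python
def layerwise_lr_scales(num_layers, layer_decay):
    """Return one learning-rate scale per layer group.

    Index 0 is the group closest to the input; index num_layers is the head.
    Each step away from the head multiplies the scale by layer_decay.
    """
    return [layer_decay ** (num_layers - i) for i in range(num_layers + 1)]


def ema_update(ema_value, new_value, decay):
    """One EMA step: keep `decay` of the old average, blend in the rest."""
    return decay * ema_value + (1.0 - decay) * new_value


base_lr = 5e-5  # --lr from the command above
scales = layerwise_lr_scales(num_layers=4, layer_decay=0.9)  # 4 is hypothetical
for i, scale in enumerate(scales):
    print(f"layer group {i}: lr = {base_lr * scale:.3e}")

# With a decay of 0.9998, each update moves the average by 0.02% of the gap.
print(ema_update(ema_value=1.0, new_value=0.0, decay=0.9998))
```

A smaller --layer_decay keeps early layers closer to their pre-trained values, while a larger --model_ema_decay averages the weights over a longer window of updates.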