Masked Feature Prediction for Self-Supervised Visual Pre-Training

August 6, 2022

Chen Wei*, Haoqi Fan, Saining Xie, Chao-Yuan Wu, Alan Yuille, Christoph Feichtenhofer*
In CVPR, 2022. [Paper]


Results & Models

ImageNet-1K; configs are under configs/masked_ssl/

| name | top1 | config pre-train (PT) | config fine-tune | model PT |
| --- | --- | --- | --- | --- |
| ViT-B | 84.0 | in1k_VIT_B_MaskFeat_PT | in1k_VIT_B_MaskFeat_FT | link |
| ViT-L | 85.7 | in1k_VIT_L_MaskFeat_PT | in1k_VIT_L_MaskFeat_FT | link |

Kinetics-400; configs are under configs/masked_ssl/

| name | frame length x sample rate | top1 | Flops (G) x views | #params (M) | config pre-train (PT) | config fine-tune | model PT |
| --- | --- | --- | --- | --- | --- | --- | --- |
| MViT-S | 16 x 4 | 82.2 | 71 x 1 x 10 | 36 | k400_MVITv2_S_16x4_MaskFeat_PT | k400_MVITv2_S_16x4_FT | link |
| MViT-L | 16 x 4 | 84.3 | 377 x 1 x 10 | 218 | k400_MVITv2_L_16x4_MaskFeat_PT | k400_MVITv2_L_16x4_FT | link |

Getting started

The configs for self-supervised pre-training and for fine-tuning are under configs/masked_ssl/. For example, the command

python tools/run_net.py \
  --cfg configs/masked_ssl/k400_MVITv2_L_16x4_MaskFeat_PT.yaml \
  DATA.PATH_TO_DATA_DIR path_to_your_Kinetics_dataset

should train a MaskFeat MViT-L model on the Kinetics-400 dataset, and the command

python tools/run_net.py \
  --cfg configs/masked_ssl/k400_MVITv2_L_16x4_FT.yaml \
  DATA.PATH_TO_DATA_DIR path_to_your_Kinetics_dataset \
  TRAIN.CHECKPOINT_FILE_PATH path_to_your_pretrain_checkpoint

will fine-tune the resulting model. The pre-training checkpoint is passed through TRAIN.CHECKPOINT_FILE_PATH, as in the command above.

For images, the command

python tools/run_net.py \
  --cfg configs/masked_ssl/in1k_VIT_B_MaskFeat_PT.yaml \
  DATA.PATH_TO_DATA_DIR path_to_your_ImageNet_dataset

should train a MaskFeat ViT-B model on the ImageNet dataset, and the command

python tools/run_net.py \
  --cfg configs/masked_ssl/in1k_VIT_B_FT.yaml \
  DATA.PATH_TO_DATA_DIR path_to_your_ImageNet_dataset \
  TRAIN.CHECKPOINT_FILE_PATH path_to_your_pretrain_checkpoint

will fine-tune the resulting model. The pre-training checkpoint is passed through TRAIN.CHECKPOINT_FILE_PATH, as in the command above.
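
The two stages use the same command shape and differ only in the config file and the extra checkpoint override. The following is a minimal dry-run sketch that prints both commands so they can be checked before launching. It is not part of the repository: the helper names (build_cmd, pretrain_cmd, finetune_cmd) and the DATA_DIR and PT_CKPT variables are assumptions made for this example. Only tools/run_net.py, the config names, and the two override keys come from the commands above.

```shell
#!/usr/bin/env bash
# Dry-run sketch: print the pre-train and fine-tune commands without running them.
# DATA_DIR and PT_CKPT are placeholder variables chosen for this example.
set -euo pipefail

CFG_DIR="configs/masked_ssl"
DATA_DIR="${DATA_DIR:-path_to_your_dataset}"
PT_CKPT="${PT_CKPT:-path_to_your_pretrain_checkpoint}"

# build_cmd CONFIG_NAME KEY VALUE [KEY VALUE ...]
# Prints one tools/run_net.py invocation followed by its config overrides.
build_cmd() {
  local cfg="$1"
  shift
  printf '%s\n' "python tools/run_net.py --cfg ${CFG_DIR}/${cfg}.yaml $*"
}

# Stage 1: MaskFeat pre-training; only the dataset path is overridden.
pretrain_cmd() {
  build_cmd "$1" DATA.PATH_TO_DATA_DIR "$DATA_DIR"
}

# Stage 2: fine-tuning; additionally points at the stage 1 checkpoint.
finetune_cmd() {
  build_cmd "$1" DATA.PATH_TO_DATA_DIR "$DATA_DIR" TRAIN.CHECKPOINT_FILE_PATH "$PT_CKPT"
}

pretrain_cmd in1k_VIT_B_MaskFeat_PT
finetune_cmd in1k_VIT_B_FT
```

The same helpers can print the Kinetics-400 pair by passing k400_MVITv2_L_16x4_MaskFeat_PT and k400_MVITv2_L_16x4_FT. The sketch joins arguments with plain spaces, so a path containing spaces would need quoting before the printed line is pasted into a shell.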

Reference

If you find this useful for your research, please consider citing the paper using the following BibTeX entry.

@InProceedings{wei2022masked,
    author    = {Wei, Chen and Fan, Haoqi and Xie, Saining and Wu, Chao-Yuan and Yuille, Alan and Feichtenhofer, Christoph},
    title     = {Masked Feature Prediction for Self-Supervised Visual Pre-Training},
    booktitle = {CVPR},
    year      = {2022},
}