Masked Feature Prediction for Self-Supervised Visual Pre-Training

August 6, 2022 · View on GitHub

Chen Wei*, Haoqi Fan, Saining Xie, Chao-Yuan Wu, Alan Yuille, Christoph Feichtenhofer*
In CVPR, 2022. [Paper]

Results & Models

ImageNet-1K; configs are under configs/masked_ssl/

name	top1	config pre-train (PT)	config fine-tune	model PT
ViT-B	84.0	in1k_VIT_B_MaskFeat_PT	in1k_VIT_B_MaskFeat_FT	`link`
ViT-L	85.7	in1k_VIT_L_MaskFeat_PT	in1k_VIT_L_MaskFeat_FT	`link`

Kinetics-400; configs are under configs/masked_ssl/

name	frame length x sample rate	top1	Flops (G) x views	#params (M)	config pre-train (PT)	config fine-tune	model PT
MViT-S	16 x 4	82.2	71 x 1 x 10	36	k400_MVITv2_S_16x4_MaskFeat_PT	k400_MVITv2_S_16x4_FT	`link`
MViT-L	16 x 4	84.3	377 x 1 x 10	218	k400_MVITv2_L_16x4_MaskFeat_PT	k400_MVITv2_L_16x4_FT	`link`

Getting started

To use self-supervised learning techniques please refer to the configs under configs/masked_ssl. For example, the command

python tools/run_net.py \
  --cfg configs/masked_ssl/k400_MVITv2_L_16x4_MaskFeat_PT.yaml \
  DATA.PATH_TO_DATA_DIR path_to_your_Kinetics_dataset

should train a MaskFeat MViT-L model on the Kinetics-400 dataset, and the command

python tools/run_net.py \
  --cfg configs/masked_ssl/k400_MVITv2_L_16x4_FT.yaml \
  DATA.PATH_TO_DATA_DIR path_to_your_Kinetics_dataset \
  TRAIN.CHECKPOINT_FILE_PATH path_to_your_pretrain_checkpoint

will fine-tune the resulting model, after passing the checkpoint path to the config.

For images, the command

python tools/run_net.py \
  --cfg configs/masked_ssl/in1k_VIT_B_MaskFeat_PT.yaml \
  DATA.PATH_TO_DATA_DIR path_to_your_ImageNet_dataset

should train a MaskFeat ViT-B model on the ImageNet dataset, and the command

python tools/run_net.py \
  --cfg configs/masked_ssl/in1k_VIT_B_FT.yaml \
  DATA.PATH_TO_DATA_DIR path_to_your_ImageNet_dataset \
  TRAIN.CHECKPOINT_FILE_PATH path_to_your_pretrain_checkpoint

will fine-tune the resulting model, after passing the checkpoint path to the config.

Reference

If you find this useful for your research, please consider citing the paper using the following BibTeX entry.

@InProceedings{wei2022masked,
    author    = {Wei, Chen and Fan, Haoqi and Xie, Saining and Wu, Chao-Yuan and Yuille, Alan and Feichtenhofer, Christoph},
    title     = {Masked Feature Prediction for Self-Supervised Visual Pre-Training},
    booktitle = {CVPR},
    year      = {2022},
}