Self-Attention experiments in Vision

May 1, 2021 · View on GitHub

To do

  • Add relative (RPE) and rotary positional embeddings
  • Fix experiment code; update models to work without a separate config
  • Test on TPUv3-8
  • Run initial training runs comparing DeiT with absolute learned vs. rotary positional embeddings
  • Add class-attention layers and LayerScale (CaiT)
  • Add CvT
  • Add TNT, Twins
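For the rotary-embedding item above, a minimal stdlib-only sketch of the idea: each consecutive pair of channels in a token vector is rotated by an angle proportional to the token's position, so dot products between two rotated tokens depend only on their relative offset. The function name, list-based shapes, and the base of 10000 follow the common RoFormer-style formulation and are illustrative, not code from this repo.

```python
import math

def rotary_embed(x, base=10000.0):
    """Apply rotary positional embeddings.

    x: list of token vectors (seq_len x dim, dim even).
    Pair (2i, 2i+1) at position m is rotated by angle m * base**(-2i/dim),
    which makes q·k a function of the relative position only.
    """
    dim = len(x[0])
    assert dim % 2 == 0, "rotary embeddings rotate channel pairs"
    out = []
    for pos, vec in enumerate(x):
        rotated = list(vec)
        for i in range(0, dim, 2):
            theta = pos * base ** (-i / dim)
            c, s = math.cos(theta), math.sin(theta)
            a, b = vec[i], vec[i + 1]
            rotated[i] = a * c - b * s
            rotated[i + 1] = a * s + b * c
        out.append(rotated)
    return out
```

Because each position applies a pure rotation, token norms are preserved, and the attention score between positions m and n depends only on n − m, which is the property being compared against absolute learned embeddings.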
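The LayerScale item can likewise be sketched: CaiT multiplies each residual branch's output by a learnable per-channel vector gamma, initialised to a small constant so every block starts close to the identity. The class name and the 1e-4 default here are assumptions for illustration (in CaiT the init value varies with depth).

```python
class LayerScale:
    """Per-channel scaling of a residual branch output (CaiT-style).

    gamma is a learnable per-channel vector; a small init keeps each
    residual block near the identity early in training.
    """
    def __init__(self, dim, init_value=1e-4):
        # Assumption: flat 1e-4 init; CaiT tunes this per model depth.
        self.gamma = [init_value] * dim

    def __call__(self, x):
        # x: one token vector of length dim
        return [g * v for g, v in zip(self.gamma, x)]
```

In a block it would sit on the residual path, e.g. `x = [a + b for a, b in zip(x, ls(attn_out))]`, letting training grow the contribution of individual channels from near zero.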