RankSeg + Segmenter

February 12, 2023

Results and models

  • The batch size of all models is set to 8, following the original Segmenter setting. All models are trained on 8 V100 GPUs.
  • Multi-scale testing is not conducted on the ADE20KFull and COCO+LVIS datasets because of memory limits.
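
| Method | Dataset | Backbone | Crop Size | Lr schd | mIoU | mIoU (ms+flip) | config | download |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Segmenter | COCO-Stuff | ViT-B | 512x512 | 40000 | 41.9 | 43.8 | config | ckpt |
| Segmenter + RankSeg | COCO-Stuff | ViT-B | 512x512 | 40000 | 44.9 | 46.2 | config | ckpt |
| Segmenter | COCO-Stuff | ViT-B | 512x512 | 80000 | 43.4 | 45.2 | config | ckpt |
| Segmenter + RankSeg | COCO-Stuff | ViT-B | 512x512 | 80000 | 45.7 | 46.7 | config | ckpt |
| Segmenter | COCO-Stuff | ViT-L | 640x640 | 40000 | 45.5 | 47.1 | config | ckpt |
| Segmenter + RankSeg | COCO-Stuff | ViT-L | 640x640 | 40000 | 46.7 | 47.9 | config | ckpt |
| Segmenter | Pascal-Context60 | ViT-B | 480x480 | 80000 | 53.8 | 54.6 | config | - |
| Segmenter + RankSeg | Pascal-Context60 | ViT-B | 480x480 | 80000 | 54.7 | 55.4 | config | ckpt |
| Segmenter | ADE20K | ViT-B | 512x512 | 160000 | 48.8 | 50.7 | config | ckpt |
| Segmenter + RankSeg | ADE20K | ViT-B | 512x512 | 160000 | 49.7 | 51.4 | config | ckpt |
| Segmenter | ADE20K | ViT-L | 640x640 | 160000 | 52.0 | 53.6 | config | ckpt (official) |
| Segmenter + RankSeg | ADE20K | ViT-L | 640x640 | 160000 | 52.6 | 54.4 | config | ckpt |
| Segmenter | ADE20KFull | ViT-B | 512x512 | 160000 | 17.8 | - | - | - |
| Segmenter + RankSeg | ADE20KFull | ViT-B | 512x512 | 160000 | 18.8 | - | - | - |
| Segmenter | COCO+LVIS | ViT-B | 512x512 | 320000 | 19.4 | - | - | - |
| Segmenter + RankSeg | COCO+LVIS | ViT-B | 512x512 | 320000 | 21.3 | - | - | - |
| Segmenter | COCO+LVIS | ViT-B | 640x640 | 320000 | 23.7 | - | - | - |
| Segmenter + RankSeg | COCO+LVIS | ViT-B | 640x640 | 320000 | 24.6 | - | - | - |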

Ablation Experiments

Ground-Truth Experiments

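| Method | Dataset | Backbone | Crop Size | Lr schd | mIoU | config | download |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Segmenter + RankSeg + GT | COCO-Stuff | ViT-B | 512x512 | 40000 | 66.8 | config | ckpt |
| Segmenter + RankSeg + GT | Pascal-Context60 | ViT-B | 480x480 | 80000 | 70.8 | config | ckpt |
| Segmenter + RankSeg + GT | ADE20K | ViT-B | 512x512 | 160000 | 63.6 | config | ckpt |
| Segmenter + RankSeg + GT | ADE20KFull | ViT-B | 512x512 | 160000 | 37.0 | - | - |
| Segmenter + RankSeg + GT | COCO+LVIS | ViT-B | 512x512 | 320000 | 46.8 | - | - |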

Ablation of Multi-Label Classification Head

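| Method | Dataset | Backbone | Crop Size | Lr schd | mIoU | config | download |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1 TranEnc Layer | COCO-Stuff | ViT-B | 512x512 | 40000 | 44.9 | config | ckpt |
| 2 TranDec Layers | COCO-Stuff | ViT-B | 512x512 | 40000 | 44.1 | config | ckpt |
| Global Pooling | COCO-Stuff | ViT-B | 512x512 | 40000 | 43.2 | config | ckpt |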

Ablation of Class Embedding from Backbone

We additionally found that the performance of both Segmenter and Segmenter + RankSeg may be further improved if we introduce class embeddings before the last layer of the backbone and process them jointly with the patch encodings.

We denote this variant as OCE (Optimized Class Embeddings) and list the corresponding results below.

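| Method | Dataset | Backbone | Crop Size | Lr schd | mIoU | config | download |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Segmenter + OCE | COCO-Stuff | ViT-B | 512x512 | 40000 | 44.1 | config | - |
| Segmenter + RankSeg + OCE | COCO-Stuff | ViT-B | 512x512 | 40000 | 46.0 | config | - |
| Segmenter + OCE | ADE20K | ViT-B | 512x512 | 160000 | 48.8 | config | - |
| Segmenter + RankSeg + OCE | ADE20K | ViT-B | 512x512 | 160000 | 50.1 | config | - |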
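
The OCE idea described above (class embeddings join the patch tokens before the last backbone layer) can be illustrated with a small NumPy sketch. This is not the repository's implementation: `ToyAttentionLayer`, `backbone_with_oce`, and `mask_scores` are hypothetical names, the layer is a simplified single-head self-attention stand-in for a ViT block, and the dot-product scoring at the end is an assumption made only to show how the jointly processed tokens could be used.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


class ToyAttentionLayer:
    """Hypothetical stand-in for one ViT layer: single-head self-attention + residual."""

    def __init__(self, dim, rng):
        scale = dim ** -0.5
        self.w_q = rng.normal(0.0, scale, (dim, dim))
        self.w_k = rng.normal(0.0, scale, (dim, dim))
        self.w_v = rng.normal(0.0, scale, (dim, dim))

    def __call__(self, tokens):
        # tokens: (num_tokens, dim)
        q, k, v = tokens @ self.w_q, tokens @ self.w_k, tokens @ self.w_v
        attn = softmax(q @ k.T / np.sqrt(tokens.shape[-1]))
        return tokens + attn @ v


def backbone_with_oce(patches, class_emb, layers):
    """Run all but the last layer on patches only, then append the class
    embeddings and run the last layer on patches and classes jointly."""
    x = patches
    for layer in layers[:-1]:
        x = layer(x)
    num_patches = x.shape[0]
    joint = np.concatenate([x, class_emb], axis=0)
    joint = layers[-1](joint)
    return joint[:num_patches], joint[num_patches:]


def mask_scores(patch_out, class_out):
    # Assumed scoring for illustration: one score per (patch, class) pair.
    return patch_out @ class_out.T


rng = np.random.default_rng(0)
dim, num_patches, num_classes = 16, 64, 5
layers = [ToyAttentionLayer(dim, rng) for _ in range(3)]
patches = rng.normal(size=(num_patches, dim))
class_emb = rng.normal(size=(num_classes, dim))

patch_out, class_out = backbone_with_oce(patches, class_emb, layers)
print(mask_scores(patch_out, class_out).shape)
```

Because the class embeddings pass through the last layer together with the patch tokens, each class token can attend to the image content (and vice versa) before any decoding step, which is the joint processing the paragraph above refers to.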