Official Pytorch Implementation of DToP

September 28, 2023 · View on GitHub

Dynamic Token Pruning in Plain Vision Transformers for Semantic Segmentation

Quan Tang, Bowen Zhang, Jiajun Liu, Fagui Liu, Yifan Liu

ICCV 2023. [arxiv]

This repository contains the official Pytorch implementation of training & evaluation code and the pretrained models for DToP

As shown in the following figure, the network is naturally split into stages using inherent auxiliary blocks.

Highlights

Dynamic Token Pruning We introduce a dynamic token pruning paradigm based on the early exit of easy-to-recognize tokens for semantic segmentation transformers.
Controllable prune ratio One hyperparameter to control the trade-off between computation cost and accuracy.
Generally applicable e apply DToP to mainstream semantic segmentation transformers and can reduce up to 35% computational cost without a notable accuracy drop.

Getting started

requirements

torch==2.0.0 mmcls==1.0.0.rc5, mmcv==2.0.0 mmengine==0.7.0 mmsegmentation==1.0.0rc6

or up-to-date mmxx series till 9 Aug 2023

Training

To aquire the base model

python tools dist_train.sh config/prune/BASE_segvit_ade20k_large.py  $work_dirs$

To prune on the base model

python tools dist_train_load.sh  config/prune/prune_segvit_ade20k_large.py  $work_dirs$  $path_to_ckpt$

Eval

python tools/dist_test.sh  config/prune/prune_segvit_ade20k_large.py  $path_to_ckpt$

Datasets

Please follow the instructions of mmsegmentation data preparation

Results

Ade20k

Method	Backbone	mIoU	GFlops	config
Segvit	Vit-base	49.6	109.9	config
Segvit-prune	Vit-base	49.8	86.8	config
Segvit	Vit-large	53.3	617.0	config
Segvit-prune	Vit-large	52.8	412.8	config

Pascal Context

Method	Backbone	mIoU	GFlops	config	ckpt
Segvit	Vit-large	63.0	315.4	config
Segvit-prune	Vit-large	62.7	224.3	config

COCO-Stuff-10K

Method	Backbone	mIoU	GFlops	config	ckpt
Segvit	Vit-large	47.4	366.9	config
Segvit-prune	Vit-large	47.1	276.2	config

License

For academic use, this project is licensed under the 2-clause BSD License - see the LICENSE file for details. For commercial use, please contact the authors.