DoViT-code
October 18, 2023
PyTorch implementation of our paper "Dynamic Token-Pass Transformers for Semantic Segmentation" to appear in WACV'24.
Installation
Requirements:
python >= 3.6
mmsegmentation >= 0.25.0
Please see the documentation of mmsegmentation for details.
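A quick way to confirm that an environment meets the two requirements above is sketched below. This is a minimal check, not part of the repository: the helper names `version_tuple`, `meets_minimum`, and `check_environment` are hypothetical, and the only library attribute it relies on is `mmseg.__version__`.

```python
import sys


def version_tuple(version, width=3):
    """Parse a version string such as '0.25.0' into a tuple of ints.

    Non-numeric suffixes (e.g. 'rc1' in '1.0.0rc1') are ignored, and
    missing components are padded with zeros.
    """
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    while len(parts) < width:
        parts.append(0)
    return tuple(parts[:width])


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum."""
    return version_tuple(installed) >= version_tuple(minimum)


def check_environment():
    """Return a list of problems found; an empty list means all checks passed."""
    problems = []
    if sys.version_info < (3, 6):
        problems.append("python >= 3.6 is required")
    try:
        import mmseg  # package name of mmsegmentation
    except ImportError:
        problems.append("mmsegmentation is not installed")
    else:
        if not meets_minimum(mmseg.__version__, "0.25.0"):
            problems.append(
                "mmsegmentation >= 0.25.0 is required, found " + mmseg.__version__
            )
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("Problem:", problem)
```

Running the script prints one line per unmet requirement and nothing when both are satisfied.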
Get Started
- Download the mmseg pretrained backbones (links are in the config files) and convert them to the DoViT architecture with:
python tools/model_converters/mmseg2dovit.py </path/to/pretrained_backbone> </path/to/output_backbone>
Note: We modified the EncoderDecoder class in mmseg to support DoViT backbones.
- Training on 8 GPUs:
bash tools/dist_train.sh </path/config_file.py> 8 --work-dir <output_dir>
Citation
@inproceedings{liu2024dovit,
author = {Liu, Yuang and Zhou, Qiang and Wang, Jin and Wang, Zhibin and Wang, Fan and Wang, Jun and Zhang, Wei},
title = {Dynamic Token-Pass Transformers for Semantic Segmentation},
booktitle = {IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
month = {January},
year = {2024}
}