MIST

September 25, 2023

Medical Image Segmentation Transformer with Convolutional Attention Mixing (CAM) Decoder

Official Implementation of MIST

Full Paper Link

Details of Model

MIST is a Medical Image Segmentation Transformer with a Convolutional Attention Mixing (CAM) decoder for medical image segmentation. It has two parts: a pre-trained multi-axis vision transformer (MaxViT) serves as the encoder (left side of the network), and the CAM decoder generates the segmentation maps (right side). Each decoder block uses an attention-mixing strategy that aggregates the attentions computed at different stages.

  • Convolutional projected multi-head self-attention (MSA) is used instead of linear MSA to reduce computational cost and capture more salient features.
  • Depth-wise (deep and shallow) convolutions (DWC and SWC) are incorporated to extract relevant semantic features and to enlarge the kernel receptive field, which helps capture long-range dependencies.
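The two remarks above about depth-wise convolutions and receptive field can be made concrete with textbook formulas. The sketch below is not taken from the MIST code: the function names, the channel count of 96, and the 3 x 3 kernels are illustrative assumptions, and it only counts weights and receptive-field sizes for generic convolution stacks.

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def depthwise_params(channels, k):
    """Weights in a depth-wise k x k convolution: one k x k filter per channel."""
    return channels * k * k


def receptive_field(layers):
    """Receptive field of stacked convolutions.

    `layers` is a list of (kernel, stride, dilation) tuples, applied in order.
    """
    rf, jump = 1, 1
    for kernel, stride, dilation in layers:
        # Each layer widens the field by (kernel - 1) dilated steps,
        # scaled by the cumulative stride of the layers before it.
        rf += (kernel - 1) * dilation * jump
        jump *= stride
    return rf


# Hypothetical sizes, chosen only for illustration.
print(conv_params(96, 96, 3), depthwise_params(96, 3))
print(receptive_field([(3, 1, 1)]), receptive_field([(3, 1, 1)] * 3))
```

With these assumed sizes, a standard 3 x 3 convolution over 96 channels has 82944 weights while the depth-wise version has 864, and stacking three 3 x 3 layers grows the receptive field from 3 to 7 pixels. How MIST itself arranges its deep and shallow convolutions is described in the paper, not here.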

image

Requirements

Datasets

This study uses the Automatic Cardiac Diagnosis Challenge (ACDC) and Synapse multi-organ datasets to evaluate the MIST architecture. The ACDC dataset is available at https://www.creatis.insa-lyon.fr/Challenge/acdc/ and the Synapse dataset can be downloaded from https://www.synapse.org/#!Synapse:syn3193805/wiki/217789.

Preparing the data for training

Pre-trained models:

Testing (Model Evaluation)

Results

Results on ACDC Dataset

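| Models | Mean DICE | Right Ventricle | Myocardium | Left Ventricle |
|---|---|---|---|---|
| TransUNet | 89.71 | 88.86 | 84.53 | 95.73 |
| SwinUNet | 90.00 | 88.55 | 85.62 | 95.83 |
| MT-UNet | 90.43 | 86.64 | 89.04 | 95.62 |
| MISSFormer | 90.86 | 89.55 | 88.04 | 94.99 |
| PVT-CASCADE | 91.46 | 88.90 | 89.97 | 95.50 |
| nnUNet | 91.61 | 90.24 | 89.24 | 95.36 |
| TransCASCADE | 91.63 | 89.14 | 90.25 | 95.50 |
| nnFormer | 91.78 | 90.22 | 89.53 | 95.59 |
| Parallel MERIT | 92.32 | 90.87 | 90.00 | 96.08 |
| MIST (Proposed) | 92.56 | 91.23 | 90.31 | 96.14 |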
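For readers unfamiliar with the metric in the ACDC table above, the following is a minimal sketch of the Dice coefficient on binary masks, followed by a check that the Mean DICE entry for MIST equals the average of its three per-structure columns. It is not the repository's evaluation code; the function names, the toy masks, and the convention for two empty masks are assumptions made for illustration.

```python
def dice_score(pred, target):
    """Dice coefficient between two binary masks given as flat 0/1 sequences."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


def mean_dice(per_class_scores):
    """Unweighted mean of per-structure Dice scores."""
    return sum(per_class_scores) / len(per_class_scores)


# Toy masks (hypothetical data): 2 overlapping pixels, 3 foreground pixels each.
pred = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 0, 1, 1]
toy_dice = dice_score(pred, target)

# Right Ventricle, Myocardium, Left Ventricle scores for MIST from the table.
mist_acdc_mean = round(mean_dice([91.23, 90.31, 96.14]), 2)
print(toy_dice, mist_acdc_mean)
```

The toy masks give a Dice of 2/3, and the averaged MIST row reproduces the reported 92.56.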

Results on Synapse Dataset

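| Models | Mean DICE | Mean HD95 | Aorta | GB | KL | KR | Liver | PC | SP | SM |
|---|---|---|---|---|---|---|---|---|---|---|
| TransUNet | 77.48 | 31.69 | 87.23 | 63.13 | 81.87 | 77.02 | 94.08 | 55.86 | 85.08 | 75.62 |
| SwinUNet | 79.13 | 21.55 | 85.47 | 66.53 | 83.28 | 79.61 | 94.29 | 56.58 | 90.66 | 76.60 |
| MT-UNet | 78.59 | 26.59 | 87.92 | 64.99 | 81.47 | 77.29 | 93.06 | 59.46 | 87.75 | 76.81 |
| MISSFormer | 81.96 | 18.20 | 86.99 | 68.65 | 85.21 | 82.00 | 94.41 | 65.67 | 91.92 | 80.81 |
| PVT-CASCADE | 81.06 | 20.23 | 83.01 | 70.59 | 82.23 | 80.37 | 94.08 | 64.43 | 90.1 | 83.69 |
| CASTformer | 82.55 | 22.73 | 89.05 | 67.48 | 86.05 | 82.17 | 95.61 | 67.49 | 91.00 | 81.55 |
| TransCASCADE | 82.68 | 17.34 | 86.63 | 68.48 | 87.66 | 84.56 | 94.43 | 65.33 | 90.79 | 83.52 |
| Parallel MERIT | 84.22 | 16.51 | 88.38 | 73.48 | 87.21 | 84.31 | 95.06 | 69.97 | 91.21 | 84.15 |
| MIST (Proposed) | 86.92 | 11.07 | 89.15 | 74.58 | 93.28 | 92.54 | 94.94 | 72.43 | 92.83 | 87.23 |

GB: gallbladder, KL: left kidney, KR: right kidney, PC: pancreas, SP: spleen, SM: stomach. Higher is better for DICE; lower is better for HD95.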
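The Mean HD95 column in the Synapse table reports the 95th-percentile Hausdorff distance. The sketch below shows one common way to compute it on small point sets. It is not the repository's evaluation code. It assumes the boundary points are already extracted, ignores physical voxel spacing, and takes the larger of the two directed percentiles; some implementations pool both directions before taking the percentile instead. All names and the toy coordinates are hypothetical.

```python
import math


def percentile(values, q):
    """q-th percentile using linear interpolation between closest ranks."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("percentile of empty data")
    rank = (len(ordered) - 1) * q / 100.0
    low, high = math.floor(rank), math.ceil(rank)
    frac = rank - low
    return ordered[low] * (1 - frac) + ordered[high] * frac


def directed_distances(source, target):
    """Distance from each source point to its nearest target point."""
    return [min(math.dist(s, t) for t in target) for s in source]


def hd95(points_a, points_b):
    """95th-percentile Hausdorff distance between two non-empty point sets."""
    forward = directed_distances(points_a, points_b)
    backward = directed_distances(points_b, points_a)
    return max(percentile(forward, 95), percentile(backward, 95))


# Toy boundary points (hypothetical): one outlier at (0, 5) in the second set.
boundary_a = [(0, 0), (0, 1), (0, 2)]
boundary_b = [(0, 0), (0, 1), (0, 5)]
toy_hd95 = hd95(boundary_a, boundary_b)
print(toy_hd95)
```

Using the 95th percentile rather than the maximum makes the metric less sensitive to a few outlying boundary points, which is why it is often reported alongside Dice.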

The results for the ACDC dataset (upper row) and the Synapse dataset (lower row) are shown in the following image.

image

Citation and contact

If this repository helped your work, please cite the paper below:

Please contact Md Motiur Rahman at rahma112@purdue.edu with any questions.