InternImage for Semantic Segmentation

January 24, 2025

This folder contains the implementation of InternImage for semantic segmentation.

Our segmentation code is developed on top of MMSegmentation v0.27.0.

Installation

  • Clone this repository:
git clone https://github.com/OpenGVLab/InternImage.git
cd InternImage
  • Create a conda virtual environment and activate it:
conda create -n internimage python=3.9
conda activate internimage

  • Install PyTorch and torchvision. For example, to install torch==1.11 with CUDA==11.3:

pip install torch==1.11.0+cu113 torchvision==0.12.0+cu113  -f https://download.pytorch.org/whl/torch_stable.html
  • Install other requirements:

    note: installing opencv via conda breaks torchvision's GPU support, so install opencv with pip instead.

conda install -c conda-forge termcolor yacs pyyaml scipy pip -y
pip install opencv-python
  • Install timm, mmcv-full and mmsegmentation:
pip install -U openmim
mim install mmcv-full==1.5.0
mim install mmsegmentation==0.27.0
pip install timm==0.6.11 mmdet==2.28.1
  • Pin numpy and pydantic to compatible versions:
# numpy must be lower than 2.0
pip install numpy==1.26.4
pip install pydantic==1.10.13
  • Compile CUDA operators

Before compiling, please use the nvcc -V command to check whether your nvcc version matches the CUDA version of PyTorch.

cd ./ops_dcnv3
sh ./make.sh
# unit test (all checks should print True)
python test.py
  • Alternatively, you can install the operator from the precompiled .whl files: DCNv3-1.0-whl
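The nvcc/PyTorch CUDA match required before compiling can also be checked programmatically. A minimal sketch, assuming the usual `nvcc -V` output format; `torch.version.cuda` is how an installed PyTorch reports the CUDA version it was built against:

```python
import re

def cuda_versions_match(nvcc_output: str, torch_cuda: str) -> bool:
    """Compare the major.minor CUDA version from `nvcc -V` output with
    the CUDA version PyTorch was built against (e.g. torch.version.cuda)."""
    # nvcc -V prints a line like: "Cuda compilation tools, release 11.3, V11.3.109"
    m = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if m is None:
        raise ValueError("could not parse nvcc output")
    nvcc_mm = (int(m.group(1)), int(m.group(2)))
    torch_mm = tuple(int(x) for x in torch_cuda.split(".")[:2])
    return nvcc_mm == torch_mm

# With the versions used in this README (torch 1.11 built for CUDA 11.3):
sample = "Cuda compilation tools, release 11.3, V11.3.109"
print(cuda_versions_match(sample, "11.3"))  # True
```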

Data Preparation

Prepare datasets according to the guidelines in MMSegmentation.
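As a quick sanity check after downloading, a small sketch can verify the dataset layout. The subdirectory names below are an assumption based on MMSegmentation's ADE20K conventions and the `data/ade/ADEChallengeData2016/...` path used by the demo command later in this README:

```python
from pathlib import Path
from typing import List

# Assumed ADE20K layout under MMSegmentation conventions:
EXPECTED = [
    "images/training",
    "images/validation",
    "annotations/training",
    "annotations/validation",
]

def missing_ade20k_dirs(root: str) -> List[str]:
    """Return the expected ADE20K subdirectories missing under `root`."""
    base = Path(root) / "ADEChallengeData2016"
    return [d for d in EXPECTED if not (base / d).is_dir()]

# Usage: missing_ade20k_dirs("data/ade") -> [] once the dataset is in place.
```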

Released Models

Dataset: ADE20K
| method | backbone | resolution | mIoU (ss/ms) | #params | FLOPs | Config | Download |
| ------ | -------- | ---------- | ------------ | ------- | ----- | ------ | -------- |
| UperNet | InternImage-T | 512x512 | 47.9 / 48.1 | 59M | 944G | config | ckpt \| log |
| UperNet | InternImage-S | 512x512 | 50.1 / 50.9 | 80M | 1017G | config | ckpt \| log |
| UperNet | InternImage-B | 512x512 | 50.8 / 51.3 | 128M | 1185G | config | ckpt \| log |
| UperNet | InternImage-L | 640x640 | 53.9 / 54.1 | 256M | 2526G | config | ckpt \| log |
| UperNet | InternImage-XL | 640x640 | 55.0 / 55.3 | 368M | 3142G | config | ckpt \| log |
| UperNet | InternImage-H | 896x896 | 59.9 / 60.3 | 1.12B | 3566G | config | ckpt \| log |
| Mask2Former | InternImage-H | 896x896 | 62.6 / 62.9 | 1.31B | 4635G | config | ckpt \| log |
Dataset: Cityscapes
| method | backbone | resolution | mIoU (ss/ms) | #params | FLOPs | Config | Download |
| ------ | -------- | ---------- | ------------ | ------- | ----- | ------ | -------- |
| UperNet | InternImage-T | 512x1024 | 82.58 / 83.40 | 59M | 1889G | config | ckpt \| log |
| UperNet | InternImage-S | 512x1024 | 82.74 / 83.45 | 80M | 2035G | config | ckpt \| log |
| UperNet | InternImage-B | 512x1024 | 83.18 / 83.97 | 128M | 2369G | config | ckpt \| log |
| UperNet | InternImage-L | 512x1024 | 83.68 / 84.41 | 256M | 3234G | config | ckpt \| log |
| UperNet* | InternImage-L | 512x1024 | 85.94 / 86.22 | 256M | 3234G | config | ckpt \| log |
| UperNet | InternImage-XL | 512x1024 | 83.62 / 84.28 | 368M | 4022G | config | ckpt \| log |
| UperNet* | InternImage-XL | 512x1024 | 86.20 / 86.42 | 368M | 4022G | config | ckpt \| log |
| SegFormer* | InternImage-L | 512x1024 | 85.16 / 85.67 | 220M | 1580G | config | ckpt \| log |
| SegFormer* | InternImage-XL | 512x1024 | 85.41 / 85.93 | 330M | 2364G | config | ckpt \| log |
| Mask2Former* | InternImage-H | 1024x1024 | 86.37 / 86.96 | 1094M | 7878G | config | ckpt \| log |

\* denotes that the model is trained with the extra Mapillary dataset.

Dataset: COCO-Stuff-164K
| method | backbone | resolution | mIoU (ss/ms) | #params | FLOPs | Config | Download |
| ------ | -------- | ---------- | ------------ | ------- | ----- | ------ | -------- |
| Mask2Former | InternImage-H | 896x896 | 52.6 / 52.8 | 1.31B | 4635G | config | ckpt \| log |
Dataset: COCO-Stuff-10K
| method | backbone | resolution | mIoU (ss/ms) | #params | FLOPs | Config | Download |
| ------ | -------- | ---------- | ------------ | ------- | ----- | ------ | -------- |
| Mask2Former | InternImage-H | 512x512 | 59.2 / 59.6 | 1.28B | 1528G | config | ckpt \| log |
Dataset: Pascal-Context-59
| method | backbone | resolution | mIoU (ss/ms) | #params | FLOPs | Config | Download |
| ------ | -------- | ---------- | ------------ | ------- | ----- | ------ | -------- |
| Mask2Former | InternImage-H | 480x480 | 69.7 / 70.3 | 1.07B | 867G | config | ckpt \| log |
Dataset: NYU-Depth-V2
| method | backbone | resolution | mIoU (ss/ms) | #params | FLOPs | Config | Download |
| ------ | -------- | ---------- | ------------ | ------- | ----- | ------ | -------- |
| Mask2Former | InternImage-H | 480x480 | 67.1 / 68.1 | 1.07B | 867G | config | ckpt \| log |
Dataset: Mapillary
| method | backbone | resolution | #params | FLOPs | Config | Download |
| ------ | -------- | ---------- | ------- | ----- | ------ | -------- |
| UperNet | InternImage-L | 512x1024 | 256M | 3234G | config | ckpt |
| UperNet | InternImage-XL | 512x1024 | 368M | 4022G | config | ckpt |
| SegFormer | InternImage-L | 512x1024 | 220M | 1580G | config | ckpt |
| SegFormer | InternImage-XL | 512x1024 | 330M | 2364G | config | ckpt |
| Mask2Former | InternImage-H | 896x896 | 1094M | 7878G | config | ckpt |

Evaluation

To evaluate our InternImage on ADE20K val, run:

sh dist_test.sh <config-file> <checkpoint> <gpu-num> --eval mIoU

For example, to evaluate InternImage-T with a single GPU:

python test.py configs/ade20k/upernet_internimage_t_512_160k_ade20k.py pretrained/upernet_internimage_t_512_160k_ade20k.pth --eval mIoU

For example, to evaluate InternImage-B on a single node with 8 GPUs:

sh dist_test.sh configs/ade20k/upernet_internimage_b_512_160k_ade20k.py pretrained/upernet_internimage_b_512_160k_ade20k.pth 8 --eval mIoU
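For intuition about what `--eval mIoU` reports: per-class IoU is intersection over union of predicted and ground-truth pixels, averaged over classes. A standalone illustrative sketch (the actual evaluation is done inside MMSegmentation, not by this function):

```python
import numpy as np

def mean_iou(pred: np.ndarray, gt: np.ndarray, num_classes: int,
             ignore_index: int = 255) -> float:
    """Mean IoU over label maps, computed from a confusion matrix."""
    mask = gt != ignore_index          # drop ignored pixels
    pred, gt = pred[mask], gt[mask]
    # Confusion matrix via bincount over flattened (gt, pred) pairs.
    conf = np.bincount(gt * num_classes + pred,
                       minlength=num_classes ** 2).reshape(num_classes, num_classes)
    inter = np.diag(conf)
    union = conf.sum(0) + conf.sum(1) - inter
    ious = inter[union > 0] / union[union > 0]  # skip absent classes
    return float(ious.mean())

# Toy example: 3 classes, one of four pixels misclassified.
gt = np.array([[0, 1], [2, 2]])
pred = np.array([[0, 1], [2, 1]])
print(round(mean_iou(pred, gt, 3), 3))  # 0.667
```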

Training

To train an InternImage on ADE20K, run:

sh dist_train.sh <config-file> <gpu-num>

For example, to train InternImage-T with 8 GPUs on 1 node (total batch size 16), run:

sh dist_train.sh configs/ade20k/upernet_internimage_t_512_160k_ade20k.py 8
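The total batch size is the per-GPU batch size times the GPU count, so the 8-GPU setup above implies 2 samples per GPU. If you train with a different GPU count, one common convention (not specific to this repo) is to scale the learning rate linearly with the total batch size:

```python
def scaled_lr(base_lr: float, base_batch: int, gpus: int,
              samples_per_gpu: int = 2) -> float:
    """Linear LR scaling: LR grows in proportion to the total batch size.
    samples_per_gpu=2 matches the 8-GPU / batch-16 setup in this README."""
    total_batch = gpus * samples_per_gpu
    return base_lr * total_batch / base_batch

# 8 GPUs x 2 samples -> batch 16, LR unchanged vs. a batch-16 baseline:
print(scaled_lr(1e-4, 16, gpus=8))  # 0.0001
# 4 GPUs x 2 samples -> batch 8, LR halved:
print(scaled_lr(1e-4, 16, gpus=4))  # 5e-05
```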

Manage Jobs with Slurm

For example, to train InternImage-XL with 8 GPUs on 1 node (total batch size 16), run:

GPUS=8 sh slurm_train.sh <partition> <job-name> configs/ade20k/upernet_internimage_xl_640_160k_ade20k.py

Image Demo

To run inference on a single image (or on multiple images), use the command below. If you pass a directory instead of a single image, all images in that directory are processed.

CUDA_VISIBLE_DEVICES=0 python image_demo.py \
  data/ade/ADEChallengeData2016/images/validation/ADE_val_00000591.jpg \
  configs/ade20k/upernet_internimage_t_512_160k_ade20k.py  \
  checkpoint_dir/seg/upernet_internimage_t_512_160k_ade20k.pth  \
  --palette ade20k
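The `--palette ade20k` flag selects the color table used to render the predicted class map. A minimal sketch of that rendering step, roughly what the demo script does when blending the colorized prediction into the input image (the three RGB entries below are placeholders; the real ADE20K palette has 150 classes):

```python
import numpy as np

# Toy 3-class palette (placeholder values, not the full ADE20K palette):
PALETTE = np.array([[120, 120, 120], [180, 120, 120], [6, 230, 230]],
                   dtype=np.uint8)

def colorize(seg: np.ndarray, image: np.ndarray, alpha: float = 0.5) -> np.ndarray:
    """Blend a per-pixel class map with the input image."""
    color = PALETTE[seg]                              # (H, W) -> (H, W, 3)
    return (alpha * color + (1 - alpha) * image).astype(np.uint8)

seg = np.array([[0, 1], [2, 2]])
img = np.zeros((2, 2, 3), dtype=np.uint8)
out = colorize(seg, img)
print(out.shape)  # (2, 2, 3)
```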

Export

First, install mmdeploy:

pip install mmdeploy==0.14.0

To export a segmentation model from PyTorch to TensorRT, run:

MODEL="model_name"
CKPT_PATH="/path/to/model/ckpt.pth"

python deploy.py \
    "./deploy/configs/mmseg/segmentation_tensorrt_static-512x512.py" \
    "./configs/ade20k/${MODEL}.py" \
    "${CKPT_PATH}" \
    "./deploy/demo.png" \
    --work-dir "./work_dirs/mmseg/${MODEL}" \
    --device cuda \
    --dump-info

For example, to export upernet_internimage_t_512_160k_ade20k from PyTorch to TensorRT, run:

MODEL="upernet_internimage_t_512_160k_ade20k"
CKPT_PATH="/path/to/model/ckpt/upernet_internimage_t_512_160k_ade20k.pth"

python deploy.py \
    "./deploy/configs/mmseg/segmentation_tensorrt_static-512x512.py" \
    "./configs/ade20k/${MODEL}.py" \
    "${CKPT_PATH}" \
    "./deploy/demo.png" \
    --work-dir "./work_dirs/mmseg/${MODEL}" \
    --device cuda \
    --dump-info
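Note that the deploy config name (`segmentation_tensorrt_static-512x512`) indicates the exported TensorRT engine has a fixed 512x512 input, so images must be brought to that size before inference. A dependency-light sketch of a nearest-neighbor resize for that purpose (illustrative only; a real pipeline would reuse the preprocessing defined in the model config):

```python
import numpy as np

def to_static_input(img: np.ndarray, size: int = 512) -> np.ndarray:
    """Nearest-neighbor resize of an (H, W, C) image to (size, size, C),
    matching the fixed input shape of a static TensorRT engine."""
    h, w = img.shape[:2]
    ys = np.arange(size) * h // size   # source row for each output row
    xs = np.arange(size) * w // size   # source column for each output column
    return img[ys[:, None], xs]

img = np.zeros((300, 400, 3), dtype=np.uint8)
print(to_static_input(img).shape)  # (512, 512, 3)
```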