InternImage for Object Detection

March 4, 2025

This folder contains the implementation of InternImage for object detection.

Our detection code is developed on top of MMDetection v2.28.1.

Installation

  • Clone this repository:
git clone https://github.com/OpenGVLab/InternImage.git
cd InternImage
  • Create a conda virtual environment and activate it:
conda create -n internimage python=3.9
conda activate internimage

For example, to install torch==1.11 with CUDA==11.3:

pip install torch==1.11.0+cu113 torchvision==0.12.0+cu113  -f https://download.pytorch.org/whl/torch_stable.html
  • Install other requirements:

    Note: the conda build of OpenCV can break torchvision's GPU support, so install OpenCV with pip instead.

conda install -c conda-forge termcolor yacs pyyaml scipy pip -y
pip install opencv-python
  • Install timm, mmcv-full, mmsegmentation, and mmdet:
pip install -U openmim
mim install mmcv-full==1.5.0
mim install mmsegmentation==0.27.0
pip install timm==0.6.11 mmdet==2.28.1
  • Install the remaining requirements with pip, pinning the versions below:
pip install opencv-python termcolor yacs pyyaml scipy
# Please use a version of numpy lower than 2.0
pip install numpy==1.26.4
pip install pydantic==1.10.13
pip install yapf==0.40.1
  • Compile CUDA operators

Before compiling, please use the nvcc -V command to check whether your nvcc version matches the CUDA version of PyTorch.

cd ./ops_dcnv3
sh ./make.sh
# unit test (every check should report True)
python test.py
  • Alternatively, install the operator from the precompiled .whl files in DCNv3-1.0-whl.
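The nvcc check recommended before compiling can be automated. The following is a minimal sketch, not part of this repository: the helper names `parse_nvcc_release` and `cuda_versions_match` are hypothetical, and it assumes `nvcc -V` prints a line of the form `release 11.3, V11.3.109` and that PyTorch exposes its CUDA version as `torch.version.cuda`.

```python
import re
import shutil
import subprocess
from typing import Optional


def parse_nvcc_release(nvcc_output: str) -> Optional[str]:
    """Extract the 'major.minor' release from `nvcc -V` output, or None."""
    match = re.search(r"release\s+(\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def cuda_versions_match(nvcc_release: Optional[str],
                        torch_cuda: Optional[str]) -> bool:
    """True only when both versions are known and major.minor agree."""
    if not nvcc_release or not torch_cuda:
        return False
    return nvcc_release.split(".")[:2] == torch_cuda.split(".")[:2]


def main() -> None:
    try:
        import torch  # imported lazily so the helpers work without PyTorch
    except ImportError:
        print("PyTorch is not installed; install it before compiling.")
        return
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        print("nvcc was not found on PATH.")
        return
    output = subprocess.run([nvcc, "-V"], capture_output=True, text=True).stdout
    nvcc_release = parse_nvcc_release(output)
    torch_cuda = torch.version.cuda  # None for CPU-only builds
    matched = cuda_versions_match(nvcc_release, torch_cuda)
    status = "match" if matched else "MISMATCH"
    print(f"nvcc: {nvcc_release}, torch CUDA: {torch_cuda} -> {status}")


if __name__ == "__main__":
    main()
```

If the script reports a mismatch, it is probably worth fixing the toolkit or the PyTorch build before running `make.sh`, since the operators are compiled against whichever nvcc is on PATH.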

Data Preparation

Prepare datasets according to the guidelines in MMDetection v2.28.1.
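As a quick sanity check after preparing the data, the sketch below looks for the files a COCO setup usually needs. The `data/coco` root and the entries in `EXPECTED_COCO_PATHS` are assumptions based on MMDetection's usual COCO layout, not something this README specifies; adjust them if your configs point elsewhere. The helper name `missing_coco_paths` is hypothetical.

```python
from pathlib import Path
from typing import List

# Assumed layout (MMDetection's usual COCO convention); edit to match your configs.
EXPECTED_COCO_PATHS = (
    "annotations/instances_train2017.json",
    "annotations/instances_val2017.json",
    "train2017",
    "val2017",
)


def missing_coco_paths(coco_root: str) -> List[str]:
    """Return the expected relative paths that do not exist under coco_root."""
    root = Path(coco_root)
    return [rel for rel in EXPECTED_COCO_PATHS if not (root / rel).exists()]


if __name__ == "__main__":
    missing = missing_coco_paths("data/coco")
    if missing:
        print(f"Missing under data/coco: {missing}")
    else:
        print("COCO layout looks complete.")
```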

Released Models

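Dataset: COCO

| method | backbone | schd | box mAP | mask mAP | #param | FLOPs | Config | Download |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Mask R-CNN | InternImage-T | 1x | 47.2 | 42.5 | 49M | 270G | config | ckpt / log |
| Mask R-CNN | InternImage-T | 3x | 49.1 | 43.7 | 49M | 270G | config | ckpt / log |
| Mask R-CNN | InternImage-S | 1x | 47.8 | 43.3 | 69M | 340G | config | ckpt / log |
| Mask R-CNN | InternImage-S | 3x | 49.7 | 44.5 | 69M | 340G | config | ckpt / log |
| Mask R-CNN | InternImage-B | 1x | 48.8 | 44.0 | 115M | 501G | config | ckpt / log |
| Mask R-CNN | InternImage-B | 3x | 50.3 | 44.8 | 115M | 501G | config | ckpt / log |
| Cascade | InternImage-L | 1x | 54.9 | 47.7 | 277M | 1399G | config | ckpt |
| Cascade | InternImage-L | 3x | 56.1 | 48.5 | 277M | 1399G | config | ckpt / log |
| Cascade | InternImage-XL | 1x | 55.3 | 48.1 | 387M | 1782G | config | ckpt / log |
| Cascade | InternImage-XL | 3x | 56.2 | 48.8 | 387M | 1782G | config | ckpt / log |

| method | backbone | schd | box mAP | #param | Config | Download |
| --- | --- | --- | --- | --- | --- | --- |
| DINO | InternImage-T | 1x | 53.9 | 49M | config | ckpt / log |
| DINO | InternImage-L | 1x | 57.6 | 241M | config | ckpt / log |
| DINO | InternImage-H | 1x | 63.4 | 1.1B | config | ckpt |
| DINO | CB-InternImage-H | 1x | 64.5 | 2.2B | config | ckpt |
| DINO (TTA) | CB-InternImage-H | 1x | 65.0 | 2.2B | - | ckpt |
| DINO | InternImage-G | 1x | 64.2 | 3.1B | config | ckpt |
| DINO | CB-InternImage-G | 1x | 65.1 | 6B | - | - |
| DINO (TTA) | CB-InternImage-G | 1x | 65.3 | 6B | - | - |

Dataset: LVIS

| method | backbone | minival (ss) | val (ss/ms) | #param | Config | Download |
| --- | --- | --- | --- | --- | --- | --- |
| DINO | CB-InternImage-H | 65.8 | 62.3 / 63.2 | 2.18B | config | ckpt |

Dataset: OpenImages

| method | backbone | mAP (ss) | #param | Config | Download |
| --- | --- | --- | --- | --- | --- |
| DINO | CB-InternImage-H | 74.1 | 2.18B | config | ckpt |

Dataset: VOC 2007 & 2012

| method | backbone | VOC 2007 | VOC 2012 | #param | Config | Download |
| --- | --- | --- | --- | --- | --- | --- |
| DINO | CB-InternImage-H | 94.0 | 97.2 | 2.18B | config | ckpt |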

Evaluation

To evaluate our InternImage on COCO val, run:

sh dist_test.sh <config-file> <checkpoint> <gpu-num> --eval bbox segm

For example, to evaluate InternImage-T on a single GPU:

python test.py configs/coco/mask_rcnn_internimage_t_fpn_1x_coco.py pretrained/mask_rcnn_internimage_t_fpn_1x_coco.pth --eval bbox segm

For example, to evaluate InternImage-B on a single node with 8 GPUs:

sh dist_test.sh configs/coco/mask_rcnn_internimage_b_fpn_1x_coco.py pretrained/mask_rcnn_internimage_b_fpn_1x_coco.pth 8 --eval bbox segm
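Because the config and the checkpoint share a base name, it is easy to pass a `.py` path where the `.pth` checkpoint belongs. The sketch below assembles the `dist_test.sh` invocation shown above and rejects that mix-up before anything is launched. The helper `build_dist_test_command` is hypothetical and not part of this repository; it only mirrors the argument order `<config-file> <checkpoint> <gpu-num> --eval ...` given in this section.

```python
from pathlib import Path
from typing import List, Sequence


def build_dist_test_command(config: str, checkpoint: str, gpus: int,
                            metrics: Sequence[str] = ("bbox", "segm")) -> List[str]:
    """Build the argument list for dist_test.sh after basic sanity checks."""
    if Path(config).suffix != ".py":
        raise ValueError(f"config should be a .py file, got: {config}")
    if Path(checkpoint).suffix != ".pth":
        raise ValueError(f"checkpoint should be a .pth file, got: {checkpoint}")
    if gpus < 1:
        raise ValueError("gpus must be at least 1")
    return ["sh", "dist_test.sh", config, checkpoint, str(gpus),
            "--eval", *metrics]


if __name__ == "__main__":
    cmd = build_dist_test_command(
        "configs/coco/mask_rcnn_internimage_b_fpn_1x_coco.py",
        "pretrained/mask_rcnn_internimage_b_fpn_1x_coco.pth",
        8,
    )
    print(" ".join(cmd))
```

The returned list can be handed to `subprocess.run(cmd, check=True)` from the detection folder to start the evaluation.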

Training

To train an InternImage model on COCO, run:

sh dist_train.sh <config-file> <gpu-num>

For example, to train InternImage-T with 8 GPUs on 1 node, run:

sh dist_train.sh configs/coco/mask_rcnn_internimage_t_fpn_1x_coco.py 8

Manage Jobs with Slurm

For example, to train InternImage-XL with 32 GPUs on 4 nodes, run:

GPUS=32 sh slurm_train.sh <partition> <job-name> configs/coco/cascade_internimage_xl_fpn_3x_coco.py work_dirs/cascade_internimage_xl_fpn_3x_coco
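The example above pairs 32 GPUs with 4 nodes, which implies 8 GPUs per node. The arithmetic can be sketched as follows; the default of 8 GPUs per node is inferred from that example rather than stated by the script, and the helper name `nodes_required` is hypothetical.

```python
import math


def nodes_required(total_gpus: int, gpus_per_node: int = 8) -> int:
    """Number of nodes needed to host total_gpus, rounding up."""
    if total_gpus < 1 or gpus_per_node < 1:
        raise ValueError("GPU counts must be positive")
    return math.ceil(total_gpus / gpus_per_node)


print(nodes_required(32))  # 4
```

On a cluster with a different per-node GPU count, pass it explicitly, for example `nodes_required(32, gpus_per_node=4)`.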

Export

First, install mmdeploy:

pip install mmdeploy==0.14.0

To export a detection model from PyTorch to TensorRT, run:

MODEL="model_name"
CKPT_PATH="/path/to/model/ckpt.pth"

python deploy.py \
    "./deploy/configs/mmdet/instance-seg/instance-seg_tensorrt_dynamic-320x320-1344x1344.py" \
    "./configs/coco/${MODEL}.py" \
    "${CKPT_PATH}" \
    "./deploy/demo.jpg" \
    --work-dir "./work_dirs/mmdet/instance-seg/${MODEL}" \
    --device cuda \
    --dump-info

For example, to export mask_rcnn_internimage_t_fpn_1x_coco from PyTorch to TensorRT, run:

MODEL="mask_rcnn_internimage_t_fpn_1x_coco"
CKPT_PATH="/path/to/model/ckpt/mask_rcnn_internimage_t_fpn_1x_coco.pth"

python deploy.py \
    "./deploy/configs/mmdet/instance-seg/instance-seg_tensorrt_dynamic-320x320-1344x1344.py" \
    "./configs/coco/${MODEL}.py" \
    "${CKPT_PATH}" \
    "./deploy/demo.jpg" \
    --work-dir "./work_dirs/mmdet/instance-seg/${MODEL}" \
    --device cuda \
    --dump-info