
November 9, 2025

Toward Realistic Camouflaged Object Detection: Benchmarks and Method

Download: Google | Baidu (extraction code: 93yd)

The first version of the datasets contains category and bounding box annotations.

| Datasets | Categories | Training Images | Test Images |
|---|---|---|---|
| COD10K-D | 68 | 6000 | 4000 |
| NC4K-D | 37 | 2863 | 1227 |
| CAMO-D | 43 | 744 | 497 |

Benchmarks of GLIP and Grounding DINO on COD10K-D, NC4K-D, and CAMO-D datasets (First Version)

Caption: Performance of various detection methods on the COD10K-D, NC4K-D, and CAMO-D datasets. The results shown here are for a single seed; the paper reports results over 10 random seeds.

Generic Methods

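| Method | Backbone | COD10K-D mAP | COD10K-D AP50 | COD10K-D AP75 | COD10K-D APm | COD10K-D APl | NC4K-D mAP | NC4K-D AP50 | NC4K-D AP75 | NC4K-D APm | NC4K-D APl | CAMO-D mAP | CAMO-D AP50 | CAMO-D AP75 | CAMO-D APm | CAMO-D APl |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| YOLOv7-L | CSPDarknet | 3.8 | 8.2 | 2.8 | 1.0 | 4.0 | 6.8 | 14.2 | 6.0 | 1.7 | 7.3 | 5.4 | 10.2 | 5.5 | 8.5 | 5.4 |
| Faster R-CNN | ResNet-50 | 8.3 | 21.0 | 5.0 | 4.8 | 8.8 | 19.2 | 39.7 | 16.0 | 8.1 | 20.2 | 4.7 | 12.4 | 2.2 | 3.5 | 5.3 |
| YOLOv8-L | CSPVoVNet | 9.7 | 16.8 | 9.4 | 2.6 | 10.4 | 23.5 | 34.9 | 25.2 | 10.6 | 24.7 | 25.4 | 37.2 | 26.1 | 14.4 | 26.7 |
| Faster R-CNN | ResNet-101 | 10.8 | 24.4 | 7.7 | 9.2 | 11.6 | 23.0 | 47.2 | 20.1 | 10.4 | 24.0 | 9.3 | 21.1 | 6.9 | 9.9 | 10.1 |
| Def-DETR | ResNet-50 | 12.2 | 23.1 | 11.4 | 6.5 | 13.1 | 27.4 | 49.6 | 27.9 | 14.0 | 29.7 | 13.3 | 26.9 | 12.4 | 9.7 | 14.1 |
| Def-DETR | ResNet-101 | 13.5 | 23.7 | 13.5 | 9.2 | 14.6 | 30.9 | 54.4 | 32.0 | 12.4 | 32.5 | 13.7 | 27.4 | 12.7 | 13.2 | 15.5 |
| Cascade R-CNN | ResNet-101 | 15.3 | 27.4 | 15.9 | 8.5 | 16.4 | 27.5 | 46.9 | 28.9 | 11.3 | 29.2 | 14.0 | 26.6 | 13.1 | 13.1 | 14.8 |
| Faster R-CNN | Swin-T | 16.3 | 35.3 | 13.1 | 8.6 | 17.4 | 29.1 | 58.8 | 25.6 | 16.8 | 30.4 | 11.3 | 32.3 | 5.5 | 8.6 | 12.0 |
| Faster R-CNN | Swin-L | 32.1 | 54.6 | 33.1 | 17.1 | 33.9 | 49.1 | 75.8 | 55.1 | 22.7 | 51.3 | 34.2 | 67.4 | 30.2 | 24.0 | 36.1 |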

Large Vision-Language Models

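| Method | Backbone | COD10K-D mAP | COD10K-D AP50 | COD10K-D AP75 | COD10K-D APm | COD10K-D APl | NC4K-D mAP | NC4K-D AP50 | NC4K-D AP75 | NC4K-D APm | NC4K-D APl | CAMO-D mAP | CAMO-D AP50 | CAMO-D AP75 | CAMO-D APm | CAMO-D APl |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| GLIP | Swin-T | 26.4 | 36.3 | 28.5 | 14.7 | 28.0 | 49.6 | 63.7 | 53.4 | 23.9 | 51.6 | 32.6 | 42.9 | 33.6 | 40.9 | 35.1 |
| GLIP + CAFR | Swin-T | 28.8 | 38.2 | 31.0 | 16.4 | 30.6 | 53.3 | 67.7 | 55.2 | 27.0 | 55.7 | 34.7 | 43.5 | 34.3 | 36.9 | 38.5 |
| GLIP | Swin-L | 40.2 | 47.9 | 43.5 | 24.7 | 42.3 | 76.9 | 86.9 | 80.9 | 50.4 | 78.5 | 63.0 | 74.4 | 68.1 | 52.4 | 66.8 |
| GLIP + CAFR | Swin-L | 42.9 | 50.0 | 45.8 | 28.3 | 43.0 | 78.7 | 89.9 | 83.9 | 53.5 | 80.5 | 63.6 | 77.3 | 70.5 | 50.0 | 69.8 |
| GDino | Swin-T | 44.8 | 56.0 | 47.9 | 23.5 | 47.8 | 69.8 | 81.0 | 72.1 | 37.5 | 72.4 | 48.0 | 59.1 | 52.4 | 40.7 | 52.2 |
| GDino + CAFR | Swin-T | 48.5 | 60.7 | 51.9 | 28.7 | 49.7 | 72.3 | 82.7 | 74.5 | 35.6 | 74.7 | 50.7 | 60.7 | 55.3 | 45.0 | 55.2 |
| GDino | Swin-B | 58.7 | 70.9 | 63.1 | 23.6 | 62.3 | 79.9 | 90.5 | 84.6 | 54.8 | 81.5 | 68.6 | 80.6 | 75.1 | 55.7 | 73.0 |
| GDino + CAFR | Swin-B | 62.3 | 74.8 | 68.3 | 35.4 | 67.1 | 81.9 | 92.9 | 87.3 | 58.1 | 83.7 | 72.1 | 83.3 | 77.2 | 56.9 | 74.5 |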
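To make the effect of CAFR easier to read off, the short sketch below computes the mAP gain over each baseline on COD10K-D. The numbers are copied from the single-seed results in this README; the dictionary layout and the function name `cafr_gains` are illustrative choices, not part of the released code.

```python
# COD10K-D mAP (single seed) as (baseline, baseline + CAFR), copied from this README.
COD10K_D_MAP = {
    ("GLIP", "Swin-T"): (26.4, 28.8),
    ("GLIP", "Swin-L"): (40.2, 42.9),
    ("GDino", "Swin-T"): (44.8, 48.5),
    ("GDino", "Swin-B"): (58.7, 62.3),
}


def cafr_gains(table):
    """Return the absolute mAP gain of the CAFR variant over its baseline."""
    # Round to one decimal to match the precision reported in the table.
    return {key: round(with_cafr - base, 1) for key, (base, with_cafr) in table.items()}


if __name__ == "__main__":
    for (method, backbone), gain in cafr_gains(COD10K_D_MAP).items():
        print(f"{method} ({backbone}): +{gain} mAP with CAFR")
```

The same pattern applies to the NC4K-D and CAMO-D columns by swapping in their values.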

New datasets will be provided after the paper is published.

Legend:

  • B-Boxes: Bounding Boxes
  • Pos-Classes: Positive Classes
  • Tr-Samples: Training Samples
  • Te-Samples: Test Samples
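
| Dataset Type | Datasets | B-Boxes (Box) | Pos-Classes (Category) | Languages (Description) | Tr-Samples | Te-Samples |
|---|---|---|---|---|---|---|
| COD | CHAMELEON | - | - | - | - | 76 |
| COD | CAMO | - | 8 | - | 1,000 | 250 |
| COD | NC4K | - | - | - | - | 4,121 |
| COD | COD10K | 5,899 | 69 | - | 6,000 | 4,000 |
| RCOD | COD10K-D | 11,684 | 81 | 10,798 | 6,172 | 5,734 |
| RCOD | RCOD-D | 12,955 | 59 | 11,850 | 4,192 | 5,846 |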

Framework install

Our code is based on MMDetection, an open-source object detection toolbox built on PyTorch, which we adopt as our baseline framework. For convenience, we have uploaded the full MMDetection code together with our own. If an MMDetection environment is already configured on your server, you can download the code and use it directly.

Our environmental installation

  • Linux with Python >= 3.10
  • conda create -n RCOD python==3.10
  • conda activate RCOD
  • PyTorch >= 2.1.1 & torchvision that matches the PyTorch version.
  • Our CUDA is 11.8
  • Install PyTorch 2.1.1 with CUDA 11.8
    conda install pytorch==2.1.1 torchvision==0.16.1 torchaudio==2.1.1 pytorch-cuda=11.8 -c pytorch -c nvidia
    
  • pip install "mmcv>=2.2.0"
  • pip install -r requirements/build.txt
  • pip install -v -e .
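
After installation, a quick check that the key packages resolve to the listed versions can save a failed training launch. The sketch below uses only the Python standard library; the function name `installed_versions` and the `LISTED` dictionary are illustrative, and it assumes the packages expose standard distribution metadata under the names `torch`, `torchvision`, and `mmcv`, which is usual but not guaranteed for every install method.

```python
from importlib import metadata

# Version constraints as written in the installation list above.
LISTED = {"torch": "==2.1.1", "torchvision": "==0.16.1", "mmcv": ">=2.2.0"}


def installed_versions(packages=LISTED):
    """Map each package name to its installed version string, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    # Print installed versions next to the listed constraints for manual comparison.
    for name, version in installed_versions().items():
        print(f"{name}: installed={version}, listed{LISTED[name]}")
```

The script only reports versions side by side; it does not enforce the constraints.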

Training on CAFR

  • We provide one set of config files shared by the three datasets, so the number of categories and the dataset path in the config must be changed to match the dataset you train on. The files to modify are:
  RCOD/mmdet/datasets/coco.py  
  RCOD/configs/_base_/coco_detection.py
  • We use GLIP+APG as an example to show the training process:
    CUDA_VISIBLE_DEVICES=0,1,2,3 bash ./tools/dist_train.sh configs/glip/glip_swin_tiny_cafr.py 4 --work-dir /home/output
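
As a rough guide to the per-dataset changes mentioned above, the sketch below collects the values that have to agree before training. The category counts come from the dataset table in this README. Everything else is an assumption: in stock MMDetection 3.x the class names live in `METAINFO['classes']` of the COCO dataset class and the dataset path is the `data_root` variable of the base dataset config, but whether this repository keeps those names has not been verified here. The helper `dataset_settings`, the example path, and the placeholder class names are hypothetical.

```python
# Category counts taken from the dataset table in this README.
NUM_CATEGORIES = {"COD10K-D": 68, "NC4K-D": 37, "CAMO-D": 43}


def dataset_settings(name, data_root, classes):
    """Collect the values to edit by hand for one dataset.

    `data_root` and `classes` are supplied by the user; this helper only
    checks that the class list length matches the expected category count.
    """
    if name not in NUM_CATEGORIES:
        raise KeyError(f"unknown dataset: {name}")
    expected = NUM_CATEGORIES[name]
    if len(classes) != expected:
        raise ValueError(f"{name} expects {expected} classes, got {len(classes)}")
    return {
        "data_root": data_root,      # assumed name of the path variable in the base config
        "num_classes": expected,     # assumed name of the category-count setting
        "classes": tuple(classes),   # assumed to go into METAINFO['classes']
    }


if __name__ == "__main__":
    # Placeholder class names and a hypothetical path, for illustration only.
    placeholder_classes = [f"class_{i}" for i in range(43)]
    settings = dataset_settings("CAMO-D", "data/CAMO-D/", placeholder_classes)
    print(settings["data_root"], settings["num_classes"])
```

A mismatch between the class list and the category count is a common cause of shape errors at the start of training, which is why the helper checks it explicitly.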
    

Citation

If you use this toolbox or benchmark datasets in your research, please cite this project.

@article{rcod,
	title={Toward Realistic Camouflaged Object Detection: Benchmarks and Method},
	author={Xin, Zhimeng and Wu, Tianxu and Chen, Shiming and Ye, Shuo and Xie, Zijing and Zou, Yixiong and You, Xinge and Guo, Yufei},
	journal={arXiv preprint arXiv:2501.07297},
	year={2025}
}