C3Net: Context-Contrast Network for Camouflaged Object Detection
November 24, 2025
Baber Jan [1,2], Aiman H. El-Maleh [1], Abdul Jabbar Siddiqui [1], Abdul Bais [3], Saeed Anwar [4]
[1] King Fahd University of Petroleum and Minerals
[2] SDAIA-KFUPM Joint Research Center for Artificial Intelligence
[3] University of Regina
[4] University of Western Australia
Submitted to IEEE Transactions on Artificial Intelligence (TAI)
Overview
Camouflaged objects blend into their surroundings through similar colors, textures, and patterns, which makes them hard to detect: both traditional methods and foundation models fail on them. We identify six challenges: Intrinsic Similarity (IS), Edge Disruption (ED), Extreme Scale Variation (ESV), Environmental Complexities (EC), Contextual Dependencies (CD), and Salient-Camouflaged Object Disambiguation (SCOD). C3Net addresses all six with a dual-pathway architecture. The Edge Refinement Pathway (ERP) processes early features with gradient-initialized modules to recover boundaries. The Contextual Localization Pathway (CLP) processes deep features with Image-based Context Guidance (ICG), which suppresses intrinsic saliency without external models. The Attentive Fusion Module (AFM) combines both pathways through spatial gating. C3Net achieves an S-measure of 0.898 on COD10K, 0.904 on CAMO, and 0.913 on NC4K while maintaining efficient real-time processing.
Key Contributions
- Dual-pathway decoder separating edge refinement from contextual localization prevents signal dilution
- Image-based Context Guidance (ICG) for intrinsic saliency suppression without external models
- Gradient-initialized Edge Enhancement Modules preserving classical edge detection principles while enabling learning
- Attentive Fusion Module for synergistic pathway integration through spatial attention gating
Architecture
C3Net architecture with dual-pathway decoder: Edge Refinement Pathway (ERP) processes early features for boundaries, Contextual Localization Pathway (CLP) with ICG mechanism handles semantic understanding, and Attentive Fusion Module (AFM) combines both pathways
C3Net employs a dual-pathway decoder with three main components:
1. Edge Refinement Pathway (ERP)
Processes early encoder features (Stages 1 & 2) to recover precise object boundaries through cascaded Edge Enhancement Modules (EEMs). Each EEM employs multi-path convolutions initialized from Sobel and Laplacian operators, maintaining classical edge detection principles while adapting to camouflage patterns through learning.
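To make the idea of gradient-initialized filters concrete, here is a minimal NumPy sketch. It is not the code in models/edge_branch.py: the three-path layout, the per-channel averaging, and the names `init_edge_filters` and `conv2d_valid` are assumptions made for illustration. It shows what such filters respond to before any training has adapted them.

```python
import numpy as np

# Classical 3x3 edge operators, used here as initial filter weights.
SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float32)
SOBEL_Y = SOBEL_X.T.copy()
LAPLACIAN = np.array([[0, 1, 0], [1, -4, 1], [0, 1, 0]], dtype=np.float32)


def init_edge_filters(in_channels: int) -> np.ndarray:
    """Build a (3, in_channels, 3, 3) weight bank, one output path per operator.

    Each kernel is replicated across input channels and divided by the channel
    count, so the initial response is the operator applied to the channel mean.
    (Hypothetical layout; the repository may initialize its paths differently.)
    """
    kernels = np.stack([SOBEL_X, SOBEL_Y, LAPLACIAN])              # (3, 3, 3)
    bank = np.repeat(kernels[:, None, :, :], in_channels, axis=1)  # (3, C, 3, 3)
    return bank / in_channels


def conv2d_valid(features: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """Cross-correlation without padding, as computed by deep-learning conv layers."""
    out_ch, _, kh, kw = weights.shape
    _, h, w = features.shape
    out = np.zeros((out_ch, h - kh + 1, w - kw + 1), dtype=np.float32)
    for i in range(out.shape[1]):
        for j in range(out.shape[2]):
            patch = features[:, i:i + kh, j:j + kw]                # (C, kh, kw)
            out[:, i, j] = np.sum(weights * patch[None], axis=(1, 2, 3))
    return out


# A 2-channel feature map with a vertical step edge between columns 2 and 3.
features = np.zeros((2, 6, 6), dtype=np.float32)
features[:, :, 3:] = 1.0
responses = conv2d_valid(features, init_edge_filters(in_channels=2))
```

On this input the horizontal-gradient path fires along the step, the vertical-gradient path stays at zero, and the Laplacian path changes sign across the edge. In a trainable module these weights would only be the starting point.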
2. Contextual Localization Pathway (CLP)
Processes deep encoder features (Stages 23 & 24) through Semantic Enhancement Units (SEUs) and our novel Image-based Context Guidance (ICG) mechanism. The ICG performs:
- Appearance Analysis: Direct extraction from input image
- Guided Contrast Module (GCM): Foreground-background differentiation using spatial aggregation and global context
- Iterative Attention Gating: Progressive saliency suppression through two-stage refinement
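As a rough intuition for contrasting a region against its context, the sketch below compares a locally aggregated response with the global mean of a feature map. This is an illustrative assumption, not the GCM or ICG implementation: the box-filter aggregation, the absolute difference, and the names `local_mean` and `contrast_map` are all hypothetical.

```python
import numpy as np


def local_mean(feature: np.ndarray, k: int = 3) -> np.ndarray:
    """Average each position over its k x k neighbourhood (edge-replicated padding)."""
    pad = k // 2
    padded = np.pad(feature, pad, mode="edge")
    h, w = feature.shape
    out = np.zeros_like(feature, dtype=np.float64)
    for di in range(k):
        for dj in range(k):
            out += padded[di:di + h, dj:dj + w]
    return out / (k * k)


def contrast_map(feature: np.ndarray, k: int = 3) -> np.ndarray:
    """Absolute difference between the local aggregate and the global mean."""
    return np.abs(local_mean(feature, k) - feature.mean())


# Uniform background with one small region that responds differently.
feature = np.zeros((8, 8))
feature[3:5, 3:5] = 1.0
contrast = contrast_map(feature)
```

Positions inside the small region receive a higher contrast value than the uniform background, which is the kind of signal a foreground-background module could then refine.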
3. Attentive Fusion Module (AFM)
Synergistically combines ERP and CLP outputs through spatial attention gating, where contextual features spatially modulate edge information, emphasizing relevant boundaries while suppressing distractors.
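Spatial attention gating can be sketched in a few lines. The example below is a simplified illustration, not the code in models/fusion_head.py: it assumes the contextual pathway yields a single-channel logit map and that the gate is a plain sigmoid. The function name `spatial_gate_fusion` is hypothetical.

```python
import numpy as np


def sigmoid(x: np.ndarray) -> np.ndarray:
    return 1.0 / (1.0 + np.exp(-x))


def spatial_gate_fusion(edge_feat: np.ndarray, context_logits: np.ndarray) -> np.ndarray:
    """Scale edge features (C, H, W) by a single-channel gate (H, W) in [0, 1]."""
    gate = sigmoid(context_logits)       # close to 1 where context indicates an object
    return edge_feat * gate[None, :, :]  # broadcast the gate over all channels


edge_feat = np.ones((4, 2, 2))
context_logits = np.array([[10.0, -10.0],
                           [0.0, 0.0]])
fused = spatial_gate_fusion(edge_feat, context_logits)
```

Edge responses survive where the contextual logit is strongly positive, are almost removed where it is strongly negative, and are halved where the context is undecided.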
Performance
| Dataset | Sα ↑ | Fβw ↑ | Fβm ↑ | Eφ ↑ | MAE ↓ |
|---|---|---|---|---|---|
| COD10K | 0.898 | 0.851 | 0.859 | 0.961 | 0.0162 |
| CAMO | 0.904 | 0.889 | 0.896 | 0.951 | 0.0311 |
| NC4K | 0.913 | 0.895 | 0.903 | 0.958 | 0.0220 |
C3Net achieves state-of-the-art results on COD10K, outperforming the previous best by 2.6% in Sα and 2.3% in Eφ. It leads on CAMO with the best Sα (0.904) and MAE (0.0311), and maintains top performance on NC4K with an Sα of 0.913. Training runs for 200 epochs with a batch size of 128 on NVIDIA H100 GPUs, and inference runs in real time at 392×392 resolution.
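Of the reported metrics, MAE is the simplest to reproduce by hand: it is the mean absolute difference between the prediction map, normalized to [0, 1], and the binary ground truth. The sketch below shows that calculation for a single image. The function name is hypothetical and this is not the repository's utils/metrics.py.

```python
import numpy as np


def mean_absolute_error(pred: np.ndarray, gt: np.ndarray) -> float:
    """MAE between a prediction map in [0, 1] and a binary ground-truth mask."""
    pred = pred.astype(np.float64)
    gt = (gt > 0.5).astype(np.float64)  # accepts masks stored as 0/1 or 0/255
    return float(np.mean(np.abs(pred - gt)))


pred = np.array([[0.9, 0.1], [0.8, 0.0]])
gt = np.array([[1, 0], [1, 0]])
mae = mean_absolute_error(pred, gt)  # (0.1 + 0.1 + 0.2 + 0.0) / 4
```

A dataset-level score such as those in the table would then be the average of this per-image value over all test images.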
Visual comparison of C3Net with state-of-the-art methods on challenging COD cases. Each row exemplifies a specific challenge: (i) Intrinsic Similarity (IS), (ii) Edge Disruption (ED), (iii) Contextual Dependencies (CD), (iv) Multiple Instances, (v) Environmental Complexities (EC), (vi) Small Objects (ESV aspect), (vii) Large Objects (ESV aspect), and (viii) Salient-Camouflaged Object Disambiguation (SCOD). For each row, columns are: (a) Input Image, (b) OCENet, (c) BGNet, (d) ZoomNet, (e) SINetV2, (f) FSPNet, (g) FEDER, (h) C3Net (Ours), and (i) Ground Truth.
Getting Started
Prerequisites
- Python 3.8+
- PyTorch 2.0+
- CUDA 11.8+
Installation
- Clone the repository:
git clone https://github.com/Baber-Jan/C3Net.git
cd C3Net
- Run the setup script which handles environment creation and dataset validation:
chmod +x setup/setup.sh
./setup/setup.sh
The setup script:
- Creates a conda environment with the required dependencies
- Validates the dataset organization, if datasets are present
- Generates edge maps for the training datasets, if needed
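For readers curious how an edge map can be derived from a ground-truth mask, here is one common approach: keep the foreground pixels that touch the background. This is a sketch under that assumption. The actual utils/edge_generator.py may use a different method or edge thickness, and `mask_to_edge` is a hypothetical name.

```python
import numpy as np


def mask_to_edge(mask: np.ndarray) -> np.ndarray:
    """Return the one-pixel inner boundary of a binary mask.

    A foreground pixel is an edge pixel if any of its 4-neighbours is
    background (pixels outside the image count as background).
    """
    fg = mask > 0
    padded = np.pad(fg, 1, mode="constant", constant_values=False)
    interior = (
        padded[:-2, 1:-1] & padded[2:, 1:-1]    # up and down neighbours
        & padded[1:-1, :-2] & padded[1:-1, 2:]  # left and right neighbours
    )
    return (fg & ~interior).astype(np.uint8)


mask = np.zeros((5, 5), dtype=np.uint8)
mask[1:4, 1:4] = 255  # a 3x3 square object
edge = mask_to_edge(mask)
```

For the 3x3 square, the result marks the eight border pixels and leaves the centre pixel unmarked.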
Alternatively, manually create the environment:
conda env create -f setup/environment.yml
conda activate c3net
Project Structure
C3Net/
├── configs/
│   └── default.yaml  # Training configuration
├── checkpoints/  # Model weights
│   └── model_best.pth  # Pre-trained C3Net model
├── datasets/  # Dataset directory
│   ├── COD10K/
│   ├── CAMO/
│   └── NC4K/
├── models/
│   ├── c3net.py  # Main C3Net architecture
│   ├── backbone.py  # DINOv2 encoder
│   ├── edge_branch.py  # Edge Refinement Pathway (ERP)
│   ├── localization_branch.py  # Contextual Localization Pathway (CLP)
│   ├── fusion_head.py  # Attentive Fusion Module (AFM)
│   └── utils.py  # Model utilities
├── utils/
│   ├── data_loader.py  # Dataset loading
│   ├── distributed_data_loader.py  # Multi-GPU data loading
│   ├── distributed_utils.py  # Distributed training utilities
│   ├── edge_generator.py  # Edge map generation
│   ├── image_processor.py  # Image preprocessing
│   ├── loss_functions.py  # Multi-scale loss functions
│   ├── metrics.py  # Evaluation metrics
│   ├── run_manager.py  # Result directory management
│   └── visualization.py  # Result visualization
├── engine/
│   ├── trainer.py  # Training engine
│   ├── evaluator.py  # Evaluation engine
│   ├── predictor.py  # Prediction engine
│   └── distributed_trainer.py  # Multi-GPU support
├── setup/
│   ├── environment.yml  # Conda environment
│   └── setup.sh  # Setup script
├── slurm_scripts/
│   └── train_multi_gpu.slurm  # SLURM cluster training script
└── main.py  # Entry point
Dataset Preparation
Download our COD datasets package containing CAMO, COD10K, and NC4K:
- Download Datasets (Google Drive) (~2.19GB)
After downloading, extract the datasets to the datasets/ folder in your C3Net root directory:
unzip datasets.zip -d C3Net/
Expected directory structure:
C3Net/
├── configs/
├── datasets/  # Extract datasets here
│   ├── CAMO/
│   │   ├── train/
│   │   │   ├── Imgs/  # Training images
│   │   │   ├── GT/  # Ground truth masks
│   │   │   └── Edges/  # Edge maps (auto-generated)
│   │   └── test/
│   │       ├── Imgs/
│   │       └── GT/
│   ├── COD10K/
│   │   ├── train/
│   │   │   ├── Imgs/
│   │   │   ├── GT/
│   │   │   └── Edges/
│   │   └── test/
│   │       ├── Imgs/
│   │       └── GT/
│   └── NC4K/
│       └── test/
│           ├── Imgs/
│           └── GT/
├── checkpoints/
└── ...
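If you prefer to check the layout yourself before training, a short standalone script along these lines will do. It is a convenience sketch, not part of the repository: the `EXPECTED` table and `missing_dirs` helper are hypothetical names, and the Edges/ folders are left out because they are auto-generated.

```python
from pathlib import Path

# Sub-directories expected for each dataset, taken from the tree above.
EXPECTED = {
    "CAMO": ["train/Imgs", "train/GT", "test/Imgs", "test/GT"],
    "COD10K": ["train/Imgs", "train/GT", "test/Imgs", "test/GT"],
    "NC4K": ["test/Imgs", "test/GT"],
}


def missing_dirs(root: str) -> list:
    """Return the expected dataset directories that do not exist under root."""
    base = Path(root)
    return [
        f"{name}/{sub}"
        for name, subs in EXPECTED.items()
        for sub in subs
        if not (base / name / sub).is_dir()
    ]


if __name__ == "__main__":
    problems = missing_dirs("datasets")
    print("Layout OK" if not problems else "Missing: " + ", ".join(problems))
```

Run it from the C3Net root directory so that the relative path `datasets` resolves correctly.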
Pre-trained Models
- C3Net Model Checkpoint: Download (Google Drive)
  - Trained C3Net model for evaluation/inference
  - Place in: checkpoints/model_best.pth
Test Predictions
The following are the segmentation masks generated by C3Net (as reported in the paper) for each test dataset:
- CAMO (250 predictions): Download (Google Drive)
- COD10K (2,026 predictions): Download (Google Drive)
- NC4K (4,121 predictions): Download (Google Drive)
Usage
Configuration
All training settings are specified in configs/default.yaml. Review and modify the hyperparameters, loss weights, and model settings as needed.
Training
Train C3Net on COD10K + CAMO:
python main.py train --config configs/default.yaml
Training features:
- Automatic mixed precision (AMP) for efficient GPU utilization
- Multi-scale supervision with pathway-specific objectives
- Regular checkpointing and validation
- Early stopping with configurable patience
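Early stopping with a patience setting generally follows the pattern sketched below. This is a generic illustration, not the logic in engine/trainer.py: the class name, the `min_delta` parameter, and the assumption that a higher validation score is better are all hypothetical.

```python
class EarlyStopping:
    """Stop when the monitored score has not improved for `patience` checks."""

    def __init__(self, patience: int = 10, min_delta: float = 0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("-inf")
        self.bad_checks = 0

    def step(self, score: float) -> bool:
        """Record a validation score (higher is better); return True to stop."""
        if score > self.best + self.min_delta:
            self.best = score
            self.bad_checks = 0
        else:
            self.bad_checks += 1
        return self.bad_checks >= self.patience


stopper = EarlyStopping(patience=2)
history = [0.80, 0.85, 0.84, 0.85, 0.83]  # made-up validation scores
stopped_at = next((i for i, s in enumerate(history) if stopper.step(s)), None)
```

With a patience of 2, training in this toy run would stop at the fourth validation check (index 3), because neither of the two checks after the 0.85 peak improves on it.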
Evaluation
Evaluate on test datasets:
python main.py evaluate --model checkpoints/model_best.pth
Evaluation provides:
- Standard COD metrics (Sα, Fβw, Fβm, Eφ, MAE)
- Quality-based result categorization
- Per-dataset performance analysis
Prediction
For inference on new images:
python main.py predict \
--model checkpoints/model_best.pth \
--input path/to/image
The prediction outputs include:
- Binary segmentation mask
- Confidence heatmap
- Edge map
- Original image overlay
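The relationship between the confidence heatmap, the binary mask, and the overlay can be illustrated as follows. The threshold of 0.5, the red tint, and the blending factor are assumptions chosen for the example, and `postprocess` is a hypothetical helper rather than the repository's predictor.

```python
import numpy as np


def postprocess(confidence: np.ndarray, image: np.ndarray,
                threshold: float = 0.5, alpha: float = 0.5):
    """Derive a binary mask and a red-tinted overlay from a confidence map.

    confidence: (H, W) floats in [0, 1]; image: (H, W, 3) uint8 RGB.
    """
    mask = (confidence >= threshold).astype(np.uint8) * 255
    overlay = image.astype(np.float32)
    red = np.array([255.0, 0.0, 0.0], dtype=np.float32)
    selected = mask > 0
    # Blend the tint into the detected pixels only.
    overlay[selected] = (1 - alpha) * overlay[selected] + alpha * red
    return mask, overlay.round().astype(np.uint8)


confidence = np.array([[0.9, 0.2], [0.6, 0.4]])
image = np.full((2, 2, 3), 100, dtype=np.uint8)
mask, overlay = postprocess(confidence, image)
```

Pixels below the threshold keep their original color in the overlay, while detected pixels are blended toward red.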
Distributed Training (Multi-GPU)
For multi-GPU training using PyTorch DDP:
# Single-node multi-GPU (e.g., 8 GPUs)
python -m torch.distributed.run \
--nproc_per_node=8 \
--master_port=29500 \
engine/distributed_trainer.py \
--config configs/default.yaml
SLURM Cluster:
sbatch slurm_scripts/train_multi_gpu.slurm
Customize the SLURM script for your cluster (partition name, GPU type, paths).
Hardware Requirements
C3Net has been tested on:
- NVIDIA H100 GPU (optimal setup)
- CUDA 11.8 with PyTorch 2.1.0
- Recommended: 16GB+ GPU memory for training
The model supports various GPU configurations through adjustable batch sizes and memory optimization settings.
Citation
If you find C3Net useful in your research, please cite:
@article{jan2025c3net,
title={C3Net: Context-Contrast Network for Camouflaged Object Detection},
author={Jan, Baber and El-Maleh, Aiman H. and Siddiqui, Abdul Jabbar and Bais, Abdul and Anwar, Saeed},
eprint={2511.12627},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2511.12627},
}
Acknowledgments
This research was conducted at the SDAIA-KFUPM Joint Research Center for Artificial Intelligence, King Fahd University of Petroleum and Minerals. We gratefully acknowledge:
- King Fahd University of Petroleum and Minerals (KFUPM) for institutional support
- SDAIA-KFUPM Joint Research Center for Artificial Intelligence (JRCAI) for computational resources
- DINOv2 for robust vision transformer features
- DySample for content-adaptive upsampling
- PySODMetrics for evaluation framework
- Authors of COD benchmark datasets (COD10K, CAMO, NC4K)
License
This project is licensed under the MIT License - see LICENSE for details.
Usage Terms
- Open source for research and non-commercial use
- Commercial use requires explicit permission
- Attribution required when using or adapting the code
The benchmark datasets (COD10K, CAMO, NC4K) maintain their original licenses. Please refer to their respective papers for terms of use.
Contact
For questions or issues:
- GitHub Issues: Create an issue
- Email: baberjan008@gmail.com
- Research Collaborations: Contact Baber Jan at baberjan008@gmail.com
Note: This repository contains the official implementation of C3Net submitted to IEEE Transactions on Artificial Intelligence (TAI).