DSA^2 F: Deep RGB-D Saliency Detection with Depth-Sensitive Attention and Automatic Multi-Modal Fusion (CVPR'2021, Oral)

June 7, 2021

This repo is the official implementation of "DSA^2 F: Deep RGB-D Saliency Detection with Depth-Sensitive Attention and Automatic Multi-Modal Fusion"

by Peng Sun, Wenhu Zhang, Huanyu Wang, Songyuan Li, and Xi Li.

Prerequisites

  • Ubuntu 18
  • PyTorch 1.7.0
  • CUDA 10.1
  • cuDNN 7.5.1
  • Python 3.7
  • NumPy 1.17.3
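As a quick sanity check before training, the versions listed above can be compared against the local environment. The snippet below is only a sketch: the helper names `installed_version` and `find_mismatches` are hypothetical and not part of this repo, and it assumes each package exposes a `__version__` attribute (true for PyTorch and NumPy). It does not check CUDA or cuDNN.

```python
import importlib
import sys

# Versions taken from the Prerequisites list (import name -> listed version).
EXPECTED = {"torch": "1.7.0", "numpy": "1.17.3"}
EXPECTED_PYTHON = (3, 7)


def installed_version(module_name):
    """Return the module's __version__ string, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", None)


def find_mismatches(expected, lookup=installed_version):
    """Return {name: (listed, found)} for every package whose version differs."""
    mismatches = {}
    for name, wanted in expected.items():
        found = lookup(name)
        # Drop local build tags such as "+cu101" before comparing.
        base = found.split("+")[0] if found is not None else None
        if base != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    if sys.version_info[:2] != EXPECTED_PYTHON:
        print("Python", sys.version.split()[0], "differs from the listed 3.7")
    for name, (wanted, found) in find_mismatches(EXPECTED).items():
        print(f"{name}: listed {wanted}, found {found}")
```

A mismatch is not necessarily a problem; the list above only records the environment the code was developed with.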

Training

Please see launch_pretrain.sh and launch_train.sh for ImageNet pretraining and SOD training, respectively.

Testing

Please see launch_test.sh for testing on the SOD benchmarks.

Main Results

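| Dataset  | E_r   | S_λ^mean | F_β^mean | M     |
|----------|-------|----------|----------|-------|
| DUT-RGBD | 0.950 | 0.921    | 0.926    | 0.030 |
| NJUD     | 0.923 | 0.903    | 0.901    | 0.039 |
| NLPR     | 0.950 | 0.918    | 0.897    | 0.024 |
| SSD      | 0.904 | 0.876    | 0.852    | 0.045 |
| STEREO   | 0.933 | 0.904    | 0.898    | 0.036 |
| LFSD     | 0.923 | 0.882    | 0.882    | 0.054 |
| RGBD135  | 0.962 | 0.920    | 0.896    | 0.021 |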

Saliency maps and Evaluation

All of the saliency maps mentioned in the paper are available on Google Drive or BaiduYun (code: juc2).

You can use the toolbox provided by jiwei0921 for evaluation.
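For readers who want to spot-check a number without the toolbox, the sketch below computes a mean absolute error between predicted saliency maps and ground-truth masks. It assumes that the M column in the results tables is MAE, as is common in salient object detection benchmarks, and that maps are already normalized to [0, 1] and resized to the ground-truth resolution. The function names are hypothetical; this is not the toolbox's implementation, and its preprocessing may differ.

```python
def mean_absolute_error(prediction, ground_truth):
    """MAE between two equally sized 2-D maps (lists of rows) with values in [0, 1]."""
    if len(prediction) != len(ground_truth):
        raise ValueError("maps must have the same number of rows")
    total = 0.0
    count = 0
    for pred_row, gt_row in zip(prediction, ground_truth):
        if len(pred_row) != len(gt_row):
            raise ValueError("rows must have the same length")
        for p, g in zip(pred_row, gt_row):
            total += abs(p - g)
            count += 1
    if count == 0:
        raise ValueError("maps must not be empty")
    return total / count


def dataset_mae(pairs):
    """Average the per-image MAE over an iterable of (prediction, ground_truth) pairs."""
    scores = [mean_absolute_error(pred, gt) for pred, gt in pairs]
    if not scores:
        raise ValueError("no image pairs given")
    return sum(scores) / len(scores)


# Toy 2x2 example: two pixels match exactly, two are off by 0.5.
pred = [[1.0, 0.0], [0.5, 0.5]]
mask = [[1.0, 0.0], [1.0, 0.0]]
print(mean_absolute_error(pred, mask))  # 0.25
```

In practice the maps would be loaded from image files and divided by 255 first; plain nested lists are used here only to keep the example dependency-free.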

Additionally, we provide the saliency maps for the STERE-1000 and SIP datasets on BaiduYun (code: qxfw) for easy comparison.

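| Dataset    | E_r   | S_λ^mean | F_β^mean | M     |
|------------|-------|----------|----------|-------|
| STERE-1000 | 0.928 | 0.897    | 0.895    | 0.038 |
| SIP        | 0.908 | 0.861    | 0.868    | 0.057 |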

Citation

@inproceedings{Sun2021DeepRS,
  title={Deep RGB-D Saliency Detection with Depth-Sensitive Attention and Automatic Multi-Modal Fusion},
  author={Peng Sun and Wenhu Zhang and Huanyu Wang and Songyuan Li and Xi Li},
  booktitle={IEEE Conf. Comput. Vis. Pattern Recog.},
  year={2021}
}

License

The code is released under the MIT License (see the LICENSE file for details).