July 19, 2026

Mask-DiFuser: A Masked Diffusion Model for Unified Unsupervised Image Fusion

Linfeng Tang¹,*, Chunyu Li¹,*, Jiayi Ma¹,†

¹Wuhan University
*Equal Contribution · †Corresponding Author

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) · 2026

Google Scholar · 51 citations · updated July 18, 2026

✨ News:

[2026-06-02] Our paper DSPFusion: Image Fusion via Degradation and Semantic Dual-Prior Guidance has been officially accepted by IEEE Transactions on Image Processing (IEEE TIP)! [Paper] [arXiv] [Code]
[2026-02-21] Our paper VideoFusion: A Spatio-Temporal Collaborative Network for Multi-modal Video Fusion has been officially accepted by The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2026)! [Paper] [arXiv] [Code]
[2025-09-18] Our paper ControlFusion: A Controllable Image Fusion Framework with Language-Vision Degradation Prompts has been officially accepted by Advances in Neural Information Processing Systems (NeurIPS 2025)! [Paper] [Code]
[2025-09-10] Our paper Mask-DiFuser: A Masked Diffusion Model for Unified Unsupervised Image Fusion has been officially accepted by IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE TPAMI)! [Paper] [Code]
[2025-03-15] Our paper C2RF: Bridging Multi-modal Image Registration and Fusion via Commonality Mining and Contrastive Learning has been officially accepted by the International Journal of Computer Vision (IJCV)! [Paper] [Code]
[2025-02-11] We released a large-scale dataset for infrared and visible video fusion: M3SVD: Multi-Modal Multi-Scene Video Dataset.

🔎 Method Overview

Comparison of various fusion schemes.

Framework of Mask-DiFuser.

Vanilla masking scheme vs. our dual masking scheme.

⚙️ Installation

```bash
# Clone this repository
git clone https://github.com/Linfeng-Tang/Mask-DiFuser.git
cd Mask-DiFuser

# Create an environment with Python >= 3.8
conda create -n mask-difuser python=3.8
conda activate mask-difuser
pip install -r requirements.txt
```
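
After installation, a quick sanity check of the environment can save a failed run later. The following is a minimal sketch, not part of the repository: the helper names `python_version_ok` and `missing_packages` are hypothetical, and `torch` is only an assumed dependency (the actual list lives in `requirements.txt`).

```python
import importlib.util
import sys

MIN_PYTHON = (3, 8)  # minimum version stated in the installation comment


def python_version_ok(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


def missing_packages(names):
    """Return the top-level package names that cannot be located for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Python OK:", python_version_ok())
    # "torch" is an assumption; replace it with the packages in requirements.txt.
    print("Missing packages:", missing_packages(["torch"]))
```

Run it inside the activated `mask-difuser` environment; an empty "Missing packages" list means the assumed dependencies can be imported.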

🚀 Inference

Step 1: Download the pretrained Mask-DiFuser model from Baidu Drive or Google Drive, and place the weights in `checkpoint/`.

Step 2: Run the inference command

```bash
python test.py --pretrained_path ./checkpoint/model.pt --task_type VIF --dirA ./dataset/MSRS/ir --dirB ./dataset/MSRS/vi --output_path ./Fusion/MSRS --gpu_ids 0
```
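
When running inference on several datasets, it may help to assemble the command in Python. The sketch below uses only the flags from the command above; the helper names are hypothetical, and the idea that the two source directories hold identically named image pairs is an assumption, not something this README states.

```python
import shlex
from pathlib import Path


def build_inference_command(pretrained_path, task_type, dir_a, dir_b,
                            output_path, gpu_ids="0"):
    """Assemble the test.py call using only the flags shown in the README."""
    return [
        "python", "test.py",
        "--pretrained_path", str(pretrained_path),
        "--task_type", str(task_type),
        "--dirA", str(dir_a),
        "--dirB", str(dir_b),
        "--output_path", str(output_path),
        "--gpu_ids", str(gpu_ids),
    ]


def unpaired_names(names_a, names_b):
    """Return the file names that appear in only one of the two inputs, sorted."""
    return sorted(set(names_a) ^ set(names_b))


def list_file_names(directory):
    """List the names of regular files directly inside `directory`."""
    return [p.name for p in Path(directory).iterdir() if p.is_file()]


cmd = build_inference_command(
    "./checkpoint/model.pt", "VIF",
    "./dataset/MSRS/ir", "./dataset/MSRS/vi", "./Fusion/MSRS",
)
print(shlex.join(cmd))  # reproduces the README command
```

From the repository root, `subprocess.run(cmd, check=True)` would launch the run. If the pairing assumption holds for your data, `unpaired_names(list_file_names(dir_a), list_file_names(dir_b))` should return an empty list before you start.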

🔥 Train

Step 1: Pretrained models and training data

Please download the DIV2K dataset from the official DIV2K website and organize it as follows:

```
/dataset/DIV2K/
├── train/
│   ├── 0001.png
│   ├── 0002.png
│   └── ...
└── val/
    ├── 0001.png
    ├── 0002.png
    └── ...
```
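
The layout above can be checked before training starts. This is a minimal sketch under the assumption that only the `train/` and `val/` folders with `.png` files matter; `count_pngs` is a hypothetical helper, and the demo builds a throwaway directory rather than touching real data.

```python
import tempfile
from pathlib import Path


def count_pngs(root, splits=("train", "val")):
    """Return {split: number of .png files}; a missing split folder maps to None."""
    root = Path(root)
    counts = {}
    for split in splits:
        split_dir = root / split
        if split_dir.is_dir():
            counts[split] = len(list(split_dir.glob("*.png")))
        else:
            counts[split] = None
    return counts


# Demo on a temporary directory that has train/ but no val/.
with tempfile.TemporaryDirectory() as tmp:
    demo_root = Path(tmp) / "DIV2K"
    (demo_root / "train").mkdir(parents=True)
    (demo_root / "train" / "0001.png").touch()
    demo_counts = count_pngs(demo_root)

print(demo_counts)
```

Calling `count_pngs("./dataset/DIV2K")` on the real dataset should report a positive count for both splits; a `None` entry points to a missing folder.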

Step 2: Run the training code

```bash
export OMP_NUM_THREADS=1
torchrun --nproc-per-node=4 train.py --dataset_path ./dataset/DIV2K --output_path ./result --gpu_ids 0,1,2,3
```
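
To train on a different number of GPUs, the process count and the GPU list presumably need to change together. The sketch below derives `--nproc-per-node` from `--gpu_ids`; that one-process-per-GPU relationship is an assumption inferred from the command above (4 processes, GPUs 0 to 3), and the helper names are hypothetical.

```python
import os
import shlex


def build_train_command(dataset_path, output_path, gpu_ids):
    """Build the torchrun call with one process per listed GPU (assumed)."""
    ids = [part.strip() for part in str(gpu_ids).split(",") if part.strip()]
    if not ids:
        raise ValueError("gpu_ids must list at least one GPU index")
    return [
        "torchrun", f"--nproc-per-node={len(ids)}",
        "train.py",
        "--dataset_path", str(dataset_path),
        "--output_path", str(output_path),
        "--gpu_ids", ",".join(ids),
    ]


def training_environment(base_env=None):
    """Copy the environment and set OMP_NUM_THREADS=1 as in the README."""
    env = dict(os.environ if base_env is None else base_env)
    env["OMP_NUM_THREADS"] = "1"
    return env


train_cmd = build_train_command("./dataset/DIV2K", "./result", "0,1,2,3")
print(shlex.join(train_cmd))  # reproduces the README command
```

`subprocess.run(train_cmd, env=training_environment(), check=True)` would then start training from the repository root; for two GPUs, pass `"0,1"` as `gpu_ids`.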

📷 Results

Visual comparison of infrared-visible image fusion results for night scenes on the MSRS dataset

Visual comparison of infrared-visible image fusion results on the RoadScene dataset

Visual comparison of multi-exposure image fusion results on the SICE dataset

Visual comparison of multi-exposure image fusion results on the MEFB dataset

Visual comparison of medical image fusion results on the Harvard dataset

Visual comparison of near-infrared and visible image fusion results on the Nirscene dataset

Visual comparison of multi-polarization fusion results on the Polarization dataset

Visual comparison of multi-focus image fusion results on the Lytro dataset

🕵️‍♂️ Detection

[Figure: detection results]

🎥 Segment

[Figure: segmentation results]

🎓 Citations

If our work is useful for your research, please consider citing it and giving us a star ⭐:

```bibtex
@article{Tang2026Mask-DiFuser,
  author={Tang, Linfeng and Li, Chunyu and Ma, Jiayi},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  title={Mask-DiFuser: A Masked Diffusion Model for Unified Unsupervised Image Fusion},
  year={2026},
  volume={48},
  number={1},
  pages={591--608},
}
```

🤝 Contact

Please feel free to contact us at linfeng0419@gmail.com or licy0089@gmail.com. We are happy to discuss the work and will maintain this repository in our free time.

❤️ Acknowledgments

Some of the code is adapted from CLEDiffusion and Stable-Diffusion. We thank the authors for their excellent work.
