
June 10, 2026

ControlFusion: A Controllable Image Fusion Framework with Language-Vision Degradation Prompts [NeurIPS 2025]

Linfeng Tang¹*, Yeda Wang¹*, Zhanchuan Cai², Junjun Jiang³, Jiayi Ma¹†

¹Wuhan University ²Macau University of Science and Technology ³Harbin Institute of Technology
*Equal Contribution †Corresponding Author

✨ News:

[2026-06-02] Our paper DSPFusion: Image Fusion via Degradation and Semantic Dual-Prior Guidance has been officially accepted by IEEE Transactions on Image Processing (IEEE TIP)! [Paper] [arXiv] [Code]
[2026-02-21] Our paper VideoFusion: A Spatio-Temporal Collaborative Network for Multi-modal Video Fusion has been officially accepted by The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2026)! [Paper] [arXiv] [Code]
[2025-09-18] Our paper ControlFusion: A Controllable Image Fusion Framework with Language-Vision Degradation Prompts has been officially accepted by Advances in Neural Information Processing Systems (NeurIPS 2025)! [Paper] [Code]
[2025-09-10] Our paper Mask-DiFuser: A Masked Diffusion Model for Unified Unsupervised Image Fusion has been officially accepted by IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE TPAMI)! [Paper] [Code]
[2025-03-15] Our paper C2RF: Bridging Multi-modal Image Registration and Fusion via Commonality Mining and Contrastive Learning has been officially accepted by the International Journal of Computer Vision (IJCV)! [Paper] [Code]
[2025-02-11] We released a large-scale dataset for infrared and visible video fusion: M3SVD: Multi-Modal Multi-Scene Video Dataset.

🔎 Method Overview

Motivation

[Figure: ControlFusion motivation]

Framework

[Figure: ControlFusion framework]

Frequency Domain Comparison

[Figure: frequency domain comparison]

🔧 Environment Setup

Clone this repository:

```
git clone https://github.com/Linfeng-Tang/ControlFusion.git
cd ControlFusion
```

Create a Conda environment (recommended):

```
conda create -n controlfusion python=3.8 -y
conda activate controlfusion
```

Install dependency packages:
```
pip install -r requirements.txt
```

📂 Dataset Construction

Please refer to genDateset.

📂 Dataset Download

Download the dataset from Google Drive.

📥 Pre-trained Weights

Download the pre-trained model Mask-DiFuser from Baidu Drive and place the weights in `pretrained_weights/`.
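Before running inference, it can help to confirm that the weights landed in the expected folder. The following is a minimal sketch, not part of the repository: the helper name `find_weights` and the checkpoint suffixes (`.pth`, `.pt`, `.ckpt`) are assumptions, so adjust them to match the file you actually downloaded.

```python
from pathlib import Path


def find_weights(weights_dir="pretrained_weights", suffixes=(".pth", ".pt", ".ckpt")):
    """Return checkpoint files found directly under weights_dir, sorted by name.

    Hypothetical helper: the suffix list is an assumption, not taken from the repo.
    """
    root = Path(weights_dir)
    if not root.is_dir():
        # The folder has not been created yet, so there is nothing to list.
        return []
    return sorted(p for p in root.iterdir() if p.is_file() and p.suffix in suffixes)


if __name__ == "__main__":
    found = find_weights()
    if found:
        for path in found:
            print(path)
    else:
        print("No checkpoint files found in pretrained_weights/")
```

Run it from the repository root; an empty result means the download is missing or sits in a different folder.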

🧪 Inference

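Use the provided test.py script to fuse pairs of images. Make sure you have downloaded the pre-trained weights first. To switch between text control and automatic control, edit the two assignments below in ControlFusion.py: the first takes the prompt features from the text input (text control), and the second overwrites them with imgfeature (automatic control), so keep only the assignment for the mode you want: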

```
text_features = self.get_text_feature(text.expand(b, -1)).to(inp_img_A.dtype)
text_features = imgfeature
```
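Instead of commenting lines in and out, the same choice could be expressed as an explicit switch. The sketch below only illustrates the idea: `select_prompt_features` and its `mode` argument are hypothetical names that do not exist in ControlFusion.py, and plain lists stand in for the feature tensors.

```python
def select_prompt_features(text_features, img_features, mode="text"):
    """Pick which features drive the fusion prompt.

    Hypothetical helper, not part of the repository:
    mode="text" returns the text-derived features (text control),
    mode="auto" returns the image-derived features (automatic control).
    """
    if mode == "text":
        return text_features
    if mode == "auto":
        return img_features
    raise ValueError(f"Unknown mode {mode!r}; expected 'text' or 'auto'")


if __name__ == "__main__":
    # Plain lists stand in for the real feature tensors.
    text_feat = [0.1, 0.2]
    img_feat = [0.3, 0.4]
    print(select_prompt_features(text_feat, img_feat, mode="auto"))
```

A switch like this keeps both code paths visible and makes the active mode a single argument rather than an edit to the source.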

🚂 Train

Use the provided train.py script to train the model. Make sure your training dataset is organized correctly first.
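One common organization problem in paired fusion datasets is a file that exists in one modality folder but not in the other. The sketch below checks for that, assuming the two modalities live in separate folders with matching file names. That layout and the helper name `unpaired_files` are assumptions for illustration; check the repository for the layout train.py actually expects.

```python
from pathlib import Path


def unpaired_files(dir_a, dir_b):
    """Return (only_in_a, only_in_b): file names present in one folder but not the other.

    Hypothetical helper: assumes paired images share the same file name.
    """
    names_a = {p.name for p in Path(dir_a).iterdir() if p.is_file()}
    names_b = {p.name for p in Path(dir_b).iterdir() if p.is_file()}
    return sorted(names_a - names_b), sorted(names_b - names_a)
```

Two empty lists mean every image in one folder has a same-named partner in the other.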

📷 Results

Visualization of fusion results in different degraded scenarios

[Figure: fusion results in different degraded scenarios]

Generalization results in the real world

[Figure: generalization results in the real world]

🕵️‍♂️ Detection

[Figure: detection results]

🎓 Citations

If our work is useful for your research, please consider citing it and giving us a star ⭐:

```
@inproceedings{Tang2025ControlFusion,
  author={Linfeng Tang and Yeda Wang and Zhanchuan Cai and Junjun Jiang and Jiayi Ma},
  title={ControlFusion: A Controllable Image Fusion Framework with Language-Vision Degradation Prompts},
  booktitle={Advances in Neural Information Processing Systems},
  year={2025}
}
```

🤝 Contact

Please feel free to contact us at linfeng0419@gmail.com or wangyeda@whu.edu.cn. We are happy to hear from you and will maintain this repository in our free time.
