SMSTracker: Tri-path Score Mask Sigma Fusion for Multi-Modal Tracking (ICCV 2025)
July 19, 2025
Official implementation of SMSTracker, including models and training and testing code.
Models & Raw Results (Google Drive)

Introduction
- We propose a novel tri-path score mask sigma fusion framework for multi-modal tracking, SMSTracker. It aims to effectively extract and fuse RGB features with features from other modalities under complex conditions, providing a reliable foundation for tracking applications.
- We design the SMF module to evaluate the reliability of features from each modality, which enables optimal exploitation of complementary features between modalities.
- We propose the SGI module to optimize feature interaction and fusion for facilitating feature sharing and refinement across the tri-path branches, thereby enhancing cross-feature integration.
- We introduce the DKF strategy to fine-tune the model, prevent overfitting from excessive information, address unequal data contribution, and improve the model's understanding of modal information.
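The reliability-weighted fusion idea behind the SMF module can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the function name `score_mask_fusion`, the per-location softmax weighting, and the NumPy feature layout are not the actual SMF implementation, which operates inside the tri-path network.

```python
# Minimal sketch (assumption, not the actual SMF module): fuse RGB and
# auxiliary-modality feature maps by weighting each spatial location with
# a softmax over per-modality reliability scores.
import numpy as np

def score_mask_fusion(rgb_feat, aux_feat, rgb_score, aux_score):
    """rgb_feat/aux_feat: (C, H, W) feature maps.
    rgb_score/aux_score: (H, W) reliability scores, one per location."""
    scores = np.stack([rgb_score, aux_score])            # (2, H, W)
    scores = scores - scores.max(axis=0, keepdims=True)  # numerical stability
    weights = np.exp(scores) / np.exp(scores).sum(axis=0, keepdims=True)
    # Broadcast (H, W) weights over the channel dimension of each modality.
    return weights[0] * rgb_feat + weights[1] * aux_feat

# Where the RGB reliability score dominates, RGB features dominate the fusion.
rgb = np.ones((4, 8, 8))
aux = np.zeros((4, 8, 8))
fused = score_mask_fusion(rgb, aux,
                          rgb_score=np.full((8, 8), 2.0),
                          aux_score=np.full((8, 8), 0.0))
```

With equal scores the two modalities contribute equally; as one score grows, its modality's features take over smoothly, which is the complementary-exploitation behavior the bullet above describes.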

Results
On RGBT Dataset

On RGBD Dataset

On RGBE Dataset

Usage
Installation
Create and activate a conda environment:
conda create -n SMSTracker python=3.8
conda activate SMSTracker
Install pytorch
conda install pytorch torchvision torchaudio cudatoolkit=11.8
Install Mamba
cd lib/models/layer/selective_scan && pip install . && cd ../../../..
Data preparation
Download the datasets and put them anywhere you like, then set the dataset paths in the config files:
- Change dataset.{LasHeR,VisEvent,DepthTrack}.{train,val,test}.path in ./lib/config/*.yaml to your dataset paths.
- Also update these entries in ./lib/config/*.yaml:
  - workspace.dir # the path where models and logs are saved
  - workspace.log_file # the log file name and path
  - test.checkpoint # the checkpoint file used for testing
  - analysis.* # the path where analysis results are saved (RGBT, RGBE)
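Putting the entries above together, an edited config might look like the fragment below. All paths are placeholders, and the exact key layout may differ from the shipped ./lib/config/*.yaml files:

```yaml
# Hypothetical example; adapt keys and paths to the actual config files.
dataset:
  LasHeR:
    train:
      path: /data/LasHeR/train
    test:
      path: /data/LasHeR/test
workspace:
  dir: ./output                  # models and logs are saved here
  log_file: ./output/train.log   # log file name and path
test:
  checkpoint: ./output/checkpoints/SMSTracker.pth  # checkpoint used for testing
```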
Training
Download the pretrained foundation model (OSTrack), put it under ./pretrained/, and update line 64 of ./train/*.py accordingly.
cd ./scripts/* # choose a training script
bash train.sh
You can train models with various modalities and variants by modifying ./config/*.yaml and ./train/*.py.
Testing
For RGB-D benchmarks
[DepthTrack Test set & VOT22_RGBD]
These two benchmarks are evaluated using VOT-toolkit.
You need to put the DepthTrack test set into ./Depthtrack_workspace/ and name it 'sequences'.
You need to download the corresponding test sequences into ./vot22_RGBD_workspace/.
bash eval_rgbd.sh
For RGB-T benchmarks
[LasHeR & RGBT234]
Modify the <DATASET_PATH> and <SAVE_PATH> in ./RGBT_workspace/test_rgbt_mgpus.py, then run:
bash eval_rgbt.sh
We refer you to the LasHeR Toolkit for LasHeR evaluation and to MPR_MSR_Evaluation for RGBT234 evaluation.
For RGB-E benchmark
[VisEvent]
Modify the <DATASET_PATH> and <SAVE_PATH> in ./RGBE_workspace/test_rgbe_mgpus.py, then run:
bash eval_rgbe.sh
We refer you to the VisEvent_SOT_Benchmark for evaluation.
Acknowledgment
- This repo is based on OSTrack, an excellent work.
- We thank the authors of the PyTracking library, which helped us quickly implement our ideas.
- We also thank the authors of ViPT and Sigma, which are excellent and inspiring works.