[CVPR 2026 Oral] SEATrack: Simple, Efficient, and Adaptive Multimodal Tracker
May 1, 2026
Paper | Models & Results: Google Drive / Baidu Drive / Hugging Face
News
- [May 1, 2026] The models, raw results, and training logs are now available on Hugging Face.
- [Apr 13, 2026] Code, models, and raw results are released.
Introduction
- A simple, unified multimodal tracking framework for RGB-T (thermal), RGB-D (depth), and RGB-E (event) tasks.
- Achieves strong performance across multiple multimodal benchmarks.
- Highly efficient: only 0.6M trainable parameters, running at 63.5 FPS.
- Highlights the importance of cross-modal alignment in multimodal tracking.
Results
Overall Performance
Visualization
Usage
Installation
conda env create -f environment.yaml
conda activate seatrack
Data Preparation
- LasHeR
- RGBT234 (qvsq)
- DepthTrack
- VOT22-RGBD
- VisEvent
Organize datasets as follows:
-- <DATA_PATH>
    -- DepthTrack/trainingset
        |-- adapter02_indoor
        |-- bag03_indoor
        |-- bag04_indoor
        ...
    -- LasHeR/trainingset
        |-- 1boygo
        |-- 1handsth
        ...
    -- VisEvent/trainingset
        |-- 00142_tank_outdoor2
        |-- 00143_tank_outdoor2
        ...
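Before training, it can help to confirm that the folders match the layout above. The following is a minimal sketch, not part of the SEATrack code base: the helper name `find_layout_problems` is hypothetical, and it only checks that each training-set directory exists and contains at least one sequence folder.

```python
from pathlib import Path

# Training-set directories taken from the layout shown above.
EXPECTED_DIRS = (
    "DepthTrack/trainingset",
    "LasHeR/trainingset",
    "VisEvent/trainingset",
)


def find_layout_problems(data_path):
    """Return a list of problems; an empty list means the layout looks right.

    Hypothetical helper for illustration, not shipped with SEATrack.
    """
    root = Path(data_path)
    problems = []
    for rel in EXPECTED_DIRS:
        folder = root / rel
        if not folder.is_dir():
            problems.append(f"missing directory: {folder}")
        elif not any(child.is_dir() for child in folder.iterdir()):
            # The directory exists but holds no sequence sub-folders.
            problems.append(f"no sequence folders in: {folder}")
    return problems
```

Calling `find_layout_problems("<DATA_PATH>")` with your own dataset root returns one message per missing or empty directory.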
Path Setting
cd <PATH_TO_SEATRACK>
python tracking/create_default_local_file.py \
    --workspace_dir . \
    --data_dir <DATA_PATH> \
    --save_dir ./output
Alternatively, edit the path files manually:
./lib/train/admin/local.py # paths for training
./lib/test/evaluation/local.py # paths for testing
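If you set up several machines, the path-setting command can be assembled from Python instead of typed by hand. This is a sketch under assumptions: the script name and the three flags come from the command above, while the helper names are hypothetical and the use of absolute paths is a choice made here (it assumes the script accepts them).

```python
import subprocess
import sys
from pathlib import Path


def build_local_file_command(workspace_dir, data_dir, save_dir):
    """Assemble the argument list for tracking/create_default_local_file.py.

    Paths are resolved against the current working directory so the
    generated local.py files do not depend on where they are read from.
    """
    return [
        sys.executable,
        "tracking/create_default_local_file.py",
        "--workspace_dir", str(Path(workspace_dir).resolve()),
        "--data_dir", str(Path(data_dir).resolve()),
        "--save_dir", str(Path(save_dir).resolve()),
    ]


def write_local_files(repo_root, data_dir, save_dir):
    """Run the script from the repository root; raise if it exits with an error."""
    cmd = build_local_file_command(repo_root, data_dir, save_dir)
    subprocess.run(cmd, cwd=repo_root, check=True)
```

`build_local_file_command` only builds the argument list, so you can print and inspect it before calling `write_local_files`.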
Training
Download the pretrained OSTrack weights and place them under:
./pretrained/vitb_256_mae_32x4_ep300
./pretrained/vitb_256_mae_ce_32x4_ep300
Then run:
bash train.sh
Testing
Set the checkpoint path in:
./lib/test/parameter/seatrack.py
RGB-D (DepthTrack & VOT22-RGBD)
Place datasets and the provided list.txt into:
./Depthtrack_workspace/sequences
./VOT22RGBD_workspace/sequences
Modify the paths in:
./Depthtrack_workspace/trackers.ini
./VOT22RGBD_workspace/trackers.ini
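The trackers.ini edits can also be scripted with Python's standard `configparser`. This is a sketch only: the section and option names you pass in depend on what the provided trackers.ini files actually contain, and the names in the example call are assumptions, not taken from the repository.

```python
import configparser


def set_tracker_option(ini_path, section, option, value):
    """Set one option in an INI file and write the file back in place."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names case-sensitive
    with open(ini_path, encoding="utf-8") as handle:
        parser.read_file(handle)
    if not parser.has_section(section):
        raise KeyError(f"section [{section}] not found in {ini_path}")
    parser.set(section, option, value)
    with open(ini_path, "w", encoding="utf-8") as handle:
        parser.write(handle)


# Example call (hypothetical section and option names):
# set_tracker_option("./Depthtrack_workspace/trackers.ini",
#                    "SEATrack", "paths", "/abs/path/to/SEATrack")
```

Note that `configparser` drops comments when it rewrites a file, so keep a backup if the original trackers.ini is annotated.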
Run the evaluation with the VOT Toolkit:
bash eval_rgbd.sh
RGB-T (LasHeR & RGBT234)
Modify <DATASET_PATH> and <SAVE_PATH> in:
./RGBT_workspace/test_rgbt_mgpus.py
Then run:
bash eval_rgbt.sh
Evaluation tools:
RGB-E (VisEvent)
Modify <DATASET_PATH> and <SAVE_PATH> in:
./RGBE_workspace/test_rgbe_mgpus.py
Run:
bash eval_rgbe.sh
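The `<DATASET_PATH>` and `<SAVE_PATH>` edits for ./RGBT_workspace/test_rgbt_mgpus.py and ./RGBE_workspace/test_rgbe_mgpus.py can be applied with a small script rather than by hand. The sketch below assumes both placeholders appear literally in those files. The helper names are hypothetical, and the function fails loudly if a placeholder is missing, so a script that was already edited is not silently skipped.

```python
from pathlib import Path


def fill_placeholders(text, replacements):
    """Replace each placeholder with its value; fail if a placeholder is absent."""
    for placeholder, value in replacements.items():
        if placeholder not in text:
            raise ValueError(f"{placeholder} not found")
        text = text.replace(placeholder, value)
    return text


def patch_script(script_path, dataset_path, save_path):
    """Rewrite a test script in place with concrete dataset and output paths."""
    script = Path(script_path)
    original = script.read_text(encoding="utf-8")
    patched = fill_placeholders(
        original,
        {"<DATASET_PATH>": dataset_path, "<SAVE_PATH>": save_path},
    )
    script.write_text(patched, encoding="utf-8")
```

For example, `patch_script("./RGBT_workspace/test_rgbt_mgpus.py", "/data/LasHeR", "./output/rgbt")` would fill in both paths, where the two path arguments are illustrative values.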
Evaluation:
Citation
If you find this work helpful, please consider citing:
@misc{su2026seatracksimpleefficientadaptive,
  title={SEATrack: Simple, Efficient, and Adaptive Multimodal Tracker},
  author={Junbin Su and Ziteng Xue and Shihui Zhang and Kun Chen and Weiming Hu and Zhipeng Zhang},
  year={2026},
  eprint={2604.12502},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2604.12502},
}
Acknowledgment
SEATrack uses code from a few open-source repositories. Without the efforts of these authors and their willingness to release their implementations, SEATrack would not be possible. We thank them for their work!
Contact
If you have any questions, feel free to contact: