🌊 [CVPR 2026 Oral] SEATrack: Simple, Efficient, and Adaptive Multimodal Tracker

May 1, 2026

📄 Paper | 📦 Models & Results: Google Drive / Baidu Drive / Hugging Face


🔥 News

  • [May 1, 2026] The models, raw results, and training logs are now available on Hugging Face 🤗.
  • [Apr 13, 2026] Code, models, and raw results are released.

🧠 Introduction

  • 🌊 A simple unified multimodal tracking framework for RGB-T, RGB-D, and RGB-E tasks.
  • 🚀 Achieves strong performance across multiple multimodal benchmarks.
  • ⚡ Highly efficient: only 0.6M trainable parameters and 63.5 FPS.
  • 🔍 Highlights the importance of cross-modal alignment in multimodal tracking.

📊 Results

Overall Performance

Visualization


โš™๏ธ Usage

๐Ÿ”ง Installation

conda env create -f environment.yaml
conda activate seatrack

📂 Data Preparation

Organize datasets as follows:

-- <DATA_PATH>
    -- DepthTrack/trainingset
        |-- adapter02_indoor
        |-- bag03_indoor
        |-- bag04_indoor
        ...
    -- LasHeR/trainingset
        |-- 1boygo
        |-- 1handsth
        ...
    -- VisEvent/trainingset
        |-- 00142_tank_outdoor2
        |-- 00143_tank_outdoor2
        ...
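
Before training, it can help to confirm that the layout above is in place. The following is a minimal sketch, not part of the SEATrack codebase: `check_dataset_layout` is a hypothetical helper that assumes only the three `trainingset` folders shown above and treats every immediate sub-directory as one sequence.

```python
from pathlib import Path

# Sub-directories taken from the layout above.
EXPECTED_SUBDIRS = (
    "DepthTrack/trainingset",
    "LasHeR/trainingset",
    "VisEvent/trainingset",
)


def check_dataset_layout(data_path):
    """Map each expected sub-directory to its sequence count.

    Hypothetical helper for illustration; it is not part of the SEATrack
    repository. A value of None means the sub-directory is missing.
    """
    root = Path(data_path)
    report = {}
    for subdir in EXPECTED_SUBDIRS:
        folder = root / subdir
        if folder.is_dir():
            # Every immediate sub-directory is counted as one sequence.
            report[subdir] = sum(1 for p in folder.iterdir() if p.is_dir())
        else:
            report[subdir] = None
    return report
```

Calling `check_dataset_layout("<DATA_PATH>")` returns a dictionary in which `None` marks a missing folder and `0` an empty one.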

🛠 Path Setting

cd <PATH_TO_SEATRACK>
python tracking/create_default_local_file.py \
  --workspace_dir . \
  --data_dir <DATA_PATH> \
  --save_dir ./output

Or manually modify:

./lib/train/admin/local.py # paths for training
./lib/test/evaluation/local.py # paths for testing
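
The script invocation above can also be assembled from Python, for example to pass absolute paths instead of relative ones. This is a minimal sketch: `build_path_setup_command` is a hypothetical helper, it uses only the script path and the three flags shown above, and it does not run anything itself.

```python
import sys
from pathlib import Path


def build_path_setup_command(workspace_dir, data_dir, save_dir):
    """Build the argument list for tracking/create_default_local_file.py.

    Hypothetical helper for illustration; only the script path and the three
    flags come from this README. Each path is resolved to absolute form.
    """
    return [
        sys.executable,
        "tracking/create_default_local_file.py",
        "--workspace_dir", str(Path(workspace_dir).resolve()),
        "--data_dir", str(Path(data_dir).resolve()),
        "--save_dir", str(Path(save_dir).resolve()),
    ]
```

From the repository root, the returned list can be handed to `subprocess.run(cmd, check=True)`.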

๐Ÿ‹๏ธ Training

Download pretrained OSTrack and place it under:

./pretrained/vitb_256_mae_32x4_ep300
./pretrained/vitb_256_mae_ce_32x4_ep300
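
A quick way to catch a misplaced download before launching training is sketched below. `missing_pretrained` is a hypothetical helper, not part of the repository. It checks only that the two paths listed above exist, as either a file or a folder, since this README does not say which they are.

```python
from pathlib import Path

# Paths copied from the list above, relative to the repository root.
PRETRAINED_PATHS = (
    "pretrained/vitb_256_mae_32x4_ep300",
    "pretrained/vitb_256_mae_ce_32x4_ep300",
)


def missing_pretrained(repo_root="."):
    """Return the expected pretrained paths that do not exist under repo_root.

    Hypothetical helper for illustration; an empty list means both are present.
    """
    root = Path(repo_root)
    return [p for p in PRETRAINED_PATHS if not (root / p).exists()]
```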

Then run:

bash train.sh

🧪 Testing

Set the checkpoint path in:

./lib/test/parameter/seatrack.py

RGB-D (DepthTrack & VOT22-RGBD)

Place datasets and the provided list.txt into:

./Depthtrack_workspace/sequences
./VOT22RGBD_workspace/sequences

Modify the paths in:

./Depthtrack_workspace/trackers.ini
./VOT22RGBD_workspace/trackers.ini

Run evaluation with VOT Toolkit:

bash eval_rgbd.sh

RGB-T (LasHeR & RGBT234)

Modify <DATASET_PATH> and <SAVE_PATH> in:

./RGBT_workspace/test_rgbt_mgpus.py

Then run:

bash eval_rgbt.sh

Evaluation tools:


RGB-E (VisEvent)

Modify <DATASET_PATH> and <SAVE_PATH> in:

./RGBE_workspace/test_rgbe_mgpus.py

Run:

bash eval_rgbe.sh

Evaluation:


📖 Citation

If you find this work helpful, please consider citing:

@misc{su2026seatracksimpleefficientadaptive,
      title={SEATrack: Simple, Efficient, and Adaptive Multimodal Tracker}, 
      author={Junbin Su and Ziteng Xue and Shihui Zhang and Kun Chen and Weiming Hu and Zhipeng Zhang},
      year={2026},
      eprint={2604.12502},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2604.12502}, 
}

๐Ÿ™ Acknowledgment

SEATrack uses code from a few open source repositories. Without the efforts of these folks (and their willingness to release their implementations), SEATrack would not be possible. We thank these authors for their efforts!


📬 Contact

If you have any questions, feel free to contact:

📧 binbing2024@outlook.com