[ICCV 2025] TESPEC: Temporally-Enhanced Self-Supervised Pretraining for Event Cameras
October 20, 2025
Official implementation of “Temporally-Enhanced Self-Supervised Pretraining for Event Cameras.”
Mohammad Mohammadi,
Ziyi Wu,
Igor Gilitschenski
🧩 TL;DR
We introduce a new recurrent pretraining paradigm with a specialized reconstruction target designed for event-camera data. Our pretraining method demonstrates state-of-the-art performance across five downstream datasets (MVSEC, DDD17, DSEC, Gen1, 1Mpx) in three perception tasks (semantic segmentation, object detection, and monocular depth estimation). The codebases and checkpoints for all three tasks are provided in this repository. For more information, please check out our paper.
If you find this work useful, please cite:
@article{mohammadi2025tespec,
title={TESPEC: Temporally-Enhanced Self-Supervised Pretraining for Event Cameras},
author={Mohammadi, Mohammad and Wu, Ziyi and Gilitschenski, Igor},
journal={arXiv preprint arXiv:2508.00913},
year={2025}
}
Overview
This repository provides the complete codebase and pretrained checkpoints used in our ICCV 2025 paper. It consists of four modular subprojects covering both pretraining and downstream tasks:
| Subproject | Description |
|---|---|
| `pretraining/` | Self-supervised pretraining pipeline (TESPEC) for event sequences |
| `semantic_segmentation/` | Semantic segmentation (DSEC + DDD17 datasets) |
| `depth_estimation/` | Monocular depth estimation (MVSEC dataset) |
| `object_detection/` | Object detection (Gen1 + 1Mpx datasets) |
Please refer to each subfolder for training and evaluation scripts.
⚙️ Installation
We recommend using Python 3.11 with CUDA ≥ 12.1.
# Clone the repository
git clone https://github.com/tisl-toronto/TESPEC.git
cd TESPEC
# Create and activate a Python 3.11 virtual environment
python3.11 -m venv tespec
source tespec/bin/activate
pip install torch==2.5.0 torchvision==0.20.0 --index-url https://download.pytorch.org/whl/cu121
pip install lightning==2.5.5 torchdata==0.9.0 hydra-core==1.3.2
pip install opencv-python==4.12.0.88 matplotlib==3.10.7 einops==0.8.1 timm==1.0.20 kornia==0.8.1
pip install h5py==3.15.1 hdf5plugin==6.0.0 pandas==2.3.3 scikit-learn==1.7.2 scikit-image==0.25.2 numba==0.62.1
pip install wandb==0.22.2 plotly==6.3.1 tabulate==0.9.0
pip install pycocotools==2.0.10 bbox-visualizer==0.2.2
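After running the commands above, it can be useful to confirm that the environment actually contains the pinned versions. The following is a minimal sketch and is not part of the repository: the helper name `check_pins` and the `PINNED` subset are assumptions made for illustration. It uses only the standard-library `importlib.metadata` module, and it ignores local build tags such as `+cu121` when comparing versions.

```python
from importlib.metadata import PackageNotFoundError, version

# A subset of the pins from the install commands above (illustrative, not the full list).
PINNED = {
    "torch": "2.5.0",
    "torchvision": "0.20.0",
    "lightning": "2.5.5",
    "hydra-core": "1.3.2",
    "timm": "1.0.20",
}


def check_pins(pins):
    """Return {package: (expected, installed)} for every pin that does not match.

    `installed` is None when the package is not installed at all.
    `check_pins` is a hypothetical helper, not something shipped with TESPEC.
    """
    mismatches = {}
    for name, expected in pins.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None
        # Strip local build tags (e.g. "2.5.0+cu121" -> "2.5.0") before comparing.
        if installed is None or installed.split("+")[0] != expected:
            mismatches[name] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    problems = check_pins(PINNED)
    for name, (expected, installed) in problems.items():
        print(f"{name}: expected {expected}, found {installed or 'not installed'}")
    if not problems:
        print("All checked packages match the pinned versions.")
```

Run the script inside the activated `tespec` virtual environment. An empty result means every package in `PINNED` matches its pin.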
💾 Checkpoints
| Model | Dataset | Task | Download |
|---|---|---|---|
| TESPEC_pretrained | 1Mpx | Self-Supervised Pretraining | Hugging Face |
| TESPEC-MVSEC | MVSEC | Depth Estimation | Hugging Face |
| TESPEC-DSEC | DSEC | Semantic Segmentation | Hugging Face |
| TESPEC-DDD17 | DDD17 | Semantic Segmentation | Hugging Face |
| TESPEC-Gen1 | Gen1 | Object Detection | Hugging Face |
| TESPEC-Gen4 | 1Mpx | Object Detection | Hugging Face |
To start training or evaluation, please refer to the corresponding subfolder.
Code Acknowledgments
We build upon several excellent open-source codebases: RVT, ESS, HMNet, MiDaS, and Timm.
📝 License
Released under the MIT License. © 2025 Toronto Intelligent Systems Lab, University of Toronto.
💬 Contact
For questions or issues, please open a GitHub Issue or contact: 📧 mohammadi@cs.toronto.edu