April 30, 2026
CoST: Efficient Collaborative Perception From Unified Spatiotemporal Perspective
CoST is an efficient collaborative perception framework that unifies spatial and temporal collaboration among connected agents to improve perception accuracy while reducing communication cost.
CoST aggregates observations from different agents and timestamps in a shared spatiotemporal space, which enables efficient feature transmission and unified spatiotemporal fusion.
📢 News
- [2025-08-01] 🔥 The CoST paper is released on arXiv.
- [2025-08-01] 🏆 CoST is accepted as an ICCV 2025 Highlight paper.
💡 Highlights
- Unified Spatiotemporal Perspective. CoST jointly models spatial collaboration across agents and temporal collaboration across frames in a single spatiotemporal space.
- Efficiency-Oriented Design. CoST reduces communication redundancy by avoiding repeated transmission of static object features while maintaining strong perception performance.
- Flexible and General Framework. CoST is compatible with many previous collaborative perception methods and can improve their accuracy while reducing transmission bandwidth.
- Multi-Dataset Validation. CoST is evaluated on V2V4Real, DAIR-V2X, and V2XSet, demonstrating strong generalization across scenarios.
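The efficiency point above (not re-sending features of static content) can be illustrated with a toy sketch. This is not CoST's algorithm: the function name `select_features_to_send`, the per-cell dictionary layout, and the tolerance `tol` are all assumptions made for illustration only. It simply shows the general idea of transmitting a feature only when it is new or has changed since the last transmission.

```python
def select_features_to_send(prev_sent, current, tol=1e-3):
    """Toy illustration (hypothetical helper, not part of the CoST codebase).

    Returns {cell_id: feature} for cells that are new or whose feature
    changed by more than `tol` in any component since the last transmission.
    Features are assumed to be non-empty lists of equal length.
    """
    to_send = {}
    for cell_id, feat in current.items():
        old = prev_sent.get(cell_id)
        if old is None or max(abs(a - b) for a, b in zip(feat, old)) > tol:
            to_send[cell_id] = feat
    return to_send


# Two consecutive frames: cell (0, 0) is static, (0, 1) changes, (1, 1) is new.
frame_t0 = {(0, 0): [0.5, 0.1], (0, 1): [0.2, 0.9]}
frame_t1 = {(0, 0): [0.5, 0.1], (0, 1): [0.7, 0.3], (1, 1): [0.4, 0.4]}

payload = select_features_to_send(frame_t0, frame_t1)
print(sorted(payload))  # [(0, 1), (1, 1)]
```

In this toy setting the static cell is skipped, so only two of the three cells are transmitted at the second frame. Please refer to the paper for how CoST actually selects and fuses features.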
🛠️ Usage
This codebase is built upon V2V4Real. Please follow the instructions below to set up the environment, prepare datasets, and run training or evaluation.
Installation
We recommend using Conda for environment management.
1. Create Conda Environment
conda create -n v2v4real python=3.7 -y
conda activate v2v4real
2. Install PyTorch
PyTorch >= 1.12.0 is required. Example for CUDA 11.3:
conda install pytorch==1.12.0 torchvision==0.13.0 cudatoolkit=11.3 -c pytorch -c conda-forge
3. Install spconv 2.x
pip install spconv-cu113
4. Install Other Dependencies
pip install -r requirements.txt
python setup.py develop
5. Build CUDA Extension for Bounding Box NMS
python opencood/utils/setup.py build_ext --inplace
6. Install Deformable Convolution
cd opencood/models/sub_modules/ops
sh make.sh
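After the steps above, a quick sanity check can confirm that the main packages import correctly. The following is a minimal sketch, not a script shipped with this repository: the helper name `check_environment` is hypothetical, and it only assumes that the packages are importable as `torch`, `torchvision`, and `spconv`.

```python
import importlib
import sys


def check_environment(packages=("torch", "torchvision", "spconv")):
    """Hypothetical helper: map each package name to its version, or None if missing."""
    report = {"python": sys.version.split()[0]}
    for name in packages:
        try:
            module = importlib.import_module(name)
        except Exception:  # ImportError, or a broken binary build
            report[name] = None
            continue
        report[name] = getattr(module, "__version__", "unknown")

    # CUDA availability is only meaningful if torch imported successfully.
    cuda_available = False
    if report.get("torch") is not None:
        import torch
        cuda_available = torch.cuda.is_available()
    report["cuda_available"] = cuda_available
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value if value is not None else 'NOT INSTALLED'}")
```

If `torch` reports a version below 1.12.0, or `cuda_available` is `False` on a GPU machine, revisit step 2 before building the CUDA extensions.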
Data Preparation
CoST currently supports the following datasets:
- V2V4Real
- DAIR-V2X
- V2XSet
Please download the data in OPV2V format from the project website or the corresponding dataset pages. After downloading, organize the data as follows:
├── v2v4real
│   ├── train
│   │   ├── testoutput_CAV_data_2022-03-15-09-54-40_1
│   ├── validate
│   ├── test
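Before training, it may help to verify that the folder layout matches the tree above. The sketch below is an illustration under assumptions: the helper name `summarize_dataset` is hypothetical, the root folder is assumed to be `v2v4real` relative to the working directory, and each immediate subdirectory of a split is assumed to be one scenario.

```python
from pathlib import Path


def summarize_dataset(root, splits=("train", "validate", "test")):
    """Hypothetical helper: count scenario folders per split (None if the split is missing)."""
    root = Path(root)
    summary = {}
    for split in splits:
        split_dir = root / split
        if not split_dir.is_dir():
            summary[split] = None
            continue
        # Each immediate subdirectory is treated as one recorded scenario.
        summary[split] = sum(1 for p in split_dir.iterdir() if p.is_dir())
    return summary


if __name__ == "__main__":
    # "v2v4real" is the root folder name from the tree above; adjust as needed.
    for split, count in summarize_dataset("v2v4real").items():
        status = "MISSING" if count is None else f"{count} scenario folder(s)"
        print(f"{split}: {status}")
```

A split reported as `MISSING` or with zero scenario folders usually means the archive was extracted to a different location than the one the configuration expects.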
Model Preparation
Pre-trained checkpoints and model zoo links are not provided yet. If checkpoints are released later, they could be placed in a checkpoints directory, for example:
checkpoints/
├── cost_v2v4real.pth
├── cost_dair_v2x.pth
└── cost_v2xset.pth
Training
Run the provided training script:
bash train.sh
Evaluation
Run the provided testing script:
bash test.sh
📝 Citation
If you find this work useful, please consider citing our paper:
@article{tang2025cost,
title = {CoST: Efficient Collaborative Perception From Unified Spatiotemporal Perspective},
author = {Tang, Zongheng and Liu, Yi and Sun, Yifan and Gao, Yulu and Chen, Jinyu and Xu, Runsheng and Liu, Si},
journal = {arXiv preprint arXiv:2508.00359},
year = {2025}
}
📄 License
This project is licensed under the Apache-2.0 License. See LICENSE for more information.
🙏 Acknowledgement
This project is built upon V2V4Real. We thank the authors and contributors of the open-source collaborative perception community.