April 30, 2026

CoST: Efficient Collaborative Perception From Unified Spatiotemporal Perspective

Zongheng Tang¹,², Yi Liu², Yifan Sun², Yulu Gao¹,², Jinyu Chen², Runsheng Xu³, Si Liu²

¹Hangzhou International Innovation Institute, Beihang University; ²School of Artificial Intelligence, Beihang University; ³University of California, Los Angeles

ICCV 2025 Highlight

CoST is an efficient collaborative perception framework that unifies spatial and temporal collaboration among connected agents to improve perception accuracy while reducing communication cost.

CoST aggregates observations from different agents and timestamps in a shared spatiotemporal space, enabling efficient feature transmission and unified spatiotemporal fusion.

📢 News

[2025-08-01] 🔥 The CoST paper is released on arXiv.
[2025-08-01] 🏆 CoST is accepted to ICCV 2025 as a Highlight paper.

💡 Highlights

Unified Spatiotemporal Perspective. CoST jointly models spatial collaboration across agents and temporal collaboration across frames in a unified spatiotemporal space.
Efficiency-Oriented Design. CoST reduces communication redundancy by avoiding repeated transmission of static object features while maintaining strong perception performance.
Flexible and General Framework. CoST can be combined with many existing collaborative perception methods, improving their accuracy while reducing transmission bandwidth.
Multi-Dataset Validation. CoST is evaluated on V2V4Real, DAIR-V2X, and V2XSet, demonstrating strong generalization across scenarios.
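
To give a feel for why avoiding repeated transmission of static object features can save bandwidth, here is a toy accounting sketch. It is not the CoST mechanism from the paper: the function name `transmitted_features`, the "send static features once" rule, and all numbers are assumptions made up for illustration.

```python
def transmitted_features(num_frames, static_objects, dynamic_objects, resend_static):
    """Count object features sent over num_frames under a toy model.

    Hypothetical model: each object contributes one feature per transmission.
    """
    if resend_static:
        # Per-frame baseline: every object is sent again in every frame.
        return num_frames * (static_objects + dynamic_objects)
    # Static objects are sent once; dynamic objects are sent every frame.
    return static_objects + num_frames * dynamic_objects


if __name__ == "__main__":
    # Made-up scene: 10 frames, 8 static objects, 2 dynamic objects.
    print(transmitted_features(10, 8, 2, resend_static=True))   # 100
    print(transmitted_features(10, 8, 2, resend_static=False))  # 28
```

In this toy setting the saving grows with the share of static objects and with the number of frames, which matches the intuition behind the efficiency-oriented design described above.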

🛠️ Usage

This codebase is built upon V2V4Real. Please follow the instructions below to set up the environment, prepare datasets, and run training or evaluation.

Installation

We recommend using Conda for environment management.

1. Create Conda Environment

conda create -n v2v4real python=3.7 -y
conda activate v2v4real

2. Install PyTorch

PyTorch >= 1.12.0 is required. Example for CUDA 11.3:

conda install pytorch==1.12.0 torchvision==0.13.0 cudatoolkit=11.3 -c pytorch -c conda-forge

3. Install spconv 2.x

pip install spconv-cu113

4. Install Other Dependencies

pip install -r requirements.txt
python setup.py develop

5. Build CUDA Extension for Bounding Box NMS

python opencood/utils/setup.py build_ext --inplace

6. Install Deformable Convolution

cd opencood/models/sub_modules/ops
sh make.sh
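
After the steps above, a quick sanity check can confirm that the main packages are importable. The script below is a minimal sketch and not part of the repository: the helper names `parse_version` and `check_modules` are hypothetical, and the module list (`torch`, `torchvision`, `spconv`, `opencood`) is an assumption based on the installation steps. It only uses the standard library so that it also runs on Python 3.7.

```python
import importlib.util
import re
import sys


def parse_version(text):
    """Turn a string like '1.12.0+cu113' into a tuple such as (1, 12, 0)."""
    parts = []
    for piece in text.split("+")[0].split(".")[:3]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    while len(parts) < 3:  # pad '1.12' to (1, 12, 0)
        parts.append(0)
    return tuple(parts)


def check_modules(names=("torch", "torchvision", "spconv", "opencood")):
    """Return {module name: True if it can be found, else False}."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    for name, found in check_modules().items():
        print(f"{name}: {'ok' if found else 'MISSING'}")
    if importlib.util.find_spec("torch") is not None:
        import torch

        ok = parse_version(torch.__version__) >= (1, 12, 0)
        print("torch", torch.__version__, "ok" if ok else "older than 1.12.0")
```

Run it inside the activated `v2v4real` environment. A `MISSING` entry suggests repeating the corresponding installation step.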

Data Preparation

CoST currently supports the following datasets:

V2V4Real
DAIR-V2X
V2XSet

Please download the data in OPV2V format from the project website or the corresponding dataset pages. After downloading, organize the data as follows:

├── v2v4real
│   ├── train
│   │   ├── testoutput_CAV_data_2022-03-15-09-54-40_1
│   ├── validate
│   ├── test
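
Before training, it can help to verify that the data follows the layout shown above. The following sketch is not part of the codebase: `summarize_dataset` is a hypothetical helper, and it assumes that each split folder (`train`, `validate`, `test`) directly contains one sub-folder per scenario.

```python
import sys
from pathlib import Path

EXPECTED_SPLITS = ("train", "validate", "test")


def summarize_dataset(root):
    """Return {split: number of scenario folders}, with None for a missing split."""
    root = Path(root)
    summary = {}
    for split in EXPECTED_SPLITS:
        split_dir = root / split
        if not split_dir.is_dir():
            summary[split] = None
            continue
        # Count only directories; stray files in the split folder are ignored.
        summary[split] = sum(1 for entry in split_dir.iterdir() if entry.is_dir())
    return summary


if __name__ == "__main__":
    # Dataset root defaults to ./v2v4real; pass another path as the first argument.
    dataset_root = sys.argv[1] if len(sys.argv) > 1 else "v2v4real"
    for split, count in summarize_dataset(dataset_root).items():
        status = "missing" if count is None else f"{count} scenario folder(s)"
        print(f"{split}: {status}")
```

A split reported as `missing` or with zero scenario folders usually means the archive was extracted to a different location than the one shown in the tree.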

Model Preparation

Pre-trained checkpoints and model zoo links are not currently provided in this README. If checkpoints are released later, they could be listed here, for example:

checkpoints/
├── cost_v2v4real.pth
├── cost_dair_v2x.pth
└── cost_v2xset.pth

Training

Run the provided training script:

bash train.sh

Evaluation

Run the provided testing script:

bash test.sh
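
Training and evaluation runs can take a long time, so keeping a copy of the console output is often useful. The wrapper below is a minimal sketch under that assumption. `run_and_log` and the `logs/` paths are hypothetical and not part of the repository; the wrapper just calls the same `bash train.sh` or `bash test.sh` commands shown above without adding any arguments.

```python
import subprocess
import sys
from pathlib import Path


def run_and_log(command, log_path):
    """Run a command, echo its output to the console, and save a copy to log_path.

    Returns the exit code of the command.
    """
    log_path = Path(log_path)
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with log_path.open("w") as log_file:
        process = subprocess.Popen(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge stderr into the same stream
            universal_newlines=True,   # decode output as text (works on Python 3.7)
        )
        for line in process.stdout:
            sys.stdout.write(line)
            log_file.write(line)
        return process.wait()


# Example usage (run from the repository root):
# exit_code = run_and_log(["bash", "train.sh"], "logs/train.log")
# exit_code = run_and_log(["bash", "test.sh"], "logs/test.log")
```

A non-zero return value indicates that the script failed, in which case the saved log is the first place to look.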

📝 Citation

If you find this work useful, please consider citing our paper:

@article{tang2025cost,
  title   = {CoST: Efficient Collaborative Perception From Unified Spatiotemporal Perspective},
  author  = {Tang, Zongheng and Liu, Yi and Sun, Yifan and Gao, Yulu and Chen, Jinyu and Xu, Runsheng and Liu, Si},
  journal = {arXiv preprint arXiv:2508.00359},
  year    = {2025}
}

📄 License

This project is licensed under the Apache-2.0 License. See LICENSE for more information.

🙏 Acknowledgement

This project is built upon V2V4Real. We thank the authors and contributors of the open-source collaborative perception community.