# HeCoFuse: Cross-Modal Complementary V2X Cooperative Perception with Heterogeneous Sensors
July 30, 2025
Official implementation of the CVPR 2025 paper:
HeCoFuse: Cross-Modal Complementary V2X Cooperative Perception with Heterogeneous Sensors
Chuheng Wei*, Ziye Qin, Walter Zimmer, Guoyuan Wu, and Matthew J. Barth
*Corresponding author: chuheng.wei@email.ucr.edu
1st Place, CVPR 2025 DriveX Challenge
## Introduction
HeCoFuse is a unified cooperative perception framework designed for real-world heterogeneous V2X systems, where vehicles and infrastructure may be equipped with different combinations of sensors (LiDAR-only, Camera-only, or both). Unlike previous methods assuming uniform sensor setups, HeCoFuse:
- Supports 9 heterogeneous sensor configurations across vehicle-infrastructure nodes
- Introduces Hierarchical Attention Fusion (HAF) to adaptively weight features by modality and spatial reliability
- Proposes Adaptive Spatial Resolution (ASR) to dynamically adjust feature resolution, reducing computation by up to 45%
- Uses a cooperative training strategy to generalize across diverse configurations
HeCoFuse achieves 43.37% 3D mAP under the L+LC setting and 43.22% under LC+LC, surpassing the CoopDet3D baseline.
## Installation
### Prerequisites
First install CoopDet3D and prepare its dataset:
- Install CoopDet3D: follow the installation guide in the [CoopDet3D repository](https://github.com/tum-traffic-dataset/coopdet3d).
- Set up the dataset: download and prepare the TUMTraf-V2X dataset according to the CoopDet3D requirements.
### HeCoFuse Installation
After successfully setting up CoopDet3D, install HeCoFuse components:
```bash
# Clone HeCoFuse repository
git clone https://github.com/your-repo/hecofuse.git
cd hecofuse
```
## Dataset: TUMTraf-V2X
We use the TUMTraf-V2X dataset, a real-world V2X dataset featuring:
- Synchronized vehicle and infrastructure data
- Multi-modal sensing (LiDAR & camera)
- 29K annotated 3D bounding boxes
- 8 object classes
**Note**: Please follow the [CoopDet3D installation guide](https://github.com/tum-traffic-dataset/coopdet3d) to install and set up the dataset.
---
## Usage
### 1. Configuration Setup
Add the HeCoFuse configuration file to the CoopDet3D configs directory:
```bash
# Copy HeCoFuse.yaml to the appropriate config directory
cp HeCoFuse.yaml /path/to/coopdet3d/configs/tumtraf_v2x/det/transfusion/secfpn/cooperative/camera+lidar/yolov8/
```
### 2. Model Integration
Install HeCoFuse model components by copying the Python files to the corresponding MMDetection3D model subdirectories:
```bash
# Navigate to your CoopDet3D installation directory
cd /path/to/coopdet3d
# Copy model files to appropriate subdirectories under /mmdet3d/models/
# For example:
cp /path/to/hecofuse/models/fusion_models/*.py mmdet3d/models/coop_fusers/
cp /path/to/hecofuse/models/cooperative/*.py mmdet3d/models/coop_fusion_models/
cp /path/to/hecofuse/models/utils/*.py mmdet3d/models/fusion_models/
# Update __init__.py files in each subdirectory to register new modules
# Replace the existing __init__.py files with the updated versions provided
```
### 3. Directory Structure
After installation, your directory structure should look like:
```
/path/to/coopdet3d/
├── configs/
│   └── tumtraf_v2x/det/transfusion/secfpn/cooperative/camera+lidar/yolov8/
│       └── HeCoFuse.yaml
├── mmdet3d/
│   └── models/
│       ├── coop_fusers/
│       │   ├── __init__.py (updated)
│       │   └── heterogeneous_fuser.py
│       ├── coop_fusion_models/
│       │   ├── __init__.py (updated)
│       │   └── heterogeneous_bevfusion_coop.py
│       └── fusion_models/
│           ├── __init__.py (updated)
│           └── pseudo_fusion.py
```
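Before training, it can help to confirm that the files from the listing above actually landed where expected. The following is a minimal sketch, not part of the HeCoFuse release: the helper name `missing_files` is hypothetical, and the check only tests for existence, so it cannot tell whether an `__init__.py` is the updated version.

```python
from pathlib import Path

# Relative paths copied from the directory listing in this README.
CONFIG_DIR = "configs/tumtraf_v2x/det/transfusion/secfpn/cooperative/camera+lidar/yolov8"
EXPECTED_FILES = [
    f"{CONFIG_DIR}/HeCoFuse.yaml",
    "mmdet3d/models/coop_fusers/__init__.py",
    "mmdet3d/models/coop_fusers/heterogeneous_fuser.py",
    "mmdet3d/models/coop_fusion_models/__init__.py",
    "mmdet3d/models/coop_fusion_models/heterogeneous_bevfusion_coop.py",
    "mmdet3d/models/fusion_models/__init__.py",
    "mmdet3d/models/fusion_models/pseudo_fusion.py",
]


def missing_files(coopdet3d_root):
    """Return the expected files that are absent under coopdet3d_root.

    Hypothetical helper: it checks existence only, not file contents.
    """
    root = Path(coopdet3d_root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]
```

Calling `missing_files("/path/to/coopdet3d")` returns an empty list when every file is in place, and otherwise the relative paths that still need to be copied.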
### 4. Training
```bash
# Train HeCoFuse model
python tools/train.py configs/tumtraf_v2x/det/transfusion/secfpn/cooperative/camera+lidar/yolov8/HeCoFuse.yaml
```
### 5. Evaluation
```bash
# Evaluate trained model
python tools/test.py configs/tumtraf_v2x/det/transfusion/secfpn/cooperative/camera+lidar/yolov8/HeCoFuse.yaml /path/to/checkpoint.pth
```
## Architecture
### Hierarchical Attention Fusion (HAF)
- Cross-modal attention between LiDAR and camera features
- Spatial reliability weighting based on feature confidence
- Adaptive feature selection for optimal fusion
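To make the idea of reliability weighting concrete, here is a deliberately simplified sketch. It is not the HAF module from the paper: it fuses a single BEV cell, assumes each modality comes with a scalar confidence score, and turns those scores into softmax weights. The function name `fuse_cell` and the use of `None` to mark an absent sensor are assumptions made for illustration.

```python
import math


def fuse_cell(lidar_feat, camera_feat, lidar_score, camera_score):
    """Fuse two per-cell feature vectors with softmax reliability weights.

    Illustrative only. A feature of None marks a modality that is absent
    on this node; the remaining modality then receives all of the weight.
    """
    feats, scores = [], []
    for feat, score in ((lidar_feat, lidar_score), (camera_feat, camera_score)):
        if feat is not None:
            feats.append(feat)
            scores.append(score)
    if not feats:
        raise ValueError("at least one modality is required")

    # Numerically stable softmax over the available modalities.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Weighted sum of the feature vectors, channel by channel.
    dim = len(feats[0])
    fused = [sum(w * f[i] for w, f in zip(weights, feats)) for i in range(dim)]
    return fused, weights
```

With equal scores the two modalities contribute equally, and a camera-only node passes its features through unchanged, which is the behavior a heterogeneous setup needs when one sensor is missing.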
### Adaptive Spatial Resolution (ASR)
- Dynamic resolution adjustment based on scene complexity
- Up to 45% reduction in computation
- Quality preservation through intelligent downsampling
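The general pattern behind resolution adaptation can be sketched as: estimate how complex the scene is, pick a downsampling stride, then pool the feature grid. The code below is an assumption-laden illustration rather than the ASR module itself. The complexity proxy (an object count), the thresholds, and the function names `choose_stride` and `average_pool` are all hypothetical, and the sketch makes no attempt to reproduce the reported 45% figure.

```python
def choose_stride(num_objects, thresholds=(5, 20)):
    """Pick a pooling stride from a hypothetical scene-complexity proxy.

    Sparse scenes are pooled more aggressively; dense scenes keep full
    resolution. The thresholds are illustrative, not from the paper.
    """
    low, high = thresholds
    if num_objects < low:
        return 4
    if num_objects < high:
        return 2
    return 1


def average_pool(grid, stride):
    """Average-pool a 2D grid (a list of rows) with a square window."""
    if stride == 1:
        return [row[:] for row in grid]
    rows, cols = len(grid), len(grid[0])
    pooled = []
    for r in range(0, rows - stride + 1, stride):
        out_row = []
        for c in range(0, cols - stride + 1, stride):
            window = [grid[r + i][c + j] for i in range(stride) for j in range(stride)]
            out_row.append(sum(window) / len(window))
        pooled.append(out_row)
    return pooled
```

A stride of 2 reduces the number of cells by a factor of 4, so downstream per-cell work shrinks accordingly; how much of that translates into end-to-end savings depends on the rest of the pipeline.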
### Heterogeneous Configuration Support
Supports 9 different sensor configurations:
- Vehicle: L, C, LC
- Infrastructure: L, C, LC
- Combinations: L+L, L+C, L+LC, C+C, C+L, C+LC, LC+L, LC+C, LC+LC
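The nine combinations are simply the Cartesian product of the three per-node sensor setups. A short sketch that enumerates them is shown below; the `vehicle+infrastructure` ordering of each label and the function name `sensor_configurations` are assumptions, since the list above does not state which side comes first.

```python
from itertools import product

SENSOR_SETUPS = ("L", "C", "LC")  # LiDAR-only, camera-only, LiDAR + camera


def sensor_configurations():
    """Return every pairing of node setups as 'first+second' labels.

    Assumption: the first entry is the vehicle and the second the
    infrastructure node; the README does not spell out the order.
    """
    return [f"{a}+{b}" for a, b in product(SENSOR_SETUPS, SENSOR_SETUPS)]
```

Iterating over `sensor_configurations()` is a convenient way to drive a sweep over all settings, for example when evaluating one checkpoint under each configuration.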
## Results
## Citation
If you find HeCoFuse useful in your research, please consider citing:
```bibtex
@inproceedings{wei2025hecofuse,
  title={HeCoFuse: Cross-Modal Complementary V2X Cooperative Perception with Heterogeneous Sensors},
  author={Wei, Chuheng and Qin, Ziye and Zimmer, Walter and Wu, Guoyuan and Barth, Matthew J.},
  booktitle={2025 IEEE 28th International Conference on Intelligent Transportation Systems (ITSC)},
  year={2025}
}
```
## Acknowledgments
This work builds upon CoopDet3D. We thank the authors for their excellent codebase and dataset.
## Contact
For questions and support, please contact:
- Chuheng Wei: chuheng.wei@email.ucr.edu
## License
This project is licensed under the MIT License - see the LICENSE file for details.