Multi-V2X: A Large Scale Multi-modal Multi-penetration-rate Dataset for Cooperative Perception

August 3, 2025

For data collection by CARLA-SUMO co-simulation, the code is integrated into https://github.com/RadetzkyLi/CoRTSG .

To generate a sub-dataset with a specific CAV penetration rate (i.e., to obtain the pr config), see Multi-V2X/scripts/vary_pr.ipynb .

For training and testing of cooperative perception algorithms, the code is integrated into https://github.com/RadetzkyLi/OpenCOOD .

The paper is available on arXiv: https://arxiv.org/abs/2409.04980 .

Multi-V2X is a large-scale, multi-modal, multi-penetration-rate dataset for cooperative perception in vehicle-to-everything (V2X) environments. It is collected by SUMO-CARLA co-simulation and supports tasks including 3D object detection and tracking. It provides RGB images and point clouds from CAVs and RSUs at various CAV penetration rates (up to 86.21%).

Features:

  • Multiple Penetration Rates: nearly all cars in a town are equipped with sensors and can therefore act as CAVs. By masking some equipped cars as normal vehicles, datasets with various penetration rates can be generated.
  • Multiple Categories: 6 categories (car, van, truck, cycle, motorcycle, pedestrian) cover the common traffic participants. In comparison, the well-known OPV2V, V2XSet and V2X-Sim contain only cars.
  • Multiple CAV Shapes: every kind of car in CARLA can be a CAV, whereas OPV2V and V2XSet use only a Lincoln.
  • Multiple Modalities: RGB images and point clouds from both CAVs and RSUs are provided.
  • Dynamic Connections: CAVs are spawned and driven across the whole town, so connections are lost and created over time. This is more realistic than spawning and driving vehicles around a single site.
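
The masking idea behind the multiple penetration rates can be sketched as follows. This is a minimal illustration, not the logic of `vary_pr.ipynb`: the function name `select_cavs`, its arguments, and the toy vehicle counts are all hypothetical, and the real pr config format is not shown here.

```python
import random


def select_cavs(equipped_ids, num_motor_vehicles, target_rate, seed=0):
    """Choose which equipped cars act as CAVs for a target penetration rate.

    target_rate is a fraction in [0, 1]. Equipped cars that are not chosen
    are masked, i.e. treated as normal vehicles. Hypothetical helper.
    """
    max_rate = len(equipped_ids) / num_motor_vehicles
    if not 0.0 <= target_rate <= max_rate:
        raise ValueError(f"target_rate must be within [0, {max_rate:.4f}]")
    # Penetration rate = (#CAVs) / (#motor vehicles), so solve for #CAVs.
    num_cavs = min(len(equipped_ids), round(target_rate * num_motor_vehicles))
    rng = random.Random(seed)  # fixed seed keeps the selection reproducible
    cavs = sorted(rng.sample(sorted(equipped_ids), num_cavs))
    return cavs, num_cavs / num_motor_vehicles


# Toy numbers (not taken from the dataset): 40 equipped cars, 50 motor vehicles.
equipped = list(range(40))
cavs, achieved = select_cavs(equipped, num_motor_vehicles=50, target_rate=0.10)
print(len(cavs), f"{achieved:.2%}")
```

Because the number of CAVs must be an integer, the achieved rate can differ slightly from the requested one, which is why the function returns both.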

Data Collection

Maps

Currently, we consider Town01, Town03, Town05, Town06, Town07 and Town10HD. These towns cover a variety of road types, e.g., road segments, mid-blocks, T-junctions, crossroads and rural roads.

Sensors

The agents are equipped with various sensors to capture the surrounding environment. For now, only cars and roadside units (RSUs) are considered agents (trucks, motorcycles, etc. are excluded). All sensors stream at 20 Hz but are recorded at 10 Hz.
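
As a rough illustration of the 20 Hz stream versus 10 Hz recording, the sketch below keeps every second simulation tick. It assumes that recording starts on the first tick and uses a fixed stride, which the text above does not state; `recorded_ticks` is a hypothetical helper.

```python
def recorded_ticks(num_ticks, stream_hz=20, record_hz=10):
    """Return the indices of streamed ticks that would be written to disk."""
    if stream_hz % record_hz != 0:
        raise ValueError("stream_hz must be an integer multiple of record_hz")
    stride = stream_hz // record_hz  # 20 Hz / 10 Hz -> keep every 2nd tick
    return list(range(0, num_ticks, stride))


print(recorded_ticks(8))  # [0, 2, 4, 6]
```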

CAV

One CAV is equipped with the following 7 sensors:

  • RGB camera x 4: 110° FOV, 800x600 resolution, position: top of the car, pose: front, rear, left, right.

  • LiDAR x 1: 120 m detection range, 64 channels, 1,300,000 points per second, 40° vertical FOV (-30° to 10°), 20 Hz rotation frequency.

  • Semantic LiDAR x 1: same as LiDAR.

  • GNSS x 1: 0.02 m error.
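
The LiDAR settings above imply a per-sweep point budget, worked out in the short sketch below. It assumes that every ray returns a point and that the 64 channels are spaced uniformly across the vertical FOV, so the numbers are upper bounds and nominal spacings rather than measured values; `lidar_budget` is a hypothetical helper.

```python
def lidar_budget(points_per_second, rotation_hz, channels, lower_fov, upper_fov):
    """Derive nominal per-sweep quantities from the LiDAR configuration."""
    points_per_sweep = points_per_second / rotation_hz
    points_per_channel = points_per_sweep / channels
    return {
        "points_per_sweep": points_per_sweep,
        "points_per_channel": points_per_channel,
        # Horizontal angle between consecutive rays of one channel (degrees).
        "horizontal_step_deg": 360.0 / points_per_channel,
        # Vertical angle between neighbouring channels (degrees).
        "vertical_step_deg": (upper_fov - lower_fov) / (channels - 1),
    }


cav_lidar = lidar_budget(1_300_000, 20, 64, lower_fov=-30.0, upper_fov=10.0)
print(cav_lidar["points_per_sweep"])  # 65000.0
```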

Figure: sensor configuration of a CAV (images/cav_sensor).

RSU

One RSU is equipped with the following 5 sensors:

  • RGB camera x 2: 110° FOV, -15° pitch, 800x600 resolution, position: top of the traffic light, pose: forward, backward.

  • LiDAR x 1: 120 m detection range, 64 channels, 1,300,000 points per second, 40° vertical FOV (-40° to 0°), 20 Hz rotation frequency.

  • Semantic LiDAR x 1: same as LiDAR.

  • GNSS x 1: 0.02 m error.

The mounting position of the sensors depends on the shape of the traffic light. If the traffic light stands at the roadside, the sensors are mounted on top of it at a height of 14 feet (about 4.27 m). If the traffic light hangs over the roadway, the sensors are mounted at that overhead location, also at a height of 14 feet. Similar to DAIR-V2X, only one traffic light per junction is selected as the RSU.

Note: every signalized junction has exactly one RSU.

Figure: sensor configuration of an RSU (images/rsu_sensor).


Summary

agents: 410 CAVs, 56 RSUs.

6 categories: car, van, truck, pedestrian, cycle, motorcycle.

period: each episode lasts 30 s and starts after the traffic flow has reached a relatively stable state.

communication: by default, the communication range is 70 m.

connections: the number of agents within an agent's communication range (excluding itself). In a frame, the #conn of an agent ranges from 0 to 31, i.e., it connects with 0 to 31 other agents. On average, a CAV/RSU connects with 9.913/9.276 other agents per frame when all equipped cars act as CAVs.

CAV penetration rate: the percentage of CAVs among all motor vehicles on the road. In each map, a subset of the cars is equipped with sensors to record environmental information. One can select some of them as CAVs to realize various CAV penetration rates. The maximum penetration rate varies across maps from 55.17% to 86.21%.

distance travelled: 12.698 km for pedestrians, 53.681 km for equipped cars, 117.935 km for all.
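
The connection count defined above can be sketched for a single frame as below. This is an illustrative reading only: it assumes planar Euclidean distance between agent positions and treats agents exactly 70 m apart as connected, neither of which is specified above. The agent names and coordinates are made up.

```python
import math


def count_connections(positions, comm_range=70.0):
    """Count, per agent, the other agents within comm_range (in metres).

    positions maps an agent id to its (x, y) position in one frame.
    """
    counts = {agent: 0 for agent in positions}
    ids = list(positions)
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            # Connections are symmetric, so each close pair counts for both.
            if math.dist(positions[a], positions[b]) <= comm_range:
                counts[a] += 1
                counts[b] += 1
    return counts


frame = {
    "cav_1": (0.0, 0.0),
    "cav_2": (50.0, 0.0),
    "rsu_1": (100.0, 0.0),
    "cav_3": (300.0, 300.0),
}
print(count_connections(frame))
# {'cav_1': 1, 'cav_2': 2, 'rsu_1': 1, 'cav_3': 0}
```

An isolated agent such as `cav_3` ends up with 0 connections, which corresponds to the lower end of the 0 to 31 range reported above.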

Summary of Multi-V2X

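| Map | #CAV | #RSU | #frame | #bbox | #rgb | #pcd | max connections |
|---|---|---|---|---|---|---|---|
| Town01 | 50 | 12 | 17,711 | 504,528 | 63,544 | 17,711 | 17 |
| Town03 | 100 | 11 | 33,300 | 1,016,233 | 126,600 | 33,300 | 25 |
| Town05 | 80 | 15 | 28,500 | 1,211,518 | 105,000 | 28,500 | 27 |
| Town06 | 70 | 8 | 30,420 | 463,629 | 115,400 | 30,420 | 19 |
| Town07 | 60 | 5 | 19,565 | 371,219 | 75,250 | 19,565 | 22 |
| Town10HD | 50 | 5 | 16,500 | 651,847 | 63,000 | 16,500 | 31 |
| Overall | 410 | 56 | 145,996 | 4,218,974 | 548,934 | 145,996 | 31 |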

Comparison with other datasets

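| Dataset | Year | Type | V2X | RGB Images | LiDAR | 3D boxes | Classes | Locations | Connections |
|---|---|---|---|---|---|---|---|---|---|
| DAIR-V2X | 2022 | Real | V2I | 71k | 71k | 1200k | 10 | Beijing, China | 1 |
| V2V4Real | 2023 | Real | V2V | 40k | 20k | 240k | 5 | Ohio, USA | 1 |
| RCooper | 2024 | Real | I2I | 50k | 30k | - | 10 | - | - |
| OPV2V | 2022 | Sim | V2V | 132k | 33k | 230k | 1 | CARLA Town01, 02, 03, 04, 05, 06, 07, 10HD | 1-6 |
| V2XSet | 2022 | Sim | V2V&I | 132k | 33k | 230k | 1 | Same as OPV2V | 1-4 |
| V2X-Sim | 2022 | Sim | V2V&I | 283k | 47k | 26.6k | 1 | CARLA Town03, 04, 05 | 1-4 |
| Multi-V2X (ours) | 2024 | Sim | V2V&I | 549k | 146k | 4219k | 6 | CARLA Town01, 03, 05, 06, 07, 10HD | 0-31 |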

Note: the data was counted per agent.


Data Download

Download the Multi-V2X-fix data from OpenDataLab . If you downloaded the non-fix version and would like to use the camera extrinsics, update the dataset by following the instructions in CoRTSG.


Benchmarks

Because cooperative perception algorithms designed for high CAV penetration rates are still lacking, we have so far only conducted experiments on $\mathcal{D}^{\text{Multi-V2X}}_{10\%}$, a V2X dataset with a 10% CAV penetration rate and 14,932 frames (counted over 48 ego cars).

Cooperative 3D object detection benchmarks on $\mathcal{D}^{\text{Multi-V2X}}_{10\%}$

| Method | AP@0.3 | AP@0.5 | AP@0.7 |
|---|---|---|---|
| No Fusion | 0.307 | 0.237 | 0.117 |
| Late Fusion | 0.346 | 0.270 | 0.141 |
| Early Fusion | 0.510 | 0.408 | 0.235 |
| V2X-ViT | 0.440 | 0.350 | 0.228 |
| Where2comm | 0.452 | 0.348 | 0.213 |
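
To read the benchmark as gains from cooperation, the sketch below subtracts the No Fusion baseline from each method at every IoU threshold. The AP values are copied from the table above; the helper `gain_over_baseline` is hypothetical and only does this subtraction.

```python
# AP at IoU thresholds 0.3, 0.5 and 0.7, copied from the benchmark table.
AP = {
    "No Fusion": (0.307, 0.237, 0.117),
    "Late Fusion": (0.346, 0.270, 0.141),
    "Early Fusion": (0.510, 0.408, 0.235),
    "V2X-ViT": (0.440, 0.350, 0.228),
    "Where2comm": (0.452, 0.348, 0.213),
}


def gain_over_baseline(results, baseline="No Fusion"):
    """Absolute AP gain of every method over the single-agent baseline."""
    base = results[baseline]
    return {
        method: tuple(round(v - b, 3) for v, b in zip(values, base))
        for method, values in results.items()
        if method != baseline
    }


gains = gain_over_baseline(AP)
print(gains["Early Fusion"])  # (0.203, 0.171, 0.118)
```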

Test with Trained Models

To test the pretrained models on Multi-V2X, first download the model files from OpenDataLab and download /scripts/results/pr_config_list_15k.json. Then modify the associated config.yaml: set pr_setting.path to the path of pr_config_list_15k.json.
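
The config change can be sketched on an in-memory dictionary as below. Only the key `pr_setting.path` comes from the instructions above; the rest of the config layout, the helper `set_dotted`, and the example paths are assumptions. Reading and writing the actual config.yaml (for example with PyYAML's `yaml.safe_load` and `yaml.safe_dump`) is left out to keep the sketch dependency-free.

```python
import copy


def set_dotted(config, dotted_key, value):
    """Return a copy of a nested dict with the dotted key set to value."""
    updated = copy.deepcopy(config)  # leave the caller's dict untouched
    node = updated
    *parents, leaf = dotted_key.split(".")
    for key in parents:
        # Create intermediate sections if they are missing.
        node = node.setdefault(key, {})
    node[leaf] = value
    return updated


# Hypothetical content standing in for the loaded config.yaml.
config = {"pr_setting": {"path": "old/location.json"}}
new_config = set_dotted(
    config, "pr_setting.path", "/data/multi-v2x/pr_config_list_15k.json"
)
print(new_config["pr_setting"]["path"])
```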

Install OpenCOOD from https://github.com/RadetzkyLi/OpenCOOD, then run:

python ${OpenCOOD}/opencood/tools/inference.py --model_dir ${MODEL_DIR} --fusion_method ${FUSION_METHOD} --dataset_format "multi-v2x" --dataset_root ${ROOT_DIR}

Contact

If you have any questions, feel free to open an issue or contact the author by email.

Citation

If you find our work useful in your research, please consider citing:

@article{rongsong2024multiv2x,
      title={Multi-V2X: A Large Scale Multi-modal Multi-penetration-rate Dataset for Cooperative Perception}, 
      author={Rongsong Li and Xin Pei},
      year={2024},
      eprint={2409.04980},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2409.04980}, 
}