README.md
March 23, 2026 · View on GitHub
New work: Check out When the City Teaches the Car — label-free 3D perception from infrastructure!
Learning 3D Perception from Others' Predictions
A label-efficient framework for 3D detection using expert predictions
ICLR 2025 · DriveX @ ICCV 2025 (Oral)
Jinsu Yoo1, Zhenyang Feng1, Tai-Yu Pan1, Yihong Sun2, Cheng Perng Phoo2, Xiangyu Chen2, Mark Campbell2, Kilian Q. Weinberger2, Bharath Hariharan2, Wei-Lun Chao1
1The Ohio State University 2Cornell University
🔍 Overview
Can an autonomous vehicle learn 3D perception by observing predictions from a nearby expert agent — without accessing its raw sensor data or model weights? R&B-POP answers yes, but shows that naively using expert predictions as pseudo-labels yields poor performance due to two fundamental challenges:
- Mislocalization: GPS inaccuracies and timing delays introduce positional error, causing pseudo-labels to be offset from the true object locations.
- Viewpoint mismatch: Objects visible to the expert may be occluded or outside the ego vehicle's field of view, resulting in false positives and missed detections.
R&B-POP addresses these challenges with a two-stage self-training pipeline on V2V4Real, a real-world collaborative driving dataset. A lightweight PointNet-based box ranker — trained with fewer than 1% labeled frames (~40 frames total) — refines and filters the noisy pseudo-labels before training the ego detector. A distance-based curriculum further improves training by first focusing on nearby objects (where pseudo-labels are more reliable) before gradually expanding to longer ranges. The pipeline iterates: refined labels train a better detector, which in turn generates cleaner pseudo-labels for the next stage.
🛠️ Installation
1. Create conda environment
conda create -n rnb-pop python=3.8 -y
conda activate rnb-pop
2. Install PyTorch
Install PyTorch matching your CUDA version. The codebase was developed with CUDA 11.8:
pip install torch==2.0.1+cu118 torchvision==0.15.2+cu118 --index-url https://download.pytorch.org/whl/cu118
3. Install spconv
spconv is required for the voxelization backbone. Install the version matching your CUDA:
pip install spconv-cu118
4. Install remaining dependencies
git clone https://github.com/jinsuyoo/rnb-pop.git
cd rnb-pop
pip install -r requirements.txt
5. Install the package (editable mode)
pip install -e .
This makes opencood, ranker, and tools importable from anywhere within the project.
6. Build CUDA/Cython extensions
Note: On systems that default to Intel compilers (e.g., OSC), prepend
CC=gccto avoid linker errors with Intel-specific symbols.
# Cython extension for 2D box overlap computation
CC=gcc python opencood/utils/setup.py build_ext --inplace
# CUDA extension for 3D IoU / NMS
cd opencood/utils/iou3d_nms
CC=gcc python setup.py build_ext --inplace
cd ../../..
📦 Dataset: V2V4Real
R&B-POP is evaluated on V2V4Real, a real-world collaborative driving dataset with a Tesla (ego car, car_id=0) and a Honda (reference car, car_id=1) driving within 100m of each other.
- Download the dataset from the V2V4Real website.
- Set
data_pathinconfigs/rnb_pop_v2v4real.yamlto your dataset root.
🤖 Pretrained Models
| Model | File | Description |
|---|---|---|
| Ego Car Detector | pretrained_models/ego_detector.pth | R&B-POP trained detector |
| Box Ranker | pretrained_models/ranker.pth | Trained with 2 annotated frames per scenario (~40 frames total) |
| Reference Car Detector | pretrained_models/refcar_detector.pth | PointPillars (32-beam) trained on reference car LiDAR |
🚀 Usage
Set your dataset root once and reuse it throughout:
DATA_DIR=/path/to/v2v4real # <-- set this to your V2V4Real dataset root
The pipeline consists of the following steps:
[Step 1] Generate ranker training data (skip if using pretrained)
↓
[Step 2] Train box ranker (skip if using pretrained)
↓
[Step 3] Run R&B-POP pipeline (2-stage self-training)
↓
[Step 4] Evaluate
Initial Pseudo-Labels
We provide preprocessed initial pseudo-labels (reference car predictions projected into the ego frame, z-adjusted, FP-filtered) as exp/refcar_predictions_preprocessed.tar.gz (~1 MB), included in this repository. After cloning, simply extract:
tar -xzf exp/refcar_predictions_preprocessed.tar.gz -C exp/
The initial_label_path in the pipeline scripts is already set to exp/refcar_predictions_preprocessed.
The pretrained reference car detector checkpoint (pretrained_models/refcar_detector.pth) is also provided in case you want to regenerate the predictions yourself.
Step 1: Generate Ranker Training Data (optional)
Skip this step if using the pretrained ranker (pretrained_models/ranker.pth).
Ground Plane Estimation
The box ranker uses per-frame above-ground point masks stored under above_ground_ransac/ inside the dataset root. Since the ranker is trained on ego car data only, generate them for the ego car:
python data_preprocessing/generate_ground_plane.py \
--root_dir $DATA_DIR \
--train_split subset2 \
--car_id 0
Generate training data
python ranker/generate_data/generate_ranker_data.py \
--root_dir $DATA_DIR \
--num_annotate_frames 2 \
--num_samples_per_box 1000 \
--save_dir exp/ranker_training_data
This uses only the first 2 annotated frames per scenario (~40 labeled frames total across ~20 scenarios).
Step 2: Train the Box Ranker (optional)
Skip this step if using the pretrained ranker (pretrained_models/ranker.pth).
bash scripts/train_ranker.sh
Or manually:
python ranker/train_ranker.py \
--root_dir $DATA_DIR \
--train_data_dir exp/ranker_training_data \
--num_annotate_frames 2 \
--batch_size 256 \
--epoch 100 \
--save_dir exp/ranker \
--use_offset \
--random_drop_points \
--no_dist
Step 3: Run the Full R&B-POP Pipeline
Edit the path variables at the top of the script:
root_dir=$DATA_DIR
ranker_path="pretrained_models/ranker.pth" # pretrained; or exp/ranker/... if trained from scratch
``$
\text{Then} \text{run}:
**\text{SLURM} (4 \text{nodes} \times 2 \text{GPU}):**
$``bash
sbatch scripts/run_rnb_pop.sh
Local (multi-GPU, single machine):
bash scripts/run_rnb_pop_local.sh
Set ngpus at the top of the script to match the number of GPUs on your machine.
The pipeline runs two stages automatically:
- Stage 1 (Rank & Build): Refine reference car labels → filter by ranker score → train ego detector on 0–40m frames
- Stage 2 (Self-training): Generate new pseudo-labels with trained detector → refine → filter → train on all frames (0–90m)
Output structure under exp/:
exp/
├── stage_1_1_refined/ # ranker-refined labels
├── stage_1_2_filtered/ # score-filtered labels
├── stage_1_3_trained/ # pseudo-labels from stage 1 detector
├── stage_2_1_refined/
├── stage_2_2_filtered/
└── checkpoints/
├── stage_1/ # detector checkpoints from stage 1
└── stage_2/ # detector checkpoints from stage 2
Step 4: Evaluate
To evaluate the pretrained ego car detector:
python test.py \
--model_dir pretrained_models \
--strict_model_path pretrained_models/ego_detector.pth \
--data_split test
Note: Exact numbers may vary slightly depending on the environment (CUDA version, hardware, etc.), though the differences are not meaningful.
To evaluate a detector trained from scratch via the pipeline:
python test.py \
--model_dir exp/checkpoints \
--strict_model_path exp/checkpoints/stage_2/net_epoch060.pth \
--data_split test
To evaluate pseudo-label quality against ego car GT:
exp/ego_gt_labels.tar.gz (~1.2 MB) is included in the repo. After cloning, simply extract:
tar -xzf exp/ego_gt_labels.tar.gz -C exp/
Then run:
python eval_label_quality.py \
--root_dir $DATA_DIR \
--gt_label_path exp/ego_gt_labels \
--gt_label_idx ego \
--pseudo_label_path exp/stage_1_2_filtered \
--pseudo_label_idx pred
📝 Citation
@inproceedings{yoo2025rnbpop,
title={Learning 3D Perception from Others' Predictions},
author={Yoo, Jinsu and Feng, Zhenyang and Pan, Tai-Yu and Sun, Yihong and Phoo, Cheng Perng and Chen, Xiangyu and Campbell, Mark and Weinberger, Kilian Q and Hariharan, Bharath and Chao, Wei-Lun},
booktitle={International Conference on Learning Representations (ICLR)},
year={2025}
}
🙏 Acknowledgments
This codebase builds on OpenCOOD, V2V4Real, and pointnet.pytorch. We thank the authors for their open-source contributions.