[ECCV 2024] T-MAE: Temporal Masked Autoencoders for Point Cloud Representation Learning

September 26, 2024


Website arXiv

🏠 About

This repository contains the official implementation of the paper T-MAE: Temporal Masked Autoencoders for Point Cloud Representation Learning by Weijie Wei, Fatemeh Karimi Najadasl, Theo Gevers and Martin R. Oswald.

🔥 News

  • [2024/09/19] The code will be released soon.
  • [2024/09/22] Released the evaluation code for the ONCE dataset.
  • [2024/09/25] Released the training code for the ONCE dataset, along with the pretrained and finetuned weights.

TODO

  • [x] Release ONCE evaluation code.
  • [x] Release ONCE training code.
  • [ ] Release Waymo training code and inference code.

Installation

We tested this environment on NVIDIA A100 GPUs running Linux (RHEL 8).

conda create -n t-mae python=3.8
conda activate t-mae
conda install -y pytorch==1.11.0 torchvision==0.12.0 torchaudio==0.11.0 cudatoolkit=11.3 -c pytorch
conda install -y -c fvcore -c iopath -c conda-forge fvcore iopath
pip install "git+https://github.com/facebookresearch/pytorch3d.git@v0.7.1"
pip install numpy==1.19.5 protobuf==3.19.4 scikit-image==0.19.2 spconv-cu113 numba scipy pyyaml easydict fire tqdm shapely matplotlib opencv-python addict pyquaternion awscli open3d pandas future pybind11 tensorboardX tensorboard Cython
pip install torch-scatter -f https://data.pyg.org/whl/torch-1.11.0+cu113.html

pip install pycocotools
pip install SharedArray
pip install tensorflow-gpu==2.5.0
pip install protobuf==3.20

git clone https://github.com/codename1995/T-MAE
cd T-MAE && python setup.py develop --user
cd pcdet/ops/dcn && python setup.py develop --user
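
After running the commands above, it can help to confirm that the pinned packages actually resolved to the intended versions. The following is a minimal sketch, not part of the repository: the helper names `installed_version` and `check_pins` are hypothetical, the `torch` distribution name is assumed to be what the conda `pytorch` package registers, and `protobuf` is checked against the later pin (`3.20`) because it is installed last.

```python
from importlib import metadata

# Pins copied from the install commands above. "torch" is assumed to be the
# distribution name registered by the conda "pytorch" package; protobuf uses
# the later pin because that command runs last.
PINS = {
    "torch": "1.11.0",
    "torchvision": "0.12.0",
    "numpy": "1.19.5",
    "scikit-image": "0.19.2",
    "protobuf": "3.20",
}


def installed_version(name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def check_pins(pins, lookup=installed_version):
    """Map each package name to (expected, found, ok).

    A prefix match is used so that local build tags such as "1.11.0+cu113"
    still count as matching the pin "1.11.0".
    """
    report = {}
    for name, expected in pins.items():
        found = lookup(name)
        ok = found is not None and found.startswith(expected)
        report[name] = (expected, found, ok)
    return report


if __name__ == "__main__":
    for name, (expected, found, ok) in check_pins(PINS).items():
        status = "ok" if ok else "MISMATCH"
        print(f"{name:<14} expected {expected:<8} found {found} [{status}]")
```

The prefix comparison is deliberately loose; swap it for an exact comparison if a stricter check is preferred.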

Data Preparation

Please follow the OpenPCDet instructions to prepare the datasets. For the Waymo dataset, we evaluate detection results with the official evaluation toolkit: the compute_detection_metrics_main binary comes from the Waymo Open Dataset API (Mar 2023) and is built from C++ source.

data
│── waymo
│   │── ImageSets/
│   │── raw_data
│   │   │── segment-xxxxxxxx.tfrecord
│   │   │── ...
│   │── waymo_processed_data
│   │   │── segment-xxxxxxxx/
│   │   │── ...
│   │── waymo_processed_data_gt_database_train_sampled_1/
│   │── waymo_processed_data_waymo_dbinfos_train_sampled_1.pkl
│   │── waymo_processed_data_infos_test.pkl
│   │── waymo_processed_data_infos_train.pkl
│   │── waymo_processed_data_infos_val.pkl
│   │── compute_detection_metrics_main
│   │── gt.bin
│── once
│   │── ImageSets/
│   │── data
│   │   │── 000000/
│   │   │── ...
│   │── gt_database/
│   │── once_dbinfos_train.pkl
│   │── once_infos_raw_large.pkl
│   │── once_infos_raw_medium.pkl
│   │── once_infos_raw_small.pkl
│   │── once_infos_train.pkl
│   │── once_infos_val.pkl
│── ckpts
│   │── once_tmae_weights.pth
│   │── ...
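
Before launching training, a quick check that the ONCE files from the tree above are in place can save a failed run. This is a minimal sketch under stated assumptions: the helper `missing_entries` is hypothetical (not part of the repository), and the `data/once` path assumes the script is run from the repository root.

```python
from pathlib import Path

# Entries taken from the ONCE part of the tree above; a trailing "/" marks
# an entry that is expected to be a directory.
ONCE_EXPECTED = [
    "ImageSets/",
    "data/",
    "gt_database/",
    "once_dbinfos_train.pkl",
    "once_infos_raw_large.pkl",
    "once_infos_raw_medium.pkl",
    "once_infos_raw_small.pkl",
    "once_infos_train.pkl",
    "once_infos_val.pkl",
]


def missing_entries(root, expected):
    """Return the expected entries that are absent under root, in order."""
    root = Path(root)
    missing = []
    for entry in expected:
        path = root / entry.rstrip("/")
        present = path.is_dir() if entry.endswith("/") else path.is_file()
        if not present:
            missing.append(entry)
    return missing


if __name__ == "__main__":
    # Assumes the current working directory is the repository root.
    gaps = missing_entries("data/once", ONCE_EXPECTED)
    print("ONCE layout looks complete" if not gaps else f"Missing: {gaps}")
```

The same function can be pointed at `data/waymo` with a list built from the Waymo part of the tree.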

Training & Testing

ONCE dataset

# Pretrain T-MAE and then finetune on ONCE
bash scripts/once_train.sh

# Load the provided pretrained model and finetune on ONCE
bash scripts/once_finetune_only.sh

# Evaluate on ONCE
bash scripts/once_test.sh

Results

Waymo

Reproduced results will be added soon. We cannot provide the pretrained weights because of the Waymo Dataset License Agreement.

ONCE

| Model | mAP | Vehicle | Pedestrian | Cyclist | Weights |
|---|---|---|---|---|---|
| T-MAE (Pretrained) | - | - | - | - | once_tmae_pretrained.pth |
| T-MAE (Finetuned) | 67.41 | 77.53 | 54.81 | 69.90 | once_tmae_weights.pth |
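
As a quick consistency check on the finetuned row, the reported mAP matches the unweighted mean of the three per-class values. The sketch below assumes that this is how the mAP column is defined, which the numbers support but the README does not state explicitly.

```python
# Per-class values from the T-MAE (Finetuned) row of the ONCE results.
per_class = {"Vehicle": 77.53, "Pedestrian": 54.81, "Cyclist": 69.90}

# Assumption: mAP is the unweighted mean over the three classes.
mean_ap = sum(per_class.values()) / len(per_class)
print(f"{mean_ap:.2f}")  # prints 67.41
```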

Citation

If you find this repository useful, please consider citing our paper.

@inproceedings{wei2024tmae,
  title={T-MAE: Temporal Masked Autoencoders for Point Cloud Representation Learning},
  author={Weijie Wei and Fatemeh Karimi Najadasl and Theo Gevers and Martin R. Oswald},
  booktitle={European Conference on Computer Vision (ECCV)},
  year={2024}
}

Acknowledgements

This project is mainly based on the following repositories:

We would like to thank the authors for their great work.