SemAffiNet: Semantic-Affine Transformation for Point Cloud Segmentation

June 9, 2022

Created by Ziyi Wang, Yongming Rao, Xumin Yu, Jie Zhou, Jiwen Lu

This repository is an official implementation of SemAffiNet (CVPR 2022).

Installation

Prerequisites

Python 3.8
PyTorch 1.8.1
MinkowskiEngine 0.5.4
timm
open3d
cv2, tensorboardX, imageio, SharedArray, scipy, tqdm, h5py

conda create -n semaffinet python=3.8
conda activate semaffinet
conda install pytorch==1.8.1 torchvision==0.9.1 cudatoolkit=10.2

git clone https://github.com/NVIDIA/MinkowskiEngine.git
cd MinkowskiEngine
export CXX=g++-7
conda install openblas
python setup.py install --blas_include_dirs=${CONDA_PREFIX}/include --blas=openblas

pip install timm
pip install open3d
pip install opencv-python
conda install tensorboardX imageio sharedarray plyfile tqdm
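
After installation, a quick import check can confirm that the environment is complete. The following is a minimal sketch and not part of this repository: `check_imports` is a hypothetical helper, and the module list assumes the usual import names for the packages listed under Prerequisites and in the install commands (for example `torch` for PyTorch and `cv2` for opencv-python).

```python
import importlib

# Assumed import names for the packages named in this README.
MODULES = [
    "torch", "MinkowskiEngine", "timm", "open3d", "cv2", "tensorboardX",
    "imageio", "SharedArray", "scipy", "tqdm", "h5py", "plyfile",
]


def check_imports(names):
    """Return {module name: version string, or None if the import fails}."""
    report = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except Exception:  # ImportError, or an error raised while loading a compiled extension
            report[name] = None
            continue
        # Not every package exposes __version__, so fall back to "unknown".
        report[name] = str(getattr(module, "__version__", "unknown"))
    return report


if __name__ == "__main__":
    for name, version in check_imports(MODULES).items():
        print(f"{name:16s} {version if version is not None else 'MISSING'}")
```

Any module reported as MISSING points to an install step that needs to be repeated. The script only checks that each import succeeds; it does not compare versions against the ones listed above.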

Usage

Data Preparation

Download the official ScanNetV2 dataset.

Prepare ScanNetV2 2D data: follow the instructions in the 3DMV repo.

python prepare_2d_data.py --scannet_path SCANNET_INPUT_PATH --output_path SCANNET_OUTPUT_PATH --export_label_images

Prepare ScanNetV2 3D data:

python dataset/preprocess_3d_scannet.py

Group ScanNetV2 2D views: preprocess the 2D data and split the views of each scene into several groups. The following command requires pointnet2_ops from the PointNet++ PyTorch repo:
```
python dataset/pregroup_2d_scannet.py
```
You can also download our processed group results here.

The data is expected to be in the following file structure:

SemAffiNet/
|-- data/
    |-- 2D/
        |-- scene0000_00/
            |-- color/
                |-- 0.jpg
            |-- depth/
                |-- 0.png
            |-- label/
                |-- 0.png
            |-- pose/
                |-- 0.txt
    |-- 3D/
        |-- train/
            |-- scene0000_00_vh_clean_2.pth
        |-- val/
            |-- scene0011_00_vh_clean_2.pth
        |-- test/
            |-- scene0707_00_vh_clean_2.pth
    |-- view_groups/
        |-- view_groups_train.pth
        |-- view_groups_val.pth
        |-- view_groups_test.pth
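
Before training, it can help to confirm that the layout above is in place. The sketch below is not part of this repository: `check_layout` is a hypothetical helper, and it assumes that every scene folder mirrors the `scene0000_00` example shown above. It checks only that the folders and view-group files exist, not their contents.

```python
from pathlib import Path


def check_layout(root, scene="scene0000_00"):
    """Return the expected entries that are missing under <root>/data."""
    data = Path(root) / "data"
    splits = ("train", "val", "test")
    # Per-scene 2D folders, 3D split folders, and view-group files from the tree above.
    expected = [data / "2D" / scene / sub for sub in ("color", "depth", "label", "pose")]
    expected += [data / "3D" / split for split in splits]
    expected += [data / "view_groups" / f"view_groups_{split}.pth" for split in splits]
    return [p.relative_to(root).as_posix() for p in expected if not p.exists()]


if __name__ == "__main__":
    # Assumes the script is run from the SemAffiNet/ root.
    missing = check_layout(".")
    print("layout OK" if not missing else "missing:\n  " + "\n  ".join(missing))
```

An empty result means every checked path exists. Pass a different `scene` name to spot-check other scenes.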

Init model preparation

Download the pre-trained resnet34d weights and place them in the initmodel folder. The pre-trained weights are from the timm repository.

Train

ScanNetV2 5cm voxelization setting:

bash tool/train.sh SemAffiNet_5cm config/scannet/semaffinet_5cm.yaml scannet 2

ScanNetV2 2cm voxelization setting:

bash tool/train.sh SemAffiNet_2cm config/scannet/semaffinet_2cm.yaml scannet 2

Test

ScanNetV2 5cm voxelization setting:

bash tool/test.sh SemAffiNet_5cm config/scannet/semaffinet_5cm.yaml scannet 2

ScanNetV2 2cm voxelization setting:

bash tool/test.sh SemAffiNet_2cm config/scannet/semaffinet_2cm.yaml scannet 2

Results

We provide pre-trained SemAffiNet models:

| Dataset | URL | 3D mIoU | 2D mIoU |
| --- | --- | --- | --- |
| ScanNetV2 5cm | Google Drive | 72.1 | 68.2 |
| ScanNetV2 2cm | Google Drive | 74.5 | 74.2 |

Please rename each checkpoint to model_best.pth.tar and organize the directory in the following structure:

    SemAffiNet/
    |-- initmodel/
        |-- resnet34d_ra2-f8dcfcaf.pth
    |-- Exp/
        |-- scannet/
            |-- SemAffiNet_2cm/
                |-- model/
                    |-- model_best.pth.tar
            |-- SemAffiNet_5cm/
                |-- model/
                    |-- model_best.pth.tar

Citation

If you find our work useful in your research, please consider citing:

@inproceedings{wang2022semaff,
  title={SemAffiNet: Semantic-Affine Transformation for Point Cloud Segmentation},
  author={Wang, Ziyi and Rao, Yongming and Yu, Xumin and Zhou, Jie and Lu, Jiwen},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2022}
}

Acknowledgements

Our code is inspired by BPNet. Some of the ScanNetV2 data preprocessing code is inspired by 3DMV.