Point Transformers (PyTorch)
June 13, 2026 · View on GitHub
A PyTorch implementation and fair comparison of three transformer architectures for point clouds:
- Point Transformer — Hengshuang Zhao et al.
- PCT: Point Cloud Transformer — Meng-Hao Guo et al.
- Point Transformer — Nico Engel et al.
All three models are implemented behind a common training pipeline (same data, same augmentation, same schedule), so their results can be compared under one consistent setting. Configuration is managed with Hydra, so switching models or hyperparameters is a one-flag change.
Project Structure
config/
├── cls.yaml # classification hyperparameters
├── partseg.yaml # part segmentation hyperparameters
└── model/ # per-model configs: Hengshuang / Menghao / Nico
models/
├── Hengshuang/ # Point Transformer (vector attention + transition down/up)
├── Menghao/ # PCT: Point Cloud Transformer
└── Nico/ # Point Transformer (SortNet + local-global attention)
train_cls.py # ModelNet40 classification training/eval
train_partseg.py # ShapeNet part segmentation training/eval
dataset.py # ModelNet40 / ShapeNetPart data loaders
provider.py # point cloud augmentations
Installation
pip install -r requirements.txt
Requires PyTorch with CUDA; the training scripts assume a GPU.
Classification (ModelNet40)
Data
Download the resampled, aligned ModelNet40 (modelnet40_normal_resampled.zip) and extract it to modelnet40_normal_resampled/ at the repo root.
Train
# default model is set in config/cls.yaml
python train_cls.py
# or pick a model explicitly
python train_cls.py model=Hengshuang
python train_cls.py model=Menghao
python train_cls.py model=Nico
# sweep all three with Hydra multirun
python train_cls.py model=Hengshuang,Menghao,Nico -m
Logs and the best checkpoint (best_model.pth) are written to log/cls/<model>/.
Results
Adam, learning rate decay 0.3 every 50 epochs, 200 epochs total; data augmentation follows Pointnet_Pointnet2_pytorch. For Hengshuang and Nico the initial LR is 1e-3 (these hyperparameters could likely be tuned further); for Menghao it is 1e-4, as suggested by the author.
ModelNet40 classification accuracy (instance average):
| Model | Accuracy |
|---|---|
| Hengshuang | 91.7 |
| Menghao | 92.6 |
| Nico | 85.5 |
Part Segmentation (ShapeNetPart)
Data
Download the aligned ShapeNetPart benchmark (shapenetcore_partanno_segmentation_benchmark_v0_normal.zip) and extract it to data/shapenetcore_partanno_segmentation_benchmark_v0_normal/.
Train
python train_partseg.py model=Hengshuang
Logs and checkpoints are written to log/partseg/<model>/. Currently only Hengshuang's architecture has a segmentation head implemented.
Test & visualize
After training, evaluate the saved checkpoint and export colored point clouds:
python test_partseg.py model=Hengshuang # evaluate + export 20 shapes
python test_partseg.py model=Hengshuang num_visual=50
This reports accuracy / class-avg mIoU / instance-avg mIoU on the test split and, for each exported shape, writes two ASCII .ply files to log/partseg/<model>/visual/:
<idx>_<category>_pred.ply— predicted part labels (colored)<idx>_<category>_gt.ply— ground-truth part labels (colored)
Open them in any point cloud viewer. For example, with Open3D:
import open3d as o3d
o3d.visualization.draw_geometries([o3d.io.read_point_cloud("log/partseg/Hengshuang/visual/0_Airplane_pred.ply")])
The script runs on GPU if available and otherwise falls back to CPU.
License
MIT — see LICENSE.
Acknowledgements
- Training pipeline and data augmentation adapted from Pointnet_Pointnet2_pytorch.
- The Menghao (PCT) implementation is adapted from the author's Jittor version: MenghaoGuo/PCT.