PyTorchCV: A PyTorch-Based Framework for Deep Learning in Computer Vision

January 18, 2019

@misc{CV2018,
  author =       {Donny You (youansheng@gmail.com)},
  howpublished = {\url{https://github.com/donnyyou/PyTorchCV}},
  year =         {2018}
}

This repository provides source code for several deep-learning-based computer vision problems. We will do our best to keep it up to date. If you find a problem with this repository, please raise an issue and we will fix it immediately.

Implemented Papers

  • Image Classification

    • VGG: Very Deep Convolutional Networks for Large-Scale Image Recognition
    • ResNet: Deep Residual Learning for Image Recognition
    • DenseNet: Densely Connected Convolutional Networks
    • ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices
    • ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design
  • Semantic Segmentation

    • DeepLabV3: Rethinking Atrous Convolution for Semantic Image Segmentation
    • PSPNet: Pyramid Scene Parsing Network
    • DenseASPP: DenseASPP for Semantic Segmentation in Street Scenes
  • Object Detection

    • SSD: Single Shot MultiBox Detector
    • Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
    • YOLOv3: An Incremental Improvement
    • FPN: Feature Pyramid Networks for Object Detection
  • Pose Estimation

    • CPM: Convolutional Pose Machines
    • OpenPose: Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields
  • Instance Segmentation

    • Mask R-CNN

Performance with PyTorchCV

Image Classification

  • ResNet: Deep Residual Learning for Image Recognition

Semantic Segmentation

  • PSPNet: Pyramid Scene Parsing Network
Model          Backbone       Training data  Testing data  mIOU   Pixel Acc  Setting
PSPNet Origin  3x3-ResNet101  ADE20K train   ADE20K val    41.96  80.64      -
PSPNet Ours    7x7-ResNet101  ADE20K train   ADE20K val    44.18  80.91      PSPNet
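
For readers unfamiliar with the two segmentation metrics reported here, the sketch below shows how mIOU and pixel accuracy are commonly derived from a confusion matrix. It is a minimal illustration, not the repository's evaluation code: the function names, the flat label lists, and the `ignore_index=255` convention are all assumptions made for the example. The functions return fractions in [0, 1], while the table appears to report percentages.

```python
def confusion_matrix(pred, target, num_classes, ignore_index=255):
    """Count pixels per (ground-truth class, predicted class) pair.

    pred and target are flat sequences of integer labels.
    ignore_index is an assumed "void" label that is skipped.
    """
    cm = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue
        cm[t][p] += 1  # rows: ground truth, columns: prediction
    return cm


def pixel_accuracy(cm):
    """Fraction of counted pixels whose prediction matches the label."""
    correct = sum(cm[c][c] for c in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total


def mean_iou(cm):
    """Average of per-class intersection over union.

    Classes that appear in neither the labels nor the predictions are skipped.
    """
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        gt = sum(cm[c])                          # pixels labeled c
        predicted = sum(cm[r][c] for r in range(n))  # pixels predicted as c
        union = gt + predicted - tp
        if union > 0:
            ious.append(tp / union)
    return sum(ious) / len(ious)


if __name__ == "__main__":
    cm = confusion_matrix(pred=[0, 1, 1, 1], target=[0, 0, 1, 1], num_classes=2)
    print(pixel_accuracy(cm), mean_iou(cm))
```

In a full evaluation the confusion matrix would be accumulated over every validation image before the two metrics are computed once at the end.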

Object Detection

  • SSD: Single Shot MultiBox Detector
Model           Backbone  Training data      Testing data  mAP    FPS  Setting
SSD-300 Origin  VGG16     VOC07+12 trainval  VOC07 test    0.772  -    -
SSD-300 Ours    VGG16     VOC07+12 trainval  VOC07 test    0.786  -    SSD300
SSD-512 Origin  VGG16     VOC07+12 trainval  VOC07 test    0.798  -    -
SSD-512 Ours    VGG16     VOC07+12 trainval  VOC07 test    0.808  -    SSD512
  • Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
Model                Backbone  Training data   Testing data  mAP    FPS  Setting
Faster R-CNN Origin  VGG16     VOC07 trainval  VOC07 test    0.699  -    -
Faster R-CNN Ours    VGG16     VOC07 trainval  VOC07 test    0.706  -    Faster R-CNN
  • YOLOv3: An Incremental Improvement
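
The mAP values in the SSD and Faster R-CNN tables above are measured on VOC07 test, which is conventionally scored with the 11-point interpolated average precision. The sketch below illustrates that metric for a single class. Whether the numbers above were produced with exactly this variant is an assumption, and the function names and inputs are hypothetical, chosen only for illustration. mAP is then the mean of the per-class AP values.

```python
def precision_recall(is_true_positive, num_ground_truth):
    """Build precision/recall points for one class.

    is_true_positive: one boolean per detection, with detections already
    sorted by descending confidence.
    num_ground_truth: number of annotated objects of this class.
    """
    recalls, precisions = [], []
    tp = 0
    for rank, hit in enumerate(is_true_positive, start=1):
        if hit:
            tp += 1
        recalls.append(tp / num_ground_truth)
        precisions.append(tp / rank)
    return recalls, precisions


def voc07_average_precision(recalls, precisions):
    """11-point interpolated AP.

    For each recall threshold 0.0, 0.1, ..., 1.0, take the highest precision
    reached at any recall >= threshold (0 if none), then average the 11 values.
    """
    total = 0.0
    for i in range(11):
        threshold = i / 10
        candidates = [p for r, p in zip(recalls, precisions) if r >= threshold]
        total += max(candidates) if candidates else 0.0
    return total / 11


if __name__ == "__main__":
    # Three detections for a class with two annotated objects.
    recalls, precisions = precision_recall([True, False, True], num_ground_truth=2)
    print(voc07_average_precision(recalls, precisions))
```

Deciding which detections count as true positives (typically by box IoU against unmatched ground-truth boxes) is a separate step that is left out here.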

Pose Estimation

  • OpenPose: Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields

Instance Segmentation

  • Mask R-CNN

Commands with PyTorchCV

Take PSPNet as an example. ("tag" can be any string, including an empty one.)

  • Training
cd scripts/seg/cityscapes/
bash run_fs_pspnet_cityscapes_seg.sh train tag
  • Resume Training
cd scripts/seg/cityscapes/
bash run_fs_pspnet_cityscapes_seg.sh train tag
  • Validate
cd scripts/seg/cityscapes/
bash run_fs_pspnet_cityscapes_seg.sh val tag
  • Testing
cd scripts/seg/cityscapes/
bash run_fs_pspnet_cityscapes_seg.sh test tag
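
The commands above all follow the same pattern: change into the script directory, then call the script with a stage and a tag. A small Python wrapper can make that pattern scriptable. This is only a sketch: `build_command` and `run_stage` are hypothetical helpers that are not part of PyTorchCV, and the only facts taken from the commands above are the script path, the three stage names, and the positional `tag` argument.

```python
import subprocess

# Taken from the commands above.
SCRIPT_DIR = "scripts/seg/cityscapes"
SCRIPT = "run_fs_pspnet_cityscapes_seg.sh"
STAGES = ("train", "val", "test")


def build_command(stage, tag=""):
    """Return the argument list for one stage (hypothetical helper)."""
    if stage not in STAGES:
        raise ValueError(f"unknown stage {stage!r}, expected one of {STAGES}")
    return ["bash", SCRIPT, stage, tag]


def run_stage(stage, tag="", dry_run=True):
    """Run one stage from inside the script directory.

    With dry_run=True (the default) nothing is executed and the command is
    only returned, which is useful for checking it first.
    """
    cmd = build_command(stage, tag)
    if not dry_run:
        # check=True raises CalledProcessError if the script exits non-zero.
        subprocess.run(cmd, cwd=SCRIPT_DIR, check=True)
    return cmd


if __name__ == "__main__":
    for stage in STAGES:
        print(" ".join(run_stage(stage, tag="exp1")))
```

Calling `run_stage("train", "exp1", dry_run=False)` from the repository root would then be equivalent to the Training command shown above with the tag `exp1`.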

Examples with PyTorchCV

Example output of VGG19-OpenPose
