conftensorflowtraining_day1

January 22, 2019

1-day Deep Learning with R workshop at RStudio::conf 2019

Essentials

Target audience

  • Data scientists proficient in R and machine learning.

Goals

  • Introduce data scientists to the fundamentals of deep learning.
  • Show how to implement ANN, CNN and RNN architectures in R using the keras package.
  • Show how to identify and avoid common pitfalls.

Practical Component

  • Hands-on application of deep learning to problems in supervised machine learning (regression and classification using multivariate data, images and text).

Slides

The slides shown during the presentation can be found here.

Data

The data sets can be found online here:

I've also made them available on my server in the same format used in the exercises (train, validation and test folders):

Structure

Session 1: Deep Learning Basics

Topics Covered:

  • What is a tensor and why use it?
  • What is keras and what is its relationship to TensorFlow?
  • What is the deep in deep learning? ANNs and densely-connected networks.
  • The math of deep learning: basics of matrix algebra, gradient descent, backpropagation, chain rule.
  • The four stages of deep learning.
  • Parameters and hyper-parameters.
  • Functions distinguishing classification and regression: loss and optimizer functions.
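
To make the gradient descent and chain rule topics above concrete, here is a minimal base R sketch, not taken from the workshop materials. It fits a one-feature linear model by hand; the simulated data, the learning rate `lr` and the epoch count are all assumptions chosen for illustration. It also shows the split between parameters (`w`, `b`, learned from data) and hyper-parameters (`lr`, `epochs`, chosen by the user).

```r
# Gradient descent for y_hat = w * x + b with mean squared error (MSE) loss.
# All data and settings below are made up for illustration.
set.seed(42)
x <- runif(100, min = -1, max = 1)
y <- 3 * x + 0.5 + rnorm(100, sd = 0.1)  # simulated target: w = 3, b = 0.5

w <- 0        # parameter, learned from the data
b <- 0        # parameter, learned from the data
lr <- 0.1     # hyper-parameter: learning rate
epochs <- 500 # hyper-parameter: number of passes over the data

for (epoch in seq_len(epochs)) {
  y_hat <- w * x + b   # forward pass
  err <- y_hat - y     # residuals
  # Chain rule on L = mean(err^2):
  #   dL/dw = mean(2 * err * x),  dL/db = mean(2 * err)
  grad_w <- mean(2 * err * x)
  grad_b <- mean(2 * err)
  # Step against the gradient
  w <- w - lr * grad_w
  b <- b - lr * grad_b
}

final_loss <- mean((w * x + b - y)^2)
print(c(w = w, b = b, loss = final_loss))
```

In keras, the optimizer carries out these update steps for every weight in the network, and backpropagation applies the same chain rule layer by layer.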

Workshop Dataset:

  • The Boston Housing Price dataset for regression, single-label multi-class classification, and binary classification.
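
The three task types differ mainly in the output activation and the loss function. The base R sketch below writes the usual pairings out by hand; it is an illustration under that assumption, not the workshop's code, and the helper names are hypothetical. In keras the corresponding loss strings are "mse", "binary_crossentropy" and "categorical_crossentropy", typically paired with a linear, "sigmoid" and "softmax" output layer respectively.

```r
# Regression: linear output, mean squared error
mse <- function(y_true, y_pred) mean((y_true - y_pred)^2)

# Binary classification: sigmoid output, binary cross-entropy
sigmoid <- function(z) 1 / (1 + exp(-z))
binary_crossentropy <- function(y_true, p, eps = 1e-7) {
  p <- pmin(pmax(p, eps), 1 - eps)  # clip to avoid log(0)
  -mean(y_true * log(p) + (1 - y_true) * log(1 - p))
}

# Single-label, multi-class classification: softmax output,
# categorical cross-entropy against a one-hot label
softmax <- function(z) {
  e <- exp(z - max(z))  # subtract the max for numerical stability
  e / sum(e)
}
categorical_crossentropy <- function(y_onehot, p, eps = 1e-7) {
  -sum(y_onehot * log(pmax(p, eps)))
}

# Small made-up inputs
reg_loss <- mse(c(1, 2, 3), c(1, 2, 5))
bin_loss <- binary_crossentropy(c(1, 0), sigmoid(c(0, 0)))
cat_loss <- categorical_crossentropy(c(0, 1, 0), softmax(c(0, 0, 0)))
```

An uninformative binary classifier (probability 0.5 everywhere) gives a loss of log(2), and an uninformative three-class classifier gives log(3), which are useful baselines when reading training curves.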

DIY Exercise Datasets:

  • The UCI Abalone dataset: predict ring number as a categorical or continuous variable.

Files

File                                          Description
0_1_Classic_ML.Rmd                            Some context from classical Machine Learning
1_0_Boston_reg.R                              Deep Learning for Regression, plain R script
1_1_DL_Basics_Regression.Rmd                  Deep Learning for Regression
1_2_DL_Basics_Binary_Classification.Rmd       Deep Learning for Binary Classification
1_3_DL_Basics_Multi-class_Classification.Rmd  Deep Learning for Single-label, Multi-class Classification

Session 2: Building better models: Evaluating and Optimizing Models

Topics Covered:

  • The training, validation and test sets.
  • Four ways of dealing with over-fitting: more data, reduced capacity, dropout, and weight regularization.
  • The universal workflow of machine learning.
  • Introduction to the tfruns package for evaluating and comparing runs.
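
As a sketch of the training, validation and test split, the base R snippet below divides a simulated data frame 60/20/20 and standardizes the features using statistics from the training set only. The data, the split proportions and the helper name `standardize` are assumptions for illustration, not the workshop's own code.

```r
set.seed(123)
n <- 1000
dat <- data.frame(x1 = rnorm(n, mean = 5), x2 = rnorm(n, sd = 3))
dat$y <- 2 * dat$x1 - dat$x2 + rnorm(n)  # simulated outcome

# Shuffle row indices once, then cut into three disjoint sets
idx <- sample(n)
n_train <- round(0.6 * n)
n_val <- round(0.2 * n)
train_idx <- idx[1:n_train]
val_idx <- idx[(n_train + 1):(n_train + n_val)]
test_idx <- idx[(n_train + n_val + 1):n]

train <- dat[train_idx, ]
val <- dat[val_idx, ]
test <- dat[test_idx, ]

# Compute scaling statistics on the training set only, then apply the
# same values to validation and test data to avoid information leakage
features <- c("x1", "x2")
mu <- colMeans(train[features])
sdv <- apply(train[features], 2, sd)
standardize <- function(df) {
  df[features] <- as.data.frame(scale(df[features], center = mu, scale = sdv))
  df
}
train_s <- standardize(train)
val_s <- standardize(val)
test_s <- standardize(test)
```

The validation set guides choices such as capacity or dropout rate, and the test set is touched only once, for the final estimate.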

DIY Exercise:

  • The UCI Abalone data-set, predict ring number as a categorical or continuous variable.

Files

File                            Description
2_1_Eval-Optim_validation.Rmd   Applying validation
2_2_Eval-Optim_Overfitting.Rmd  Avoiding over-fitting
2_3_Eval-Optim_Capacity.Rmd     Changing capacity
2_4_Eval-Optim_tfruns.Rmd       Using the tfruns package

Session 3: Image processing

Topics Covered:

  • Requirements of computer vision not met with ANNs.
  • New layers: convolution, maximum pooling.
  • Revisiting over-fitting.
  • CNNs as an extension of densely-connected networks.
  • Accessing individual layers of trained models.
  • Using pre-trained models to increase accuracy.
  • Ethics of machine learning: predicting beyond the bounds of the training set.
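
The two new layer types can be illustrated without any deep learning library. The base R sketch below applies a single 3x3 filter to a tiny made-up image and then 2x2 max pooling; the function names `conv2d_valid` and `max_pool_2x2` are hypothetical helpers, and the operation is the cross-correlation that deep learning libraries conventionally call convolution. In keras the corresponding layers are `layer_conv_2d()` and `layer_max_pooling_2d()`, which learn many such filters rather than using a fixed one.

```r
# "Valid" 2D convolution (no padding, stride 1) of one image with one filter
conv2d_valid <- function(img, kernel) {
  kh <- nrow(kernel)
  kw <- ncol(kernel)
  out <- matrix(0, nrow(img) - kh + 1, ncol(img) - kw + 1)
  for (i in seq_len(nrow(out))) {
    for (j in seq_len(ncol(out))) {
      patch <- img[i:(i + kh - 1), j:(j + kw - 1)]
      out[i, j] <- sum(patch * kernel)  # element-wise product, then sum
    }
  }
  out
}

# Non-overlapping 2x2 max pooling
max_pool_2x2 <- function(x) {
  out <- matrix(0, nrow(x) %/% 2, ncol(x) %/% 2)
  for (i in seq_len(nrow(out))) {
    for (j in seq_len(ncol(out))) {
      out[i, j] <- max(x[(2 * i - 1):(2 * i), (2 * j - 1):(2 * j)])
    }
  }
  out
}

# 6x6 image: dark left half, bright right half (a vertical edge)
img <- matrix(0, 6, 6)
img[, 4:6] <- 1

# Vertical-edge filter; columns are (-1, 0, 1) because matrix() fills by column
kernel <- matrix(c(-1, -1, -1, 0, 0, 0, 1, 1, 1), nrow = 3)

feature_map <- conv2d_valid(img, kernel)  # 4x4, responds where the edge is
pooled <- max_pool_2x2(feature_map)       # 2x2, keeps the strongest response
```

The same small filter is reused at every position, which is why a convolutional layer needs far fewer weights than a dense layer on the same image.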

Workshop Dataset:

  • Dog versus cat images -- binary classification.

DIY Exercise:

  • Malaria histology images -- binary classification.
  • The Labradoodle versus fried chicken image dataset.

Files

File                                  Description
3_1_Computer_Vision_Intro.Rmd         Working with images
3_2_Computer_Vision_Augmentation.Rmd  Using Image Augmentation to reduce over-fitting
3_3_Computer_Vision_Optimization.Rmd  Using pre-trained Convnets
3_4_Computer_Vision_Fine-tuning.Rmd   Optimizing pre-trained Convnets
3_5_Computer_Vision_Visualising.Rmd   Visualizing layers

Session 4: Text processing

Topics Covered:

  • Formatting text for neural networks.
  • One-hot encoding vs embedding.
  • RNNs and LSTM compared to ANNs.
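
The difference between one-hot encoding and an embedding can be sketched in base R. The two toy documents, the embedding dimension and the random embedding values below are assumptions for illustration; in a real model the embedding matrix is learned during training (in keras, by `layer_embedding()`).

```r
docs <- c("the cat sat", "the dog sat down")  # made-up corpus
tokens <- strsplit(docs, " ")
vocab <- unique(unlist(tokens))
word_index <- setNames(seq_along(vocab), vocab)

# One-hot (multi-hot) encoding: one column per vocabulary word,
# 1 if the word occurs in the document. Sparse, and word order is lost.
one_hot <- t(sapply(tokens, function(tk) as.numeric(vocab %in% tk)))
colnames(one_hot) <- vocab

# Embedding: each word maps to a short dense vector via a lookup table.
# Random values stand in for weights that training would learn.
set.seed(1)
embedding_dim <- 3
embedding <- matrix(rnorm(length(vocab) * embedding_dim),
                    nrow = length(vocab),
                    dimnames = list(vocab, NULL))

doc1_ids <- word_index[tokens[[1]]]                  # integer sequence
doc1_vectors <- embedding[doc1_ids, , drop = FALSE]  # one row per word

# A lookup is equivalent to multiplying a one-hot vector by the matrix
onehot_cat <- as.numeric(vocab == "cat")
via_matmul <- as.numeric(onehot_cat %*% embedding)
via_lookup <- as.numeric(embedding["cat", ])
```

The embedded document keeps word order as a sequence of vectors, which is the input format that RNN and LSTM layers expect.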

Workshop Dataset:

  • Reuters Newswire dataset -- single label, multi-class classification with text.

DIY Exercise:

  • The IMDB movie sentiment dataset -- binary classification.

Files

File                                          Description
4_1_Text_Analysis_One-Hot.Rmd                 Basic text analysis using one-hot encoding.
4_2_Text_Analysis_Word-Embeddings.Rmd         Training word embeddings.
4_3_Text_Analysis_pre-trained_embeddings.Rmd  Using pre-trained word embeddings.
4_4_Text_Analysis_Simple-RNNs.Rmd             Understanding RNNs.
4_5_Text_Analysis_RNN-on-Reuters.Rmd          Applying RNNs.
4_6_Text_Analysis_LSTMs.Rmd                   Applying LSTMs.