tutorial-notes.md

June 6, 2017 · View on GitHub

Keras intro

btw: Karpathy’s course notes: cs231n.github.io
keras
- rising popularity
- tensorflow (apparently) going to have official support for keras https://github.com/fchollet/keras/issues/5050
- philosophy: user friendliness
- build model, then compile
sequential modelling: https://keras.io/getting-started/sequential-model-guide/
- a->b->c->d ; linear NN from input to output
functional modelling: https://keras.io/getting-started/functional-api-guide/
- you can do branching NN models, with multiple inputs and outputs at different places along the branching paths
not sure which activation function to use? —> ReLU (or maybe tanh)
- ReLU = max(0,x) but risk “killing” NN :( but good for sparse representation :)
- tanh is better than sigmoid b/c less saturation around ends and 0 maps to 0
convolve = filter (think image filters, like edge detection) —> = “get features”
pool = “chunk sections of the input”
- (why? to reduce parameters; some data is actually redundant; like humans can work with bad resolution)
- example: 4x4 —(max pool 2x2)—> each quarter —> just one 2x2 output
upsampling = opposite of maxpooling
dropout = “turn off” certain nodes/neurons —> why? so not all learning specific & generalizability & mitigate overfitting
- how? ~ varying “width” of a layer at each epoch

image classifiers with little data:

that website uses kaggle data
data augmentation = create transformations of data already have
- (con: risk overfitting <— high correlation)
- (note: instatiate generator, then actually use it)
but mitgate that overfitting risk <— dropout
80% accuracy already
use pre-trained network (e.g.: VGG16) —> 90% accuracy
- VGG16 has a bunch of pre-trained layers
- add “bottleneck features” selection somewhere before bottom/classification layer to apply to your specific classifier
can do even better than 90% (94%) if you fine-tune top layers of pre-trained network
- freeze layers up to last convolution block (assumed already good)
- fine-tune the last layer closest to the final fully-connected classifier
- recommended: set a slow learning rate (lr) to not wipe out previous learning
can do even even better with other suggested approaches

simplest autoencoder example

autoencoder is kind of like PCA or kind of like zipping files, but is more noisy (kind of like human memory? chunking?)
deep autoencoder = layered autoencoder
- from this link: NOTE TO SELF: TRY OUT TENSORBOARD CALLABLE FROM INTO KERAS VAE = generative model = creates probably likely output, instead of acting like a strict input-output function
- = input —(function)—> likey/probable output

more links:

https://cs231n.github.io/neural-networks-1/