Tips and Tricks

TensorFlow & Deep Learning SG

Martin Andrews @ redcatlabs.com

20 July 2017

About Me

  • Machine Intelligence / Startups / Finance
    • Moved from NYC to Singapore in Sep-2013
  • 2014 = 'fun' :
    • Machine Learning, Deep Learning, NLP
    • Robots, drones
  • Since 2015 = 'serious' : NLP + deep learning
    • & Papers...

Show-of-Hands

No other obligation...

  • Been here before?
  • Want something easy?
  • Want something advanced?
  • Watch Engineers.sg?
  • Listen to podcasts?
  • Want another Beginners Day?

Outline

  • Beginner / Newbie
  • Day-to-Day user
  • Research...

Beginner / Newbie

  • Starting Out
  • Installation
  • Size vs Structure
  • Sizing & Complexity

Starting Out

  • Lots to watch & read & watch & read & ...
  • Better : Set a deadline, and do something
  • Install → MNIST → Blog+GitHub

Installation

  • Windows vs OSX vs Linux
  • Find newest instructions
  • Upgrade ⇒ breakage
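
A quick post-install sanity check can save a lot of confusion; a minimal sketch (the version shown is just what was current in mid-2017, and the TF 1.x Session API matches the era of this talk):

    import tensorflow as tf

    print(tf.__version__)                               # e.g. '1.2.1' in mid-2017
    print(tf.Session().run(tf.constant('Hello, TF')))   # exercises the runtime itself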

GitHub models

  • Is this the best starting point?
  • Is this using the same versions?
  • Don't underestimate "taste"

Size vs Structure

  • Common concerns :
    • How many layers?
    • How wide?
    • What activation function?
  • Better to look at bigger picture

Sizing

  • Get something that works :
    • Train for ~15mins
    • Should show improvement
  • Make small, but definite changes
  • Prove that it has an effect

Complexity

  • Do a back-of-envelope estimate :
    • 1 training example per parameter
    • Keep images small
    • Can you do the task in <1 sec?
  • Keras : model.summary() (see the sketch below)
  • Natural Language Processing :
    • Embeddings : 50d should work
    • Vocab >8k is nice-to-have
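
To make the back-of-envelope concrete, a minimal Keras sketch (layer sizes are illustrative only): let model.summary() count the parameters, then compare against the rule of thumb above.

    from keras.models import Sequential
    from keras.layers import Dense

    model = Sequential([
        Dense(128, activation='relu', input_shape=(784,)),  # 784*128 + 128 = 100,480 params
        Dense(10, activation='softmax'),                    # 128*10  + 10  =   1,290 params
    ])
    model.summary()   # prints per-layer and total counts (~101,770 here)
    # MNIST has 60,000 training examples -- so by the
    # "1 training example per parameter" rule of thumb,
    # this model is already on the generous side.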

Day-to-Day

  • Simple → Complex
  • CPU vs GPU
  • Improving model performance
  • Understand limitations
  • Complex → Simple

Simple → Complex

  • Errors are very difficult to reason about
  • Make sure model works at every step
  • When in trouble : Make it trivial
  • See the code example below
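
A sketch of "make it trivial", with synthetic data standing in for yours: shrink the problem to a handful of examples the network must be able to memorise, and debug from there.

    import numpy as np
    from keras.models import Sequential
    from keras.layers import Dense

    # 10 random inputs with 10 distinct one-hot labels -- any healthy
    # model + pipeline should drive the loss on these to ~zero.
    x_tiny = np.random.rand(10, 784).astype('float32')
    y_tiny = np.eye(10)

    model = Sequential([
        Dense(128, activation='relu', input_shape=(784,)),
        Dense(10, activation='softmax'),
    ])
    model.compile(optimizer='adam', loss='categorical_crossentropy',
                  metrics=['accuracy'])
    model.fit(x_tiny, y_tiny, epochs=200, verbose=0)
    print(model.evaluate(x_tiny, y_tiny, verbose=0))   # expect ~[0.0, 1.0]
    # If accuracy is not ~1.0 here, suspect a bug in the pipeline,
    # not a lack of model capacity.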

CPU vs GPU

  • Experiment cycle speed is key
  • No GPU to play with?
    • If this is 'serious' then spend the money
    • One local GPU, use cloud for more
    • Keras : model.fit_generator()
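
The point of fit_generator() : a plain Python generator prepares batches on the CPU while the GPU trains on the previous one. A minimal sketch with synthetic data:

    import numpy as np
    from keras.models import Sequential
    from keras.layers import Dense

    def batch_generator(x, y, batch_size=32):
        # Runs on the CPU : decoding / augmentation would go here,
        # overlapping with GPU training on the previous batch.
        while True:                          # Keras expects an endless generator
            idx = np.random.randint(0, len(x), batch_size)
            yield x[idx], y[idx]

    x = np.random.rand(1000, 20).astype('float32')
    y = np.eye(2)[np.random.randint(0, 2, 1000)]

    model = Sequential([Dense(16, activation='relu', input_shape=(20,)),
                        Dense(2, activation='softmax')])
    model.compile(optimizer='adam', loss='categorical_crossentropy')
    model.fit_generator(batch_generator(x, y),
                        steps_per_epoch=len(x) // 32, epochs=2)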

Improving Model Performance

  • DropOut (generalisation via ensembles) : see the sketch after this list
  • BatchNorm
    • or 'cleaner' LayerNorm
    • or simple WeightNorm (parameter-free)
  • Adaptive gradient methods
    • Adam, Adadelta, etc
    • RMSProp is ~BatchNorm for gradients
  • Hyperparameter search? Meh
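
A sketch wiring the items above into one small stack (layer sizes, rates and the learning rate are illustrative, not recommendations):

    from keras.models import Sequential
    from keras.layers import Dense, Activation, Dropout, BatchNormalization
    from keras.optimizers import Adam

    model = Sequential([
        Dense(128, input_shape=(784,)),
        BatchNormalization(),            # normalise before the non-linearity
        Activation('relu'),
        Dropout(0.5),                    # the 'ensembles' regularisation above
        Dense(10, activation='softmax'),
    ])
    model.compile(optimizer=Adam(lr=1e-3),   # adaptive gradient method
                  loss='categorical_crossentropy', metrics=['accuracy'])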

Understand Limitations

  • Vary network structure
  • Have a hypothesis before you run
  • Explain what went wrong

Complex → Simple

  • When something works super-well
    • Try to destroy it
  • Smaller may be better...
  • Or the NN may be cheating

Research

  • Implement papers
  • Reproducibility
  • Dare to Succeed

Implement Papers

  • À la Andrew Ng
  • Read → Implement → Improve
  • Publish on arXiv & conference

Reproducibility

  • Fix baseline data and model
  • Random Seed(s) : see the sketch below
  • Add a layout parameter
  • Don't forget how to program
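
For the random seed(s), a TF 1.x-era sketch (matching the 2017 setting of this talk):

    import random
    import numpy as np
    import tensorflow as tf

    random.seed(42)          # Python's own RNG
    np.random.seed(42)       # NumPy : shuffling, Keras weight init
    tf.set_random_seed(42)   # TensorFlow graph-level seed
    # Note : some GPU ops (e.g. cuDNN convolutions) can still be
    # non-deterministic even with all seeds fixed.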

github.com/mdda/deep-learning-workshop/notebooks/2-CNN/7-Captioning/4-run-captioning.ipynb

Dare to Succeed

  • Deep Learning is super-new
  • Need something harder than MNIST...
  • No need to be in MAFIA's shadow

Wrap-up

  • Beginners : Start!
  • Day-to-day : Experimentation
  • Research : Don't be timid

GitHub - mdda

* Please add a star... *

- QUESTIONS -


Martin.Andrews @ RedCatLabs.com


My blog : http://mdda.net/

GitHub : mdda

Deep Learning MeetUp Group

8-week Deep Learning Developer Course

  • Sept - Oct+
  • Twice-weekly 3-hour sessions will include :
    • Instruction
    • Projects : 3 structured & 2 self-directed
  • Pre-register : http://RedCatLabs.com/course
  • ( more later )