# NeurIPS Lightning Talks

#### TensorFlow & Deep Learning SG

martin @ reddragon.ai

26 February 2019

• Machine Intelligence / Startups / Finance
• Moved from NYC to Singapore in Sep-2013
• 2014 = 'fun' :
• Machine Learning, Deep Learning, NLP
• Robots, drones
• Since 2015 = 'serious' :: NLP + deep learning
• & GDE ML; TF&DL co-organiser
• & Papers...
• & Dev Course...

• Google Partner : Deep Learning Consulting & Prototyping
• SGInnovate/Govt : Education / Training
• Products :
• Conversational Computing
• Natural Voice Generation - multiple languages
• Knowledgebase interaction & reasoning

## Outline

• whoami = DONE
• The Talks :
• Neural ODEs
• Image correspondences
• Learning ImageNet layer-by-layer
• Wrap-up

## Neural ODEs

• Mathematicians coming to DL
• Very different way of looking at NNs
• Co-Winner of NeurIPS 2018 Best Paper

## The Paper

Showing poster : David Duvenaud * (1 + ε)

## Foundation

• ResNets are common
• Each hidden layer is :
• a function of the previous one; PLUS
• a direct copy of the previous one
• For each layer : output = layer(input) + input
• In mathematics : $$h_{t+1} = f(h_t, \theta_t) + h_t$$
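The residual update above is just the layer's transformation plus an identity skip connection. A minimal sketch in plain Python, using a hypothetical one-parameter tanh squash as the layer function $$f$$:

```python
import math

def f(h, theta):
    # stand-in for the layer's learned transformation
    # (a hypothetical one-parameter tanh squash)
    return math.tanh(theta * h)

def res_block(h, theta):
    # output = layer(input) + input  -- the residual connection
    return f(h, theta) + h

print(res_block(0.5, 2.0))
```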

## The Idea

• $$h_{t+1} = f(h_t, \theta_t) + h_t$$
• $$h_{t+1} - h_t = f(h_t, \theta_t)$$
• $$h_{t+\delta} - h_t = f(h_t, \theta_t) \cdot \delta$$ # Step a fraction of a layer
• $${{dh_{t}}\over{dt}} = f(h_t, \theta_t, t)$$
• Suddenly, we have a Differential Equation!
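Read backwards, the derivation is exactly the Euler method: stacking residual updates with step $$\delta$$ approximates integrating the ODE. A toy check, assuming the simple dynamics $$dh/dt = -h$$ (exact solution $$h_0 e^{-t}$$):

```python
import math

def f(h, t):
    # toy dynamics: dh/dt = -h, so the exact solution is h0 * exp(-t)
    return -h

def euler(h0, t0, t1, steps):
    # each Euler step has exactly the residual form: h = h + f(h, t) * delta
    h, t = h0, t0
    delta = (t1 - t0) / steps
    for _ in range(steps):
        h = h + f(h, t) * delta
        t += delta
    return h

approx = euler(1.0, 0.0, 1.0, 1000)
exact = math.exp(-1.0)
```

With 1000 steps the two agree to a few parts in ten thousand; a 'ResNet' with 1000 tiny layers would compute the same thing.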

## So What?

• Differential Equations have :
• been studied for centuries
• well understood behaviours
• super-efficient solvers

## Still looks impractical...

• But we can train the parameters $$\theta_t$$ ...
• to optimise our Loss function $$L()$$...
• by finding the gradients (as usual) ...
• ... using the adjoint sensitivity method (1962) !
• We already have nice grad() machinery, and modern ODE solvers
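A toy scalar sketch of the adjoint idea (not the paper's implementation): choose dynamics $$dh/dt = \theta h$$ so the gradient has a closed form. With loss $$L = h(T)$$ and $$h_0 = 1$$, the true answers are $$h(T) = e^{\theta T}$$ and $$dL/d\theta = T e^{\theta T}$$; the adjoint $$a(t) = dL/dh(t)$$ is integrated backwards from $$a(T) = 1$$:

```python
import math

def f(h, theta):
    # toy dynamics with a closed-form solution: dh/dt = theta * h
    return theta * h

def loss_and_grad(h0, theta, T=1.0, steps=1000):
    dt = T / steps
    # forward pass: Euler-integrate and remember the trajectory
    hs = [h0]
    for _ in range(steps):
        hs.append(hs[-1] + f(hs[-1], theta) * dt)
    # backward pass: with L = h(T), the adjoint a(t) = dL/dh(t)
    # starts at 1 and obeys da/dt = -a * df/dh = -a * theta
    a, grad = 1.0, 0.0
    for k in range(steps, 0, -1):
        grad += a * hs[k - 1] * dt  # dL/dtheta = integral of a * df/dtheta
        a += a * theta * dt         # step the adjoint ODE backwards in time
    return hs[-1], grad

hT, g = loss_and_grad(1.0, 0.5)
# analytically: h(T) = e^{theta*T} and dL/dtheta = T * e^{theta*T}
```

The real method re-solves $$h(t)$$ backwards alongside the adjoint instead of storing the trajectory, which is where the memory savings come from.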

## In a nutshell

• The resulting algorithm is memory and time efficient
• Can explicitly trade off accuracy for speed
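The trade-off is concrete with a fixed-step solver: step count is the cost knob, and error falls as steps grow. A sketch on the same toy ODE $$dh/dt = -h$$:

```python
import math

def euler(h0, t1, steps):
    # fixed-step Euler solve of dh/dt = -h; exact answer is h0 * exp(-t1)
    h, dt = h0, t1 / steps
    for _ in range(steps):
        h += -h * dt
    return h

exact = math.exp(-1.0)
# fewer solver steps = cheaper but less accurate; more steps = the reverse
errors = {n: abs(euler(1.0, 1.0, n) - exact) for n in (4, 16, 64, 256)}
```

Adaptive solvers automate this knob via error tolerances, which is what the paper exploits at inference time.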

## Possibilities

• Moving to 'continuous layers' lets us :
• Do an RNN at irregular time intervals
• Cope with missing data easily
• Create Normalising flows (~ inverting a NN)

## Summary

• Illustrates how Mathematicians "Think Different"
• ... and opens up new possibilities
• Code on GitHub

## Image correspondences

• One 'standardly impressive' paper
• One 'crazy impressive' paper

## Model in a Picture

• Losses for finding points (based on ground-truth), and being geometrically consistent

## Model in a Picture

• Amazing thing : Weakly supervised training

## Weak Supervision

• Under-sold (IMHO) in the paper itself
• The training was only supervised via :
• This is a cat : This is another cat
• This is a cat : This is not a cat
• Learn to map the cat keypoints
• With this 'weak supervision', model still learns

## Summary

• Excellent techniques shown at NeurIPS ...
• ... being surpassed by crazier techniques
• Which also open up new possibilities

## Learning ImageNet layer-by-layer

• This shouldn't be possible
• Contradicts lots of accepted wisdom
• Lots of avenues for research

## Model in a Picture

• Freeze weights when moving on to next layer
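The loop above can be sketched on a toy 1-D problem (a hypothetical setup, not the paper's code): each "layer" is $$h \mapsto \tanh(w h)$$, trained jointly with a throwaway scalar head, then frozen before the next layer is added:

```python
import math

def train_layerwise(xs, ys, n_layers=2, lr=0.05, epochs=500):
    # greedy layer-wise training sketch: fit each layer's w with a
    # disposable scalar head v, freeze w, cache the layer's outputs,
    # and move on to the next layer
    inputs = list(xs)
    frozen = []
    for _ in range(n_layers):
        w, v = 0.5, 0.5
        for _ in range(epochs):
            gw = gv = 0.0
            for h, y in zip(inputs, ys):
                a = math.tanh(w * h)
                e = v * a - y                      # prediction error
                gv += 2 * e * a
                gw += 2 * e * v * h * (1 - a * a)  # tanh' = 1 - tanh^2
            w -= lr * gw / len(inputs)
            v -= lr * gv / len(inputs)
        frozen.append(w)                             # freeze this layer
        inputs = [math.tanh(w * h) for h in inputs]  # cache activations
    return frozen, v

xs = [i / 10 - 1 for i in range(21)]
ys = [math.tanh(1.5 * x) for x in xs]
ws, head = train_layerwise(xs, ys)
```

Because each layer's activations are cached once it is frozen, later layers never recompute the earlier ones, which is why the procedure costs little extra.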

## Training Accuracy

• Even 1-layer ImageNet is beneficial ...

## Lots of Ideas

• Full-model training not essential
• This procedure :
• Does not use (much) more computation (can cache results)
• Proves that a bad brain can be improved layer-wise
• Could allow 'compression' as the model is built
• Still early days for the implications, though

## Summary

• Still areas ripe for research
• Question everything ...

## Wrap-up

• NeurIPS 2018 was in Montréal, in December
• Already there is new stuff coming along
• Looking forward to more in 2019!

## Deep Learning Developer Course

• Module #1 : JumpStart (see previous slide)
• Each 'module' will include :
• In-depth instruction, by practitioners
• Individual Projects
• 70%-100% funding via IMDA for SG/PR
• Stay informed : http://bit.ly/rdai-courses-2019
• Location : SGInnovate/BASH

## RedDragon AI Intern Hunt

• Opportunity to do Deep Learning all day
• Work on something cutting-edge
• Location : Singapore
• Status : SG/PR FTW
• Need to coordinate timing...

## Conversational AI & NLP MeetUp

• http://bit.ly/convaisg
• Next Meeting : Date TBA, hosted at TBD
• Typical Contents :
• Application-centric talks
• Talks with technical content
• Lightning Talks
• Target : >2 Members !!

# - QUESTIONS -

### Martin @ RedDragon . AI

My blog : http://blog.mdda.net/

GitHub : mdda