- Machine Intelligence / Startups / Finance
- Moved from NYC to Singapore in Sep-2013
- 2014 = 'fun' :
- Machine Learning, Deep Learning, NLP
- Robots, drones
- Since 2015 = 'serious' : NLP + deep learning
- Intro to tools :
- Dense, CNN, RNN, Embedding
- Goal : Captioning
- Sequence Learning
- Embedding choice?
- Model choice!
- Demo (with voice-over)
Change weights to change output function
Layers of neurons combine and can form more complex functions
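A minimal sketch of both statements, in plain Python. The sigmoid activation and every weight here are hand-picked for illustration and are not taken from the talk: one neuron can only produce a single 'step', but three neurons in two layers can produce a 'bump'.

```python
import math

def neuron(inputs, weights, bias):
    """Weighted sum of the inputs, squashed into (0, 1) by a sigmoid."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-total))

def tiny_network(x):
    """Two hidden neurons feeding one output neuron (illustrative weights)."""
    h1 = neuron([x], [10.0], -2.0)   # rises as x passes 0.2
    h2 = neuron([x], [10.0], -8.0)   # rises as x passes 0.8
    # High only when h1 is on and h2 is off, i.e. for x between the two steps.
    return neuron([h1, h2], [10.0, -10.0], -5.0)

# Changing a weight changes the function a single neuron computes.
rising = neuron([1.0], [2.0], 0.0)
falling = neuron([1.0], [-2.0], 0.0)
```

Calling `tiny_network` at 0.0, 0.5 and 1.0 gives low, high and low outputs respectively, a shape no single sigmoid neuron can produce.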
- Goal : Predict Output for a given Input
- Train using known Input and Output data
- The blame game (aka Gradient Descent)
- Deep networks 'create' features
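The 'blame game' can be sketched on the smallest possible model, a line `y = w*x + b`. This is an illustrative example only: the data, the learning rate and the helper name `train_linear` are assumptions, not part of the talk. Each parameter is nudged in proportion to how much it contributed to the error.

```python
def train_linear(data, lr=0.05, epochs=2000):
    """Fit y = w*x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(data)
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in data:
            error = (w * x + b) - y          # how wrong is the prediction?
            grad_w += 2.0 * error * x / n    # share of the blame for w
            grad_b += 2.0 * error / n        # share of the blame for b
        w -= lr * grad_w                     # step against the gradient
        b -= lr * grad_b
    return w, b

# Known inputs and outputs, generated from y = 2x + 1.
data = [(x, 2.0 * x + 1.0) for x in [0.0, 1.0, 2.0, 3.0, 4.0]]
w, b = train_linear(data)
```

Deep networks apply the same idea, with the chain rule passing the blame back through every layer.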
- Pixels in an image are 'organised'
- Idea : Use whole image as feature
- Update parameters of 'Photoshop filters'
- Mathematical term : 'convolution kernel'
- CNN = Convolutional Neural Network
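A rough sketch of what a 'Photoshop filter' does, in plain Python. Like most deep learning libraries, it slides the kernel without flipping it (strictly a cross-correlation). The edge-detecting kernel is hand-written here as an assumption; in a CNN its values would be learned.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image and sum the element-wise products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

# A tiny image: dark on the left, bright on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
edge_kernel = [[-1, 1]]          # responds where brightness increases
edges = convolve2d(image, edge_kernel)
# edges == [[0, 1, 0], [0, 1, 0], [0, 1, 0]]
```

The same small set of kernel parameters is reused at every position, which is why the organisation of the pixels matters.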
Variable-length input doesn't "fit"
- Run network for each timestep
- ... with the same parameters
- But 'pass along' internal state
- This state is 'hidden depth'
- ... and should learn features that are useful
- ... because everything is differentiable
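The recurrence above can be sketched with a single scalar hidden state. This is a deliberately minimal plain RNN, not a GRU, and the parameter values are arbitrary assumptions; a real network uses vectors and learned weights.

```python
import math

def rnn_step(x, h, w_x, w_h, b):
    """One timestep: mix the new input with the state passed along."""
    return math.tanh(w_x * x + w_h * h + b)

def run_rnn(sequence, w_x=0.5, w_h=0.8, b=0.0):
    """Apply the same parameters at every timestep of any-length input."""
    h = 0.0                      # initial hidden state
    states = []
    for x in sequence:
        h = rnn_step(x, h, w_x, w_h, b)
        states.append(h)
    return states

short = run_rnn([1.0, 0.0])
longer = run_rnn([1.0, 0.0, 0.0, 2.0, 1.0, 0.5, 0.0])
```

In `short`, the second state is non-zero even though the second input is zero, because the hidden state carries information forward from the first step.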
Gated Recurrent Units
- Major advance in ~ 2013
- Words that are nearby in the text should have similar representations
- Assign a vector (~300d) to each word
- Slide a 'window' over the text (1Bn words?)
- Word vectors are nudged around to minimise surprise
- Keep iterating until 'good enough'
- The vector-space of words self-organizes...
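A sketch of the two ingredients named above: the sliding window that defines 'nearby', and a similarity measure for the resulting vectors. The helper names are hypothetical, and the training step that actually nudges the vectors is left out.

```python
import math

def context_pairs(tokens, window=2):
    """List every (centre word, nearby word) pair under a sliding window."""
    pairs = []
    for i, centre in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((centre, tokens[j]))
    return pairs

def cosine_similarity(u, v):
    """1.0 for vectors pointing the same way, 0.0 for orthogonal ones."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

pairs = context_pairs("a brown dog plays".split(), window=1)
```

Training would repeatedly adjust the vectors so that words appearing in the same pairs end up with a high cosine similarity.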
Embedding in a Picture
Image → Caption
- large brown dog running away from the sprinkler in the grass .
- a brown dog chases the water from a sprinkler on a lawn .
- a brown dog running on a lawn near a garden hose
- a brown dog plays with the hose .
- a dog is playing with a hose .
Data Set : Flickr30k
- Summary statistics :
- 31,783 images
- 158,915 human-created captions
- Attribution-style licensing :
- P. Young, A. Lai, M. Hodosh, and J. Hockenmaier. From image description to visual denotations: New similarity metrics for semantic inference over event descriptions. Transactions of the Association for Computational Linguistics, 2014.
- Word-by-word (Test)
- Teacher forcing (Training)
- Embedding choices
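The difference between the two modes can be sketched as follows. The `<start>` and `<end>` markers and the `next_word` callback are assumptions made for illustration; the talk's actual model interface is not shown in the slides.

```python
START, END = "<start>", "<end>"

def teacher_forcing_pairs(caption_tokens):
    """Training: feed the true previous word, predict the true next word."""
    inputs = [START] + caption_tokens
    targets = caption_tokens + [END]
    return list(zip(inputs, targets))

def generate(next_word, max_len=20):
    """Test time: feed the model's own previous output back in, word by word."""
    words = []
    previous = START
    for _ in range(max_len):
        previous = next_word(previous, words)
        if previous == END:
            break
        words.append(previous)
    return words

training_pairs = teacher_forcing_pairs(["a", "dog"])

# A stand-in 'model' that replays a canned caption (hypothetical).
canned = ["a", "dog", END]
caption = generate(lambda previous, words: canned[len(words)])
```

With teacher forcing, an early mistake does not derail the rest of the training sequence; at test time there is no ground truth, so every word depends on the model's earlier choices.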
Basic Layout : Test Time
- Word Vector
- One-Hot embedding
- Use each word's numeric index
- Fixed dimension, independent of vocab size
- Stop words may be 'murky'
- Action words need definitions
- Often used as input stage
- Vocab ~7k ⇒ vector.len == 7000
- Very high number of 0/1 inputs
- Often used as output stage
idx = ArgMax( Softmax() )
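A plain-Python sketch of both uses of one-hot vectors: as a sparse input, and as the target shape for a softmax output. The scores below are made up; in the model they would come from the final layer.

```python
import math

def one_hot(index, vocab_size):
    """A vector of zeros with a single 1.0 at the word's index."""
    vec = [0.0] * vocab_size
    vec[index] = 1.0
    return vec

def softmax(scores):
    """Turn arbitrary scores into probabilities that sum to 1."""
    top = max(scores)                              # subtract max for stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=values.__getitem__)

word_vector = one_hot(3, 7000)                     # input stage
probabilities = softmax([1.0, 3.0, 2.0])           # output stage (toy scores)
idx = argmax(probabilities)                        # idx == 1
```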
- Low dimensionality
- 14 binary digits for 7k vocabulary
- Difficult to believe it works
- Add resilience using ECC
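A sketch of the binary encoding, using the 14 digits quoted above (2**14 = 16,384 codes, enough for a ~7k vocabulary). The function names are hypothetical, the 0.5 threshold is an assumption, and the error-correcting code is not implemented here.

```python
def to_bits(index, n_bits=14):
    """Encode a word index as a list of binary digits, most significant first."""
    return [(index >> k) & 1 for k in reversed(range(n_bits))]

def from_bits(bits):
    """Decode a list of binary digits back into a word index."""
    index = 0
    for bit in bits:
        index = (index << 1) | int(bit)
    return index

def from_outputs(outputs, threshold=0.5):
    """Decode network outputs in [0, 1] by thresholding each one."""
    return from_bits([1 if o > threshold else 0 for o in outputs])

code = to_bits(6999)
decoded = from_bits(code)
```

A single flipped bit decodes to a completely different word, which is the motivation for adding redundancy with an error-correcting code.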
- Action+Stop words : 141-d
- Word Embedding : 50-d
- Concatenate them
- Use as input stage
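A sketch of the combined input. How the two parts are filled for each word is an assumption here: a word missing from either table simply gets zeros in that part. The word lists and vector values are placeholders.

```python
def build_input(word, special_words, embeddings, embed_dim=50):
    """Return the one-hot part followed by the embedding part."""
    one_hot = [0.0] * len(special_words)
    if word in special_words:
        one_hot[special_words.index(word)] = 1.0
    # Words without an embedding fall back to a zero vector (assumption).
    dense = list(embeddings.get(word, [0.0] * embed_dim))
    return one_hot + dense

# Placeholder tables: 141 action/stop words, and one 50-d word vector.
special_words = ["special_%d" % i for i in range(141)]
embeddings = {"dog": [0.1] * 50}

dog_input = build_input("dog", special_words, embeddings)
stop_input = build_input("special_3", special_words, embeddings)
```

Both results have 141 + 50 = 191 entries, matching the two dimensions listed above.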
- Dilated CNNs
- Residual connections
- Gated Linear Units
- Fishing Nets
- Attention-is-all-you-need Layer
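Three of these building blocks can be sketched in a few lines of plain Python. This is a 1-D, single-channel illustration under the assumption of causal convolutions (each output sees only current and earlier positions); the models in the talk may be arranged differently.

```python
import math

def dilated_causal_conv(sequence, kernel, dilation):
    """1-D convolution whose taps are spaced `dilation` steps apart."""
    out = []
    for t in range(len(sequence)):
        total = 0.0
        for k, weight in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:                      # positions before the start are skipped
                total += weight * sequence[idx]
        out.append(total)
    return out

def receptive_field(kernel_size, dilations):
    """How many input positions one output sees after stacking the layers."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)

def residual(inputs, layer_outputs):
    """Skip connection: add the layer's input back onto its output."""
    return [x + y for x, y in zip(inputs, layer_outputs)]

def glu(values, gates):
    """Gated Linear Unit: each value is scaled by a sigmoid of its gate."""
    return [v / (1.0 + math.exp(-g)) for v, g in zip(values, gates)]

conv_out = dilated_causal_conv([1, 2, 3, 4], [1, 1], dilation=2)
field = receptive_field(2, [1, 2, 4, 8])
```

Doubling the dilation at each layer makes the receptive field grow exponentially with depth, so a few layers can cover a whole caption.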
- Fix activation/parameter explosion problems
- New Layer that learns scaling parameters :
- Squash layer to
- In Keras :
- Newer ideas : LayerNorm (not in Keras)
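As a rough illustration of a normalisation layer with learned scaling parameters, here is a plain-Python sketch. It assumes the slide refers to batch-normalisation-style layers; the names `gamma` and `beta` follow the usual convention, and none of this is the Keras implementation.

```python
import math

def normalise(values, gamma=1.0, beta=0.0, eps=1e-5):
    """Shift to zero mean and unit variance, then apply learned scale and shift."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return [gamma * (v - mean) / math.sqrt(variance + eps) + beta
            for v in values]

activations = [1.0, 2.0, 3.0, 4.0]
normalised = normalise(activations)
rescaled = normalise(activations, gamma=2.0, beta=1.0)
```

If `values` holds one unit's activations across a batch, this is the batch-norm style; if it holds all the units of one example, it is the layer-norm style.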
Introduced by Microsoft in their
Skip connections now very common
See the very recent
Network Picture 1
'Standard' GRU set-up
Network Picture 2
Dilated CNN set-up (many variants)
Network Picture 3
Facebook CNN set-up (radically simplified)
Network Picture 4
Attention is All you Need (Google, T+7 days)
Image → Caption :
- cables burning gracefully pin shine spoons arrange marshy solar board briefs claps tickets survey disinterested tractor looked movies guns rows engine technical town plaza fat captain paddlers historic motorcyclist soccer scales arabian
- does crown items bug pause ink what kayakers ohio lettering bikes battle squeezing person clad
- Input is 141d one-hot + 50d embedding
- Output is ~7,000 softmax one-hot
- Internal width ~200 units
- No special learning rate adjustments
- 50 epochs take ~ 3.5 hrs
Image → Caption : GRUs
- a black dog running on a park .
- two big dogs play ball across the grass .
- the dog is being blocked by three other men each of it to something .
- a dog chases a ball while a man in a vest holding the hand .
- a man and a dog are chasing with a frisbee in the grass .
Results : Dilated CNN
- the brown dog is standing on a yard .
- one dog bites another baseball player has found behind in the background .
- a dog running in a field leaps onto a field .
- two brown dogs are playing with a ball at a park .
- a brown dog runs his white dog while he is running along in winter grass .
- a gray dog is running on a grass field .
- a dog jumping off over a bush .
- a dog on a leash is near a fountain .
- a brown dog is running through the muddy rain .
- a one dog with a brown jacket is playing in an enclosed setting .
Results : AIAYN
- two dogs play in the grass .
- two dogs race by the two dogs fight to a grassy yard .
- the brown dogs lead beside two fire .
- two colored dog on a dogs to a metal tunnel .
- one dog chases after a brown dog on the park .
- This session was more challenging
- Lots of innovation in NLP
- Having a GPU is VERY helpful
* Please add a star... *
Deep Learning : 1-day Intro
- Level : Beginner+
- Date : 24-June-2017
- Basic plan :
- 9:30am-4pm+ on a Saturday
- Play with real models
- Ask questions 1-on-1
- Get inspired
Cost: S$15 (lunch included) FULL
8-week Deep Learning
- July - Sept (catch-up during August)
- Weekly 3-hour sessions will include :
- Projects : 3 structured & 2 self-directed
- More information : http://RedCatLabs.com/course
- Expect to work hard...