  • whoami = DONE
  • Word Embeddings
  • Other Sequence Embeddings
  • Other Object Embeddings
  • Multi-Object Embeddings
  • Wrap-up

Word Embeddings

  • Major advance around 2013
  • Words that are nearby in the text should have similar representations
  • Assign a vector (~300-d) to each word
    • Slide a 'window' over the text (1Bn words?)
    • Word vectors are nudged around to minimise surprise
    • Keep iterating until 'good enough'
  • The vector space of words self-organises...
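
The 'sliding window' step can be sketched in a few lines. This is a minimal illustration, not the exact procedure of any particular library: the function name `window_pairs` and the window size are assumptions made for the example. Each (centre, context) pair it produces is one training signal telling the model that these two words appeared near each other.

```python
def window_pairs(tokens, window=2):
    """Return (centre, context) pairs for every position in the token list."""
    pairs = []
    for i, centre in enumerate(tokens):
        # Clip the window at both ends of the text
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((centre, tokens[j]))
    return pairs


# Toy text (hypothetical); a real corpus would be far larger
pairs = window_pairs("the cat sat on the mat".split(), window=1)
for centre, context in pairs[:4]:
    print(centre, "->", context)
```

A wider window tends to capture more topical similarity and a narrower one more syntactic similarity, so the window size is worth treating as a tunable setting.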

Word Embedding

Moving a window

Word Embedding

  • Example 'neighbourhoods' in 300-d space
Word Similarity

(e.g. word2vec or GloVe)
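
A 'neighbourhood' lookup is usually just a cosine-similarity ranking. The sketch below uses hand-made 3-d vectors as stand-ins (real word2vec or GloVe vectors would be ~300-d and loaded from a file); the helper name `nearest` is an assumption for illustration.

```python
import numpy as np


def nearest(word, vocab, vectors, k=3):
    """Return the k words whose vectors have the highest cosine similarity to `word`."""
    unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    sims = unit @ unit[vocab.index(word)]
    order = np.argsort(-sims)  # most similar first
    return [(vocab[i], float(sims[i])) for i in order if vocab[i] != word][:k]


# Toy vectors, made up for this example
vocab = ["cat", "dog", "car", "truck"]
vectors = np.array([
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.1],
    [0.1, 0.9, 0.2],
    [0.0, 0.8, 0.3],
])

neighbours = nearest("cat", vocab, vectors, k=1)
print(neighbours)
```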

Embedding Visualisation

TensorBoard Embeddings

TensorBoard FTW!

Embedding Geometry

Not clear why this works...
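
Assuming the 'geometry' here refers to the well-known vector-offset analogies (king - man + woman ≈ queen), the arithmetic can be sketched as below. The 2-d vectors are hand-built so that one axis loosely stands for 'royalty' and the other for 'gender'; trained embeddings are not this tidy, and the function name `analogy` is hypothetical.

```python
import numpy as np

# Toy vectors, constructed by hand for illustration only
vectors = {
    "king":  np.array([1.0,  1.0]),
    "queen": np.array([1.0, -1.0]),
    "man":   np.array([0.0,  1.0]),
    "woman": np.array([0.0, -1.0]),
    "apple": np.array([-1.0, 0.2]),
}


def analogy(a, b, c, vectors):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding the inputs."""
    target = vectors[a] - vectors[b] + vectors[c]

    def cosine(u, v):
        return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

    candidates = {w: v for w, v in vectors.items() if w not in (a, b, c)}
    return max(candidates, key=lambda w: cosine(target, candidates[w]))


answer = analogy("king", "man", "woman", vectors)
print(answer)
```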

Sequence Embeddings

  • Assign a vector (~50-d, initially random) to each token
    • Slide a 'window' over each sequence of tokens
    • Token vectors are nudged around to minimise surprise
    • Keep iterating until 'good enough'
  • The vector space of tokens self-organises
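
One way to make the 'nudging' concrete is skip-gram with negative sampling, the training scheme popularised by word2vec. The sketch below is a deliberately tiny NumPy version under several assumptions: the token names are hypothetical, the hyperparameters are arbitrary, negatives are drawn uniformly, and a production system would use an optimised library instead.

```python
import numpy as np


def train_embeddings(sequences, dim=8, window=2, negatives=3,
                     epochs=50, lr=0.05, seed=0):
    """Tiny skip-gram with negative sampling over arbitrary token sequences."""
    rng = np.random.default_rng(seed)
    vocab = sorted({t for seq in sequences for t in seq})
    index = {t: i for i, t in enumerate(vocab)}
    n = len(vocab)
    emb = rng.normal(scale=0.1, size=(n, dim))  # token vectors, initially random
    ctx = np.zeros((n, dim))                    # separate 'context' vectors

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    for _ in range(epochs):
        for seq in sequences:
            ids = [index[t] for t in seq]
            for i, c in enumerate(ids):
                for j in range(max(0, i - window), min(len(ids), i + window + 1)):
                    if j == i:
                        continue
                    # One observed neighbour (label 1) plus random negatives (label 0)
                    targets = [(ids[j], 1.0)]
                    targets += [(int(rng.integers(n)), 0.0) for _ in range(negatives)]
                    grad_c = np.zeros(dim)
                    for t, label in targets:
                        err = sigmoid(emb[c] @ ctx[t]) - label  # the 'surprise'
                        grad_c += err * ctx[t]
                        ctx[t] -= lr * err * emb[c]
                    emb[c] -= lr * grad_c  # nudge the token vector
    return vocab, emb


# Hypothetical event sequences, e.g. one per customer session
sessions = [
    ["view_shoes", "view_socks", "buy_shoes"],
    ["view_socks", "view_shoes", "buy_socks"],
    ["view_phone", "view_case", "buy_phone"],
]
vocab, emb = train_embeddings(sessions)
print(len(vocab), emb.shape)
```

Nothing in the loop knows that the tokens are words, which is why the same recipe carries over to any sequence data.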


Customer story

Something non-text?

Graphs are everywhere!

Graph Embedding

  • Assign a vector (~50-d, initially random) to each node
    • Generate random paths along edges
    • Run the sequence embedding on the generated paths
    • Keep iterating until 'good enough'
  • Node representations self-organise
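
The 'random paths' step might look like the sketch below. It uses uniform random walks (in the style of DeepWalk); node2vec additionally biases each step with its return and in-out parameters, which is omitted here for brevity. The graph and the function name `random_walks` are made up for the example.

```python
import random


def random_walks(adjacency, walks_per_node=2, walk_length=5, seed=0):
    """Generate uniform random walks starting from every node in the graph."""
    rng = random.Random(seed)
    walks = []
    for start in adjacency:
        for _ in range(walks_per_node):
            walk = [start]
            while len(walk) < walk_length:
                neighbours = adjacency[walk[-1]]
                if not neighbours:  # dead end: stop this walk early
                    break
                walk.append(rng.choice(neighbours))
            walks.append(walk)
    return walks


# Toy undirected graph as an adjacency list (hypothetical)
graph = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c"],
}
walks = random_walks(graph)
print(walks[0])
```

Each walk is then treated exactly like a sentence of tokens, so any sequence-embedding routine can consume the output unchanged.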

Graph Embedding


(e.g. node2vec)


How about images?

Image through CNN

Processing an Image

Embedding Images

  • Calculate the representation from a pretrained CNN ...
    • Just ignore the last layer(s)
    • Take the output, and normalise it a bit
    • ... this probably works straight away
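
For the 'normalise it a bit' step, one common choice (an assumption here, since the slides do not say which normalisation is meant) is to scale each feature vector to unit length, so that dot products become cosine similarities. The random array below is only a stand-in for activations taken from a pretrained CNN with its final layer(s) removed; the width of 2048 is arbitrary.

```python
import numpy as np


def normalise_features(features, eps=1e-12):
    """Scale each row to unit L2 norm (eps guards against all-zero rows)."""
    features = np.asarray(features, dtype=float)
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    return features / np.maximum(norms, eps)


# Stand-in for CNN outputs: 4 images, 2048 features each (made-up numbers)
rng = np.random.default_rng(0)
fake_features = rng.random((4, 2048))

embeddings = normalise_features(fake_features)
print(np.linalg.norm(embeddings, axis=1))
```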

MNIST embedding

Clustering is the same thing as neighbourhoods...


Image search example

Image search

Multiple things?

Images with tags

Embedding Two Types...

  • Add 'transformer matrices':
    • 'P' matrix projects CNN representation of image
    • 'Q' matrix projects word embedding
    • ... adjust P and Q so that matching image/word pairs land close together
  • 'Latent Space' self-organises...
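
A rough sketch of the two-projection idea follows. It makes a strong simplification that is not from the slides: Q is held fixed at random values and only P is fitted, by ordinary least squares, so that each projected image lands on its projected tag. A real system would more likely train both matrices with a ranking or contrastive loss. All data is random stand-in data and the dimensions are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
n_pairs, d_img, d_word, d_latent = 5, 16, 10, 6

# Stand-in data: row i of word_vecs is the tag belonging to image i
image_feats = rng.normal(size=(n_pairs, d_img))  # CNN representations
word_vecs = rng.normal(size=(n_pairs, d_word))   # word embeddings

# Q projects word embeddings into the latent space (fixed here for simplicity)
Q = rng.normal(size=(d_latent, d_word))
word_latent = word_vecs @ Q.T

# Fit P by least squares so that P @ image_i is close to Q @ word_i
P_T, *_ = np.linalg.lstsq(image_feats, word_latent, rcond=None)
P = P_T.T
image_latent = image_feats @ P.T


def unit(x):
    """Scale each row to unit length."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)


# Cosine similarity between every image and every tag in the shared space
similarity = unit(image_latent) @ unit(word_latent).T
best_tag = similarity.argmax(axis=1)
print(best_tag)
```

Once both modalities live in one space, tag suggestion for a new image reduces to a nearest-neighbour lookup among the projected words.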

Common Embedding Space

Common Latent Space

Both 'modalities' map into same 'space'


Carousell latent space

Title auto-suggestions

Includes 'Geometry'...

Carousell geometry

Once again, magical relationships appear!


  • Word/NLP embeddings are fundamental
  • Embeddings apply far beyond words
  • Embed All The Things!
