29 September 2019
whoami
= DONE
<MASK> is a small, typically cute, feline.
<MASK> seventy cat breeds.
<MASK> this talk to <MASK> interesting.
"Attention Is All You Need" - Vaswani et al (2017)
Let's focus on the Attention box
Sequential Attention...
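To make the mechanism concrete, here is a minimal numpy sketch of scaled dot-product attention (an illustration of the idea only, not the paper's code; the shapes and the toy check below are made up):

import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Q: [n_q, d], K: [n_k, d], V: [n_k, d_v] -> [n_q, d_v]."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                    # query-key similarity
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ V                               # weighted mix of values

# toy check: 3 queries attend over 5 key/value pairs of width 4
rng = np.random.default_rng(0)
out = scaled_dot_product_attention(rng.normal(size=(3, 4)),
                                   rng.normal(size=(5, 4)),
                                   rng.normal(size=(5, 4)))
assert out.shape == (3, 4)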
Let's focus on the Tokenization+Position box
Tokenization is also learned - an effectively infinite vocabulary packed into ~50k subword tokens
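A minimal sketch of the byte-pair-encoding idea behind such learned vocabularies (the toy corpus and merge count are made up; this follows the classic Sennrich et al BPE recipe, not any specific model's tokenizer):

import collections
import re

def learn_bpe(word_freqs, n_merges):
    """Greedily merge the most frequent adjacent symbol pair."""
    vocab = {' '.join(word): freq for word, freq in word_freqs.items()}
    merges = []
    for _ in range(n_merges):
        pairs = collections.Counter()
        for word, freq in vocab.items():
            syms = word.split()
            for pair in zip(syms, syms[1:]):
                pairs[pair] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # merge the winning pair everywhere, respecting symbol boundaries
        pat = re.compile(r'(?<!\S)' + re.escape(' '.join(best)) + r'(?!\S)')
        vocab = {pat.sub(''.join(best), w): f for w, f in vocab.items()}
    return merges

# frequent fragments like 'est</w>' quickly become single tokens
corpus = {('l', 'o', 'w', '</w>'): 5, ('l', 'o', 'w', 'e', 'r', '</w>'): 2,
          ('n', 'e', 'w', 'e', 's', 't', '</w>'): 6,
          ('w', 'i', 'd', 'e', 's', 't', '</w>'): 3}
print(learn_bpe(corpus, 6))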
Sine waves define position (think: FFT)
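The scheme from Vaswani et al gives each embedding dimension a sinusoid of a different frequency, so a position is spread across dimensions much like a signal across FFT bins. A minimal numpy sketch (the sizes here are illustrative):

import numpy as np

def positional_encoding(max_len, d_model):
    """PE[pos, 2i] = sin(pos / 10000^(2i/d_model)); PE[pos, 2i+1] = cos(same)."""
    pos = np.arange(max_len)[:, None]          # [max_len, 1]
    i = np.arange(0, d_model, 2)[None, :]      # [1, d_model/2], the "2i" values
    angles = pos / np.power(10000.0, i / d_model)
    pe = np.zeros((max_len, d_model))
    pe[:, 0::2] = np.sin(angles)               # even dims: sine
    pe[:, 1::2] = np.cos(angles)               # odd dims: cosine
    return pe

pe = positional_encoding(max_len=128, d_model=64)
assert pe.shape == (128, 64)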
"Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context" - Dai et al (2019)
"BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding" - Devlin et al (2018)
MASKing tasks are still self-supervised...
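A minimal sketch of how such self-supervised examples are manufactured from raw token ids (the 15% selection rate and 80/10/10 replacement split follow the BERT paper; the token ids and mask_id below are made up):

import numpy as np

def make_mlm_example(tokens, mask_id, vocab_size, rng, p_select=0.15):
    """Return (inputs, labels); labels are -1 except at selected positions."""
    tokens = np.asarray(tokens)
    inputs, labels = tokens.copy(), np.full_like(tokens, -1)
    for pos in range(len(tokens)):
        if rng.random() < p_select:
            labels[pos] = tokens[pos]                   # predict the original
            r = rng.random()
            if r < 0.8:
                inputs[pos] = mask_id                   # 80%: [MASK]
            elif r < 0.9:
                inputs[pos] = rng.integers(vocab_size)  # 10%: random token
            # remaining 10%: keep the token unchanged
    return inputs, labels

rng = np.random.default_rng(0)
print(make_mlm_example([12, 7, 99, 3, 41, 8], mask_id=103,
                       vocab_size=30000, rng=rng))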
Let's focus on the Top boxes
"Language Models are Unsupervised Multitask Learners" - Radford et al (2019)
Even more surprising to the researchers was the fact that
the unicorns spoke perfect English.
Now, after almost two centuries, the mystery of what sparked this odd phenomenon is finally solved.
Dr. Jorge Pérez, an evolutionary biologist from the University of La Paz, and several companions, were exploring the Andes Mountains when they ...
"XLNet: Generalized Autoregressive Pretraining for Language Understanding" - Yang et al (2018)
batch_size = 2048
hidden_size = 1024
max_seq_len = 512
n_params similar to BERT-large
import tensorflow as tf

def embedding_lookup(x, n_token, d_embed, initializer, use_tpu=True,
                     scope='embedding', reuse=None, dtype=tf.float32):
  """TPU and GPU embedding_lookup function."""
  with tf.variable_scope(scope, reuse=reuse):
    lookup_table = tf.get_variable('lookup_table', [n_token, d_embed],
                                   dtype=dtype, initializer=initializer)
    if use_tpu:
      # TPUs prefer a dense one-hot matmul over a sparse gather.
      one_hot_idx = tf.one_hot(x, n_token, dtype=dtype)
      if one_hot_idx.shape.ndims == 2:
        # unbatched case: [len, vocab]; contract over the vocab axis
        return tf.einsum('in,nd->id', one_hot_idx, lookup_table)
      else:
        # batched case: [len, batch, vocab]
        return tf.einsum('ibn,nd->ibd', one_hot_idx, lookup_table)
    else:
      # on GPU/CPU a plain gather is the efficient path
      return tf.nn.embedding_lookup(lookup_table, x)
"RoBERTa: A Robustly Optimized BERT Pretraining Approach" - Liu et al (2019)
"CTRL: A Conditional Transformer Language Model for Controllable Generation" - Keskar et al (2019)
Control codes:
; Links
; Wikipedia
; Reviews
; Title
; Questions
; Translation
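The conditioning mechanism is simple to sketch: the control code is prepended as the first token(s) of the context, so every generated token attends to it (the toy tokenizer and example text below are placeholders, not CTRL's actual tokenizer):

def build_ctrl_prompt(control_code, text, tokenize=str.split):
    """CTRL-style conditioning: the control code leads the token sequence."""
    return tokenize(control_code) + tokenize(text)

print(build_ctrl_prompt('Reviews', 'A sturdy, well-balanced chef knife.'))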
Even more surprising to the researchers was the fact that
the unicorns spoke perfect English.
...
Scientists also confirmed that the Unicorn Genome Project has already identified several genes ...
"ALBERT: A Lite BERT for Self-supervised Learning of Language Representations" - Anonymous (2019)
"BERT Rediscovers the Classical NLP Pipeline"
- Tenney et al (2019)
"Transformer to CNN: Label-scarce distillation for
efficient text classification" - Chia et al (2018)
"Parameter-Efficient Transfer Learning for NLP"
- Houlsby et al (2019)
"Scene Graph Parsing by Attention Graph"
- Andrews et al (2018)
MASK technique: "VideoBERT: A Joint Model for Video and Language Representation Learning" - Sun et al (2019)