Recent methods for learning vector space representations of
words have succeeded in capturing fine-grained semantic and syntactic
regularities through the analysis of large-scale unlabelled text. However, these representations
typically consist of dense vectors that require a great deal of
storage and make the internal structure of the vector space opaque.
A more ideal representation of a vocabulary would be both compact and readily interpretable.
With this goal in mind, this paper first shows that Lloyd's algorithm (k-means clustering) can compress the standard
dense vector representation by a factor of 10 without significant loss in performance.
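As an illustration of this kind of quantisation, the sketch below maps every scalar in the embedding matrix to one of a small set of shared centroids learned with Lloyd's algorithm; the per-value coding scheme, the 3-bit codes, and the function names are assumptions made here for clarity, not necessarily the paper's exact procedure.

```python
# Hypothetical sketch: quantise each scalar of the embedding matrix to a
# b-bit code via Lloyd's algorithm (k-means); not the paper's exact scheme.
import numpy as np
from sklearn.cluster import KMeans

def quantise_embeddings(E, bits=3, seed=0):
    """Quantise an (n_words, dim) float32 matrix to per-value codes.

    With bits=3, each 32-bit float becomes a 3-bit index into an
    8-entry codebook -- roughly the factor-of-10 budget mentioned above.
    """
    values = E.reshape(-1, 1)
    km = KMeans(n_clusters=2 ** bits, n_init=4, random_state=seed).fit(values)
    codes = km.labels_.astype(np.uint8).reshape(E.shape)
    codebook = km.cluster_centers_.ravel()
    return codes, codebook

def dequantise(codes, codebook):
    # Reconstruct an approximate embedding matrix from the stored codes.
    return codebook[codes]
```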
Then, using that compressed size as a storage budget, we describe a new GPU-friendly factorization procedure to obtain a representation
which gains interpretability as a side-effect of being sparse and non-negative in each encoding dimension.
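One plausible realisation of such a procedure, sketched below, learns a factorisation E ≈ AD with non-negative sparse codes A by mini-batch projected gradient descent, which maps naturally onto a GPU; the L1 penalty, learning rate, and dictionary size here are illustrative assumptions, not the paper's published algorithm.

```python
# Hypothetical sketch of a sparse, non-negative factorisation E ~= A @ D,
# trained with mini-batch projected gradient (GPU-friendly in spirit);
# all hyperparameters are illustrative, not taken from the paper.
import numpy as np

def sparse_nonneg_factorise(E, n_atoms=1000, l1=0.5, lr=1e-2,
                            epochs=50, batch=256, seed=0):
    rng = np.random.default_rng(seed)
    n, d = E.shape
    A = np.abs(rng.normal(0.0, 0.1, (n, n_atoms)))  # non-negative sparse codes
    D = rng.normal(0.0, 0.1, (n_atoms, d))          # dense dictionary
    for _ in range(epochs):
        for idx in np.array_split(rng.permutation(n), max(n // batch, 1)):
            R = A[idx] @ D - E[idx]                 # reconstruction residual
            gA = R @ D.T + l1                       # gradient + L1 subgradient
            gD = A[idx].T @ R / len(idx)
            A[idx] = np.maximum(A[idx] - lr * gA, 0.0)  # project onto A >= 0
            D -= lr * gD
    return A, D
```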
Word-similarity and word-analogy tests are used to demonstrate the effectiveness of the resulting compressed representations.
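For reference, a standard 3CosAdd analogy check of the kind such evaluations use (e.g. king - man + woman ≈ queen) can be run on either the original or the reconstructed vectors; the helper below and its interface are illustrative assumptions, not code from the paper.

```python
# Standard 3CosAdd word-analogy test: find the word whose vector is
# closest in cosine similarity to v(b) - v(a) + v(c).
import numpy as np

def analogy(E, vocab, a, b, c, topk=1):
    idx = {w: i for i, w in enumerate(vocab)}
    En = E / np.linalg.norm(E, axis=1, keepdims=True)  # unit-normalise rows
    q = En[idx[b]] - En[idx[a]] + En[idx[c]]
    sims = En @ (q / np.linalg.norm(q))
    for w in (a, b, c):
        sims[idx[w]] = -np.inf          # exclude the query words themselves
    return [vocab[i] for i in np.argsort(-sims)[:topk]]

# usage: analogy(E_hat, vocab, "man", "king", "woman")  # ideally ["queen"]
```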
editor="Hirose, Akira and Ozawa, Seiichi and Doya, Kenji
and Ikeda, Kazushi and Lee, Minho and Liu, Derong",
title="Compressing Word Embeddings",
bookTitle="Neural Information Processing:
23rd International Conference, ICONIP 2016, Kyoto, Japan,
October 16--21, 2016, Proceedings, Part IV",
publisher="Springer International Publishing",