Recent methods for learning vector space representations of
words have succeeded in capturing fine-grained semantic and syntactic
regularities through the analysis of large-scale unlabelled text. However, these representations
typically consist of dense vectors that require a great deal of
storage and make the internal structure of the vector space opaque.
A more ideal representation of a vocabulary would be both compact and readily interpretable.
With this goal in mind, this paper first shows that Lloyd's algorithm (k-means clustering) can compress the standard
dense vector representation by a factor of 10 without significant loss in performance.
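As an illustration of this kind of quantisation, the sketch below maps every scalar in the embedding matrix to one of a small set of shared centroids learned with Lloyd's algorithm; the per-value coding scheme, the 3-bit codes, and the function names are assumptions made here for clarity, not necessarily the paper's exact procedure.

```python
# Hypothetical sketch: quantise each scalar of the embedding matrix to a
# b-bit code via Lloyd's algorithm (k-means); not the paper's exact scheme.
import numpy as np
from sklearn.cluster import KMeans

def quantise_embeddings(E, bits=3, seed=0):
    """Quantise an (n_words, dim) float32 matrix to per-value codes.

    With bits=3, each 32-bit float becomes a 3-bit index into an
    8-entry codebook -- roughly the factor-of-10 budget mentioned above.
    """
    values = E.reshape(-1, 1)
    km = KMeans(n_clusters=2 ** bits, n_init=4, random_state=seed).fit(values)
    codes = km.labels_.astype(np.uint8).reshape(E.shape)
    codebook = km.cluster_centers_.ravel()
    return codes, codebook

def dequantise(codes, codebook):
    # Reconstruct an approximate embedding matrix from the stored codes.
    return codebook[codes]
```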
Then, using that compressed size as a storage budget, we describe a new GPU-friendly factorization procedure to obtain a representation
which gains interpretability as a side-effect of being sparse and non-negative in each encoding dimension.
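One plausible realisation of such a procedure, sketched below, learns a factorisation E ≈ AD with non-negative sparse codes A by mini-batch projected gradient descent, which maps naturally onto a GPU; the L1 penalty, learning rate, and dictionary size here are illustrative assumptions, not the paper's published algorithm.

```python
# Hypothetical sketch of a sparse, non-negative factorisation E ~= A @ D,
# trained with mini-batch projected gradient (GPU-friendly in spirit);
# all hyperparameters are illustrative, not taken from the paper.
import numpy as np

def sparse_nonneg_factorise(E, n_atoms=1000, l1=0.5, lr=1e-2,
                            epochs=50, batch=256, seed=0):
    rng = np.random.default_rng(seed)
    n, d = E.shape
    A = np.abs(rng.normal(0.0, 0.1, (n, n_atoms)))  # non-negative sparse codes
    D = rng.normal(0.0, 0.1, (n_atoms, d))          # dense dictionary
    for _ in range(epochs):
        for idx in np.array_split(rng.permutation(n), max(n // batch, 1)):
            R = A[idx] @ D - E[idx]                 # reconstruction residual
            gA = R @ D.T + l1                       # gradient + L1 subgradient
            gD = A[idx].T @ R / len(idx)
            A[idx] = np.maximum(A[idx] - lr * gA, 0.0)  # project onto A >= 0
            D -= lr * gD
    return A, D
```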
Word-similarity and word-analogy tests are used to demonstrate the effectiveness of the resulting compressed representations.
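For reference, a standard 3CosAdd analogy check of the kind such evaluations use (e.g. king - man + woman ≈ queen) can be run on either the original or the reconstructed vectors; the helper below and its interface are illustrative assumptions, not code from the paper.

```python
# Standard 3CosAdd word-analogy test: find the word whose vector is
# closest in cosine similarity to v(b) - v(a) + v(c).
import numpy as np

def analogy(E, vocab, a, b, c, topk=1):
    idx = {w: i for i, w in enumerate(vocab)}
    En = E / np.linalg.norm(E, axis=1, keepdims=True)  # unit-normalise rows
    q = En[idx[b]] - En[idx[a]] + En[idx[c]]
    sims = En @ (q / np.linalg.norm(q))
    for w in (a, b, c):
        sims[idx[w]] = -np.inf          # exclude the query words themselves
    return [vocab[i] for i in np.argsort(-sims)[:topk]]

# usage: analogy(E_hat, vocab, "man", "king", "woman")  # ideally ["queen"]
```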
editor="Hirose, Akira and Ozawa, Seiichi and Doya, Kenji
and Ikeda, Kazushi and Lee, Minho and Liu, Derong",
title="Compressing Word Embeddings",
bookTitle="Neural Information Processing:
23rd International Conference, ICONIP 2016, Kyoto, Japan,
October 16--21, 2016, Proceedings, Part IV",
publisher="Springer International Publishing",