arXiv:2010.05223 [cs.LG]

End to End Binarized Neural Networks for Text Classification

Harshil Jain, Akshat Agarwal, Kumar Shridhar, Denis Kleyko

Published 2020-10-11Version 1

Deep neural networks have demonstrated superior performance in almost every Natural Language Processing task, but their increasing complexity raises concerns. In particular, these networks demand expensive computational hardware, and the training budget is a concern for many. Even for a trained network, the inference phase can be too demanding for resource-constrained devices, thus limiting its applicability; the state-of-the-art transformer models are a vivid example. Simplifying the computations performed by a network is one way of relaxing these complexity requirements. In this paper, we propose an end-to-end binarized neural network architecture for the intent classification task. To fully exploit the potential of end-to-end binarization, both the input representations (vector embeddings of token statistics) and the classifier are binarized. We demonstrate the efficiency of such an architecture on the intent classification of short texts over three datasets and on text classification with a larger dataset. The proposed architecture achieves results comparable to the state of the art on standard intent classification datasets while using ~20-40% less memory and training time. Furthermore, the individual components of the architecture, such as binarized vector embeddings of documents or binarized classifiers, can be used separately within architectures that are not necessarily fully binary.
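To make the idea concrete, here is a minimal sketch of the two binarized components the abstract mentions: a thresholding step that turns a real-valued embedding of token statistics into a binary vector, and a linear classifier whose weights are restricted to {-1, +1}. The binarization scheme, the toy embedding, and the random weights are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def binarize(x):
    # Map real values to {0, 1} by thresholding at zero -- one common
    # binarization scheme; the paper's exact scheme may differ.
    return (x > 0).astype(np.int8)

# Hypothetical real-valued embedding of token statistics for a short text.
embedding = np.array([0.7, -1.2, 0.0, 2.3, -0.4])
binary_embedding = binarize(embedding)  # -> [1, 0, 0, 1, 0]

# A binarized linear classifier: weights restricted to {-1, +1}.
rng = np.random.default_rng(0)
num_classes, dim = 3, 5
weights = np.sign(rng.standard_normal((num_classes, dim))).astype(np.int8)

# Inference reduces to integer additions/subtractions -- no floating point.
scores = weights @ binary_embedding
predicted_class = int(np.argmax(scores))
```

Because both operands are low-precision integers, the matrix-vector product can be implemented with cheap bit operations and accumulations, which is the source of the memory and compute savings the paper targets.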

Comments: 14 pages. Accepted at the SustaiNLP Workshop on Simple and Efficient Natural Language Processing at EMNLP 2020
Categories: cs.LG, cs.CL
Related articles:
arXiv:1911.11756 [cs.LG] (Published 2019-11-26)
Semi-Supervised Learning for Text Classification by Layer Partitioning
arXiv:2107.10314 [cs.LG] (Published 2021-07-21)
Small-text: Active Learning for Text Classification in Python
arXiv:2211.00369 [cs.LG] (Published 2022-11-01)
Anytime Generation of Counterfactual Explanations for Text Classification