arXiv Analytics

arXiv:1410.8251 [cs.LG]

Notes on Noise Contrastive Estimation and Negative Sampling

Chris Dyer

Published 2014-10-30 (Version 1)

Estimating the parameters of probabilistic models of language such as maxent models and probabilistic neural models is computationally difficult, since it involves evaluating partition functions by summing over an entire vocabulary, which may contain millions of word types. Two closely related strategies---noise contrastive estimation (Mnih and Teh, 2012; Mnih and Kavukcuoglu, 2013; Vaswani et al., 2013) and negative sampling (Mikolov et al., 2012; Goldberg and Levy, 2014)---have emerged as popular solutions to this computational problem, but some confusion remains as to which is more appropriate and when. This document explicates their relationships to each other and to other estimation techniques. The analysis shows that, although they are superficially similar, NCE is a general parameter estimation technique that is asymptotically unbiased, while negative sampling is best understood as a family of binary classification models that are useful for learning word representations but not as a general-purpose estimator.
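The distinction the abstract draws can be made concrete with the two binary-classification probabilities the objectives optimize. The following is a minimal sketch (not code from the paper): it assumes `score` is the model's unnormalized log-score log u(w, c), `noise_logprob` is log q(w) under the noise distribution, and `k` is the number of noise samples per true example. NCE keeps the k·q(w) correction term, so the learned scores approximate normalized log-probabilities; negative sampling drops it, which is why it works for representation learning but not as a general-purpose density estimator.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def nce_true_prob(score, noise_logprob, k):
    """NCE posterior that (w, c) came from the data rather than the noise:
    P(D=1 | w, c) = u(w,c) / (u(w,c) + k*q(w))
                  = sigmoid(score - log(k*q(w)))  when score = log u(w,c).
    """
    return sigmoid(score - (math.log(k) + noise_logprob))

def neg_sampling_true_prob(score):
    """Negative sampling simply uses P(D=1 | w, c) = sigmoid(score):
    the k*q(w) term is dropped, so the scores are no longer tied to a
    properly normalized model of the data distribution.
    """
    return sigmoid(score)
```

The two coincide only when k·q(w) = 1 for every word, e.g. a uniform noise distribution over exactly k word types; in that special case negative sampling is NCE, which matches the paper's observation that the methods are superficially similar yet estimate different things in general.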

Related articles:
arXiv:1910.02760 [cs.LG] (Published 2019-10-07)
Negative Sampling in Variational Autoencoders
arXiv:2303.17475 [cs.LG] (Published 2023-03-30)
Efficient distributed representations beyond negative sampling
arXiv:2206.11549 [cs.LG] (Published 2022-06-23)
Rethinking Collaborative Metric Learning: Toward an Efficient Alternative without Negative Sampling