arXiv:1701.02960 [cs.LG]

Slow mixing for Latent Dirichlet allocation

Johan Jonasson

Published 2017-01-11, Version 1

Markov chain Monte Carlo (MCMC) algorithms are ubiquitous in probability theory in general and in machine learning in particular. A Markov chain is devised so that its stationary distribution is some probability distribution of interest; one then samples from that distribution by running the chain for a "long time", until it appears to have reached stationarity, and collecting the resulting state. However, these chains are often very complex, and there are no theoretical guarantees that stationarity is actually reached. In this paper we study the Gibbs sampler for the posterior distribution of a very simple case of Latent Dirichlet Allocation, an attractive Bayesian unsupervised learning model for text generation and text classification. It turns out that in some situations the mixing time of the Gibbs sampler is exponential in the length of the documents, so that it is practically impossible to sample properly from the posterior when documents are sufficiently long.
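
To make the object of study concrete, the following is a minimal sketch of the collapsed Gibbs sampler commonly used for LDA posterior inference: each token's topic assignment is resampled from its full conditional given all other assignments. It illustrates the kind of chain whose mixing time is analysed in the paper; the function name, hyperparameter values, and toy corpus below are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def lda_gibbs(docs, V, K, alpha=0.1, beta=0.1, n_iters=200, seed=0):
    """Collapsed Gibbs sampler for LDA (illustrative sketch).

    docs: list of documents, each a list of word ids in [0, V).
    Returns the final topic assignments and the count matrices.
    """
    rng = np.random.default_rng(seed)
    D = len(docs)
    n_dk = np.zeros((D, K))          # document-topic counts
    n_kw = np.zeros((K, V))          # topic-word counts
    n_k = np.zeros(K)                # total tokens per topic
    z = []                           # topic assignment for every token

    # Random initialisation of topic assignments.
    for d, doc in enumerate(docs):
        z_d = rng.integers(0, K, size=len(doc))
        z.append(z_d)
        for w, k in zip(doc, z_d):
            n_dk[d, k] += 1
            n_kw[k, w] += 1
            n_k[k] += 1

    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = z[d][i]
                # Remove the current token from the counts.
                n_dk[d, k] -= 1; n_kw[k, w] -= 1; n_k[k] -= 1
                # Full conditional: p(z_i = k | rest) is proportional to
                # (n_dk + alpha) * (n_kw + beta) / (n_k + V*beta).
                p = (n_dk[d] + alpha) * (n_kw[:, w] + beta) / (n_k + V * beta)
                k = rng.choice(K, p=p / p.sum())
                # Add the token back under its newly sampled topic.
                z[d][i] = k
                n_dk[d, k] += 1; n_kw[k, w] += 1; n_k[k] += 1

    return z, n_dk, n_kw

if __name__ == "__main__":
    # Toy usage: two short documents over a vocabulary of 4 words, 2 topics.
    docs = [[0, 0, 1, 1, 0], [2, 3, 3, 2, 2]]
    z, n_dk, n_kw = lda_gibbs(docs, V=4, K=2)
    print(n_dk)
```

The paper's slow-mixing result says that for such single-site samplers there are regimes in which the number of sweeps needed to approach the posterior grows exponentially in the document length, regardless of how the counts are initialised.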

Related articles:
arXiv:2111.01480 [cs.LG] (Published 2021-11-02, updated 2022-08-25)
A derivation of variational message passing (VMP) for latent Dirichlet allocation (LDA)
arXiv:1205.1053 [cs.LG] (Published 2012-05-04)
Variable Selection for Latent Dirichlet Allocation
arXiv:2405.20542 [cs.LG] (Published 2024-05-30)
On the Connection Between Non-negative Matrix Factorization and Latent Dirichlet Allocation