arXiv:1501.03326 [stat.ML]
Unbiased Bayes for Big Data: Paths of Partial Posteriors
Heiko Strathmann, Dino Sejdinovic, Mark Girolami
Published 2015-01-14, Version 1
Bayesian inference proceeds based on expectations of certain functions with respect to the posterior. Markov Chain Monte Carlo is a fundamental tool to compute these expectations. However, its feasibility is being challenged in the era of so-called Big Data, as all data must be processed in every iteration. Realising that such simulation is an unnecessarily hard problem if the goal is estimation, we construct a computationally scalable methodology that allows unbiased estimation of the required expectations, without explicit simulation from the full posterior. The average computational complexity of our scheme is sub-linear in the size of the dataset and its variance is straightforward to control, leading to algorithms that are provably unbiased and naturally arrive at a desired error tolerance. We demonstrate the utility and generality of the methodology on a range of common statistical models applied to large-scale benchmark datasets.
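The abstract does not spell out how the unbiased estimator is built. As a rough illustration only, the sketch below shows one known way to get an unbiased estimate of a full-data quantity while usually touching only part of the data: randomized truncation of a telescoping sum over estimates computed on growing data subsets. The conjugate Gaussian toy model, the doubling subset sizes, the truncation distribution, and all function names are assumptions made for this example, not details taken from the abstract.

```python
import numpy as np

def partial_posterior_mean(x, n, prior_var=10.0, noise_var=1.0):
    """Posterior mean of a Gaussian mean (prior N(0, prior_var)) using only x[:n]."""
    precision = 1.0 / prior_var + n / noise_var
    return (x[:n].sum() / noise_var) / precision

def truncation_distribution(levels, decay=0.5):
    """Hypothetical truncation law: P(T = t) proportional to decay**t on 0..levels-1."""
    probs = decay ** np.arange(levels)
    probs = probs / probs.sum()
    tail = probs[::-1].cumsum()[::-1]  # tail[t] = P(T >= t), with tail[0] == 1
    return probs, tail

def truncated_estimate(x, sizes, tail, T):
    """Telescoping sum up to level T, each increment reweighted by 1 / P(T >= t)."""
    estimate, previous = 0.0, 0.0
    for t in range(T + 1):
        current = partial_posterior_mean(x, sizes[t])
        estimate += (current - previous) / tail[t]
        previous = current
    return estimate

def debiased_estimate(x, sizes, rng):
    """Draw a random truncation level and return one unbiased estimate."""
    probs, tail = truncation_distribution(len(sizes))
    T = rng.choice(len(sizes), p=probs)
    return truncated_estimate(x, sizes, tail, T)

# Toy usage: nested subsets of doubling size, ending at the full dataset.
rng = np.random.default_rng(0)
x = rng.normal(1.5, 1.0, size=4096)
sizes = [64, 128, 256, 512, 1024, 2048, 4096]
replicates = [debiased_estimate(x, sizes, rng) for _ in range(2000)]
average = float(np.mean(replicates))  # fluctuates around the full-data posterior mean
```

Because every level has positive probability of being reached, the reweighted increments average out to the final term of the sequence, here the full-data posterior mean, even though most draws stop at a small subset. Averaging more independent replicates reduces the variance; the specific decay rate and subset schedule used here are arbitrary choices for the illustration.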