arXiv:1511.01437 [math.PR]
The sample size required in importance sampling
Sourav Chatterjee, Persi Diaconis
Published 2015-11-04, Version 1
The goal of importance sampling is to estimate the expected value of a given function with respect to a probability measure $\nu$ using a random sample of size $n$ drawn from a different probability measure $\mu$. If the two measures $\mu$ and $\nu$ are nearly singular with respect to each other, which is often the case in practice, the sample size required for accurate estimation is large. In this article it is shown that in a fairly general setting, a sample of size approximately $\exp(D(\nu||\mu))$ is necessary and sufficient for accurate estimation by importance sampling, where $D(\nu||\mu)$ is the Kullback--Leibler divergence of $\mu$ from $\nu$. In particular, the required sample size exhibits a kind of cut-off in the logarithmic scale. The theory is applied to obtain a fairly general formula for the sample size required in importance sampling for exponential families (Gibbs measures). We also show that the standard variance-based diagnostic for convergence of importance sampling is fundamentally problematic. An alternative diagnostic that provably works in certain situations is suggested.
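
As a rough illustration of the quantities in the abstract, the sketch below works through a toy case that is an assumption of this example, not taken from the paper: $\mu = N(0,1)$ as the sampling measure and $\nu = N(\theta,1)$ as the target. In that case the density ratio is $\rho(x) = d\nu/d\mu = \exp(\theta x - \theta^2/2)$ and $D(\nu||\mu) = \theta^2/2$. The function names (`kl_gaussian_shift`, `rough_sample_size`, `importance_estimate`) are hypothetical, and $\exp(D(\nu||\mu))$ should be read only as the approximate, logarithmic-scale threshold the abstract describes, not as an exact sample-size rule.

```python
import math
import random


def kl_gaussian_shift(theta):
    """D(nu || mu) for the assumed toy pair nu = N(theta, 1), mu = N(0, 1)."""
    return 0.5 * theta ** 2


def rough_sample_size(theta):
    """exp(D(nu || mu)): the approximate order of magnitude named in the abstract."""
    return math.exp(kl_gaussian_shift(theta))


def importance_estimate(f, theta, n, rng):
    """Estimate E_nu[f] from n draws of mu = N(0, 1).

    Each draw x is weighted by rho(x) = dnu/dmu = exp(theta * x - theta^2 / 2).
    """
    total = 0.0
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)                        # sample from mu
        rho = math.exp(theta * x - 0.5 * theta ** 2)   # importance weight
        total += f(x) * rho
    return total / n


if __name__ == "__main__":
    rng = random.Random(0)
    theta = 1.0
    print("D(nu||mu) =", kl_gaussian_shift(theta))
    print("exp(D)    =", rough_sample_size(theta))
    # E_nu[x] equals theta for this toy target.
    print("estimate of E_nu[x]:", importance_estimate(lambda x: x, theta, 100_000, rng))
```

Taking `f` to be the constant function 1 gives a simple sanity check: the average weight should be close to 1 when `n` is comfortably larger than `rough_sample_size(theta)`. Only the standard library is used so that the sketch runs without extra dependencies.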