arXiv Analytics

arXiv:2106.07263 [stat.ML]

Machine Learning for Variance Reduction in Online Experiments

Yongyi Guo, Dominic Coey, Mikael Konutgan, Wenting Li, Chris Schoener, Matt Goldman

Published 2021-06-14, Version 1

We consider the problem of variance reduction in randomized controlled trials, through the use of covariates correlated with the outcome but independent of the treatment. We propose a machine learning regression-adjusted treatment effect estimator, which we call MLRATE. MLRATE uses machine learning predictors of the outcome to reduce estimator variance. It employs cross-fitting to avoid overfitting biases, and we prove consistency and asymptotic normality under general conditions. MLRATE is robust to poor predictions from the machine learning step: if the predictions are uncorrelated with the outcomes, the estimator performs asymptotically no worse than the standard difference-in-means estimator, while if predictions are highly correlated with outcomes, the efficiency gains are large. In A/A tests, for a set of 48 outcome metrics commonly monitored in Facebook experiments, the estimator has over 70% lower variance than the simple difference-in-means estimator, and about 19% lower variance than the common univariate procedure which adjusts only for pre-experiment values of the outcome.
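
The idea in the abstract (cross-fitted predictions of the outcome, followed by a regression adjustment) can be illustrated with a minimal sketch. This is not the authors' implementation, and several details are assumptions the abstract does not state: the exact form of the adjustment regression (here, the outcome regressed on the treatment indicator, the prediction, and their centered interaction), ordinary least squares standing in for the machine learning predictor, and all function names and simulation parameters, which are hypothetical.

```python
import numpy as np


def cross_fit_predictions(X, y, n_folds=2, seed=0):
    """Out-of-fold predictions: each row is predicted by a model fit on the other folds."""
    rng = np.random.default_rng(seed)
    n = len(y)
    folds = rng.permutation(n) % n_folds  # random, near-equal fold assignment
    Xb = np.column_stack([np.ones(n), X])
    preds = np.empty(n)
    for k in range(n_folds):
        train, test = folds != k, folds == k
        # Assumption: least squares is a stand-in for any ML predictor of the outcome.
        coef, *_ = np.linalg.lstsq(Xb[train], y[train], rcond=None)
        preds[test] = Xb[test] @ coef
    return preds


def regression_adjusted_effect(y, t, g):
    """Coefficient on t from OLS of y on [1, t, g, t * (g - mean(g))].

    Assumption: this interacted specification is one common form of regression
    adjustment; the abstract does not spell out the regression used.
    """
    g_centered = g - g.mean()
    design = np.column_stack([np.ones_like(y), t, g, t * g_centered])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta[1]


def difference_in_means(y, t):
    """Baseline estimator: mean outcome of treated minus mean outcome of control."""
    return y[t == 1].mean() - y[t == 0].mean()


def simulate_aa_variances(n_reps=200, n=2000, seed=1):
    """Simulated A/A tests (true effect is zero); returns both estimators' variances."""
    rng = np.random.default_rng(seed)
    dim_estimates, adjusted_estimates = [], []
    for _ in range(n_reps):
        X = rng.normal(size=(n, 3))                    # covariates, independent of treatment
        t = rng.integers(0, 2, size=n).astype(float)   # randomized assignment
        y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(size=n)
        g = cross_fit_predictions(X, y, seed=int(rng.integers(1 << 31)))
        dim_estimates.append(difference_in_means(y, t))
        adjusted_estimates.append(regression_adjusted_effect(y, t, g))
    return np.var(dim_estimates), np.var(adjusted_estimates)


if __name__ == "__main__":
    var_dim, var_adjusted = simulate_aa_variances()
    print("difference-in-means variance:", var_dim)
    print("regression-adjusted variance:", var_adjusted)
```

In this synthetic setting the covariates explain most of the outcome variance, so the adjusted estimator should show a visibly smaller variance across repetitions than the difference in means. The size of the gap depends entirely on the simulated data and says nothing about the figures reported for the Facebook metrics.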

Related articles:
arXiv:1603.03788 [stat.ML] (Published 2016-03-11)
A Primer on the Signature Method in Machine Learning
arXiv:1711.10781 [stat.ML] (Published 2017-11-29)
Introduction to Tensor Decompositions and their Applications in Machine Learning
arXiv:1312.2171 [stat.ML] (Published 2013-12-08, updated 2014-11-24)
bartMachine: Machine Learning with Bayesian Additive Regression Trees