arXiv:2106.07263 [stat.ML]

Machine Learning for Variance Reduction in Online Experiments

Yongyi Guo, Dominic Coey, Mikael Konutgan, Wenting Li, Chris Schoener, Matt Goldman

Published 2021-06-14, Version 1

We consider the problem of variance reduction in randomized controlled trials, through the use of covariates correlated with the outcome but independent of the treatment. We propose a machine learning regression-adjusted treatment effect estimator, which we call MLRATE. MLRATE uses machine learning predictors of the outcome to reduce estimator variance. It employs cross-fitting to avoid overfitting biases, and we prove consistency and asymptotic normality under general conditions. MLRATE is robust to poor predictions from the machine learning step: if the predictions are uncorrelated with the outcomes, the estimator performs asymptotically no worse than the standard difference-in-means estimator, while if predictions are highly correlated with outcomes, the efficiency gains are large. In A/A tests, for a set of 48 outcome metrics commonly monitored in Facebook experiments, the estimator has over 70% lower variance than the simple difference-in-means estimator, and about 19% lower variance than the common univariate procedure which adjusts only for pre-experiment values of the outcome.
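The two ingredients named in the abstract, cross-fitted outcome predictions and a regression adjustment, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the abstract: ordinary least squares stands in for the machine learning predictor, the adjustment regresses the outcome on the treatment indicator, the centered prediction, and their interaction, and the function names (`cross_fit_predictions`, `adjusted_effect`, `simulate_aa`) and the simulated data are hypothetical.

```python
import numpy as np


def cross_fit_predictions(X, y, n_folds=2, seed=0):
    """Out-of-fold predictions of y from X.

    Each unit is predicted by a model trained only on the other folds.
    OLS is a stand-in here; any ML regressor could be used instead.
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    folds = rng.permutation(n) % n_folds          # random fold labels
    Xb = np.column_stack([np.ones(n), X])         # add intercept column
    g = np.empty(n)
    for k in range(n_folds):
        train, test = folds != k, folds == k
        coef, *_ = np.linalg.lstsq(Xb[train], y[train], rcond=None)
        g[test] = Xb[test] @ coef
    return g


def adjusted_effect(y, t, g):
    """Coefficient on t from OLS of y on [1, t, g - mean(g), t * (g - mean(g))].

    Assumed form of the regression adjustment, for illustration only.
    """
    gc = g - g.mean()
    design = np.column_stack([np.ones(len(y)), t, gc, t * gc])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta[1]


def difference_in_means(y, t):
    """Unadjusted estimator: mean outcome of treated minus mean of control."""
    return y[t == 1].mean() - y[t == 0].mean()


def simulate_aa(n_reps=200, n=2000, seed=0):
    """Simulated A/A tests (true effect is zero) on synthetic data.

    Returns the sampling variance of the adjusted estimator and of the
    difference-in-means estimator across replications.
    """
    rng = np.random.default_rng(seed)
    adjusted, unadjusted = [], []
    for r in range(n_reps):
        X = rng.normal(size=(n, 3))               # pre-experiment covariates
        y = X @ np.array([1.0, -0.5, 0.25]) + rng.normal(scale=0.5, size=n)
        t = rng.integers(0, 2, size=n)            # random assignment, independent of X
        g = cross_fit_predictions(X, y, seed=r)
        adjusted.append(adjusted_effect(y, t, g))
        unadjusted.append(difference_in_means(y, t))
    return np.var(adjusted), np.var(unadjusted)


if __name__ == "__main__":
    var_adj, var_dim = simulate_aa()
    print(f"variance, adjusted estimator:  {var_adj:.6f}")
    print(f"variance, difference-in-means: {var_dim:.6f}")
```

In this synthetic setup the covariates explain most of the outcome variance, so the adjusted estimator should vary much less across replications than the difference in means. The size of that gap depends entirely on the simulated data and says nothing about the 70% and 19% figures reported for the Facebook metrics.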

Categories: stat.ML, cs.LG

Keywords: machine learning, variance reduction, online experiments, regression-adjusted treatment effect estimator

Related articles:

arXiv:1603.03788 [stat.ML] (Published 2016-03-11)

A Primer on the Signature Method in Machine Learning

Ilya Chevyrev, Andrey Kormilitzin

arXiv:1711.10781 [stat.ML] (Published 2017-11-29)

Introduction to Tensor Decompositions and their Applications in Machine Learning

Stephan Rabanser, Oleksandr Shchur, Stephan Günnemann

arXiv:1312.2171 [stat.ML] (Published 2013-12-08, updated 2014-11-24)

bartMachine: Machine Learning with Bayesian Additive Regression Trees

Adam Kapelner, Justin Bleich