arXiv:1612.02707 Abstract | arXiv Analytics

arXiv:1612.02707 [cs.LG]

Human powered multiple imputation

Published 2016-12-08, Version 1

Missing data is universal, and methods for dealing with it range widely, from simply ignoring it to complex modelling strategies such as multiple imputation and maximum likelihood estimation. To date, missing data has been effectively imputed only by machines, via statistical or machine learning models. In this paper we set out to answer an important question: "Can humans perform reasonably well to fill in missing data, given information about the dataset?" We do so in a crowdsourcing framework, where we first translate our missing data problem into a survey question that crowdworkers can easily complete. We address challenges that are inherent to crowdsourcing in our context and present an evaluation on a real dataset. We compare human powered multiple imputation outcomes with state-of-the-art model-based imputation.

Categories: cs.LG, cs.HC, stat.ML

Keywords: missing data, human powered multiple imputation outcomes, maximum likelihood estimation, complex modelling strategies, important question

Related articles:

arXiv:2405.13977 [cs.LG] (Published 2024-05-22)

Removing Bias from Maximum Likelihood Estimation with Model Autophagy

Paul Mayer, Lorenzo Luzi, Ali Siahkoohi, Don H. Johnson, Richard G. Baraniuk

arXiv:1706.01120 [cs.LG] (Published 2017-06-04)

Evolving imputation strategies for missing data in classification problems with TPOT

Unai Garciarena, Roberto Santana, Alexander Mendiburu

arXiv:1506.02348 [cs.LG] (Published 2015-06-08)

Convergence Rates of Active Learning for Maximum Likelihood Estimation

Kamalika Chaudhuri, Sham Kakade, Praneeth Netrapalli, Sujay Sanghavi