arXiv Analytics

arXiv:2007.02837 [stat.ML]

Does imputation matter? Benchmark for predictive models

Katarzyna Woźnica, Przemysław Biecek

Published 2020-07-06 (Version 1)

Incomplete data are common in practical applications. Most predictive machine learning models do not handle missing values, so they require some preprocessing. Although many algorithms are used for data imputation, the impact of the different methods on the performance of predictive models is not well understood. This paper is the first to systematically evaluate the empirical effectiveness of data imputation algorithms for predictive models. The main contributions are (1) the recommendation of a general method for empirical benchmarking based on real-life classification tasks and (2) a comparative analysis of different imputation methods across a collection of data sets and a collection of ML algorithms.
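The benchmark described in the abstract pairs each imputation method with each predictive model and compares downstream performance. As an illustration only, the sketch below (not the authors' pipeline; it assumes scikit-learn, missing values injected completely at random, and ROC AUC as the metric) shows how such a comparison could be set up.

```python
# Hypothetical sketch of an imputation-vs-model comparison; not the paper's benchmark code.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (enables IterativeImputer)
from sklearn.impute import SimpleImputer, KNNImputer, IterativeImputer
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X, y = load_breast_cancer(return_X_y=True)

# Inject missing values completely at random so every imputer has work to do.
mask = rng.random(X.shape) < 0.2
X_missing = X.copy()
X_missing[mask] = np.nan

imputers = {
    "mean": SimpleImputer(strategy="mean"),
    "knn": KNNImputer(n_neighbors=5),
    "iterative": IterativeImputer(max_iter=10, random_state=0),
}

# Same classifier for every imputer, so differences in AUC reflect the imputation step.
for name, imputer in imputers.items():
    pipe = make_pipeline(imputer, RandomForestClassifier(random_state=0))
    scores = cross_val_score(pipe, X_missing, y, cv=5, scoring="roc_auc")
    print(f"{name:>9}: AUC = {scores.mean():.3f} +/- {scores.std():.3f}")
```

Placing the imputer inside the cross-validation pipeline ensures it is fit only on training folds, which is the usual precaution when comparing preprocessing methods by downstream model performance.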

Related articles:
arXiv:2106.08105 [stat.ML] (Published 2021-06-15)
Employing an Adjusted Stability Measure for Multi-Criteria Model Fitting on Data Sets with Similar Features
arXiv:2412.11308 [stat.ML] (Published 2024-12-15)
datadriftR: An R Package for Concept Drift Detection in Predictive Models
arXiv:1905.02515 [stat.ML] (Published 2019-05-07)
Guided Visual Exploration of Relations in Data Sets