arXiv:1403.3342 [stat.ML]

The Potential Benefits of Filtering Versus Hyper-Parameter Optimization

Michael R. Smith, Tony Martinez, Christophe Giraud-Carrier

Published 2014-03-13 (Version 1)

The quality of a model induced by a learning algorithm depends on both the quality of the training data and the hyper-parameters supplied to the algorithm. Prior work has shown that improving the training data (e.g., by removing low-quality instances) or tuning the hyper-parameters can each significantly improve the induced model, but a direct comparison of the two approaches has been lacking. In this paper, we estimate and compare the potential benefits of filtering and hyper-parameter optimization. Although estimating the potential benefit yields an overly optimistic figure, it also empirically approximates the maximum benefit each method could provide. We find that, while both significantly improve the induced model, improving the quality of the training set has a greater potential effect than hyper-parameter optimization.
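The two approaches compared in the abstract can be sketched side by side. The snippet below is a minimal illustration, not the paper's experimental setup: the synthetic dataset, the decision-tree learner, the cross-validation-based misclassification filter, and the hyper-parameter grid are all assumptions chosen for brevity.

```python
# Illustrative sketch (assumed setup, not the paper's experiments):
# compare (i) filtering noisy training instances against
# (ii) hyper-parameter optimization, each relative to a default baseline.
from sklearn.datasets import make_classification
from sklearn.model_selection import (GridSearchCV, cross_val_predict,
                                     train_test_split)
from sklearn.tree import DecisionTreeClassifier

# Synthetic data with 15% label noise injected via flip_y.
X, y = make_classification(n_samples=600, n_features=10,
                           flip_y=0.15, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.33,
                                          random_state=0)

# Baseline: default hyper-parameters, unfiltered training data.
base = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr).score(X_te, y_te)

# (i) Filtering: drop training instances that are misclassified by a
# cross-validated learner (a simple stand-in for noise filters).
keep = cross_val_predict(DecisionTreeClassifier(random_state=0),
                         X_tr, y_tr, cv=5) == y_tr
filtered = (DecisionTreeClassifier(random_state=0)
            .fit(X_tr[keep], y_tr[keep]).score(X_te, y_te))

# (ii) Hyper-parameter optimization on the unfiltered training data.
grid = GridSearchCV(DecisionTreeClassifier(random_state=0),
                    {"max_depth": [3, 5, 10, None],
                     "min_samples_leaf": [1, 5, 20]},
                    cv=5).fit(X_tr, y_tr)
tuned = grid.score(X_te, y_te)

print(f"baseline={base:.3f} filtered={filtered:.3f} tuned={tuned:.3f}")
```

On noisy data like this, both variants typically beat the baseline; which one wins depends on the dataset, the learner, and the grid, which is exactly the comparison the paper quantifies.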

Related articles:
arXiv:2005.07939 [stat.ML] (Published 2020-05-16)
Predicting into unknown space? Estimating the area of applicability of spatial prediction models
arXiv:2202.00622 [stat.ML] (Published 2022-02-01)
Datamodels: Predicting Predictions from Training Data
arXiv:1504.03415 [stat.ML] (Published 2015-04-14)
HHCART: An Oblique Decision Tree