arXiv:1808.03591 [cs.LG]

How Complex is your classification problem? A survey on measuring classification complexity

Ana C. Lorena, Luís P. F. Garcia, Jens Lehmann, Marcilio C. P. Souto, Tin K. Ho

Published 2018-08-10, Version 1

Extracting characteristics from the training datasets of classification problems has proven effective in a number of meta-analyses. Among these characteristics, measures of classification complexity estimate how difficult it is to separate the data points into their expected classes. The existing measures include descriptors of the spatial distribution of the data and estimates of the shape and size of the decision boundary. This information can support the formulation of new data-driven pre-processing and pattern recognition techniques, which can in turn focus on the challenging characteristics of a problem. This paper surveys and analyzes measures that can be extracted from training datasets to characterize the complexity of the corresponding classification problems. Their use in recent literature is also reviewed and discussed, which helps identify opportunities for future work in the area. Finally, the paper describes an R package named Extended Complexity Library (ECoL), which implements a set of complexity measures and is publicly available.
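
To make the idea of a complexity measure concrete, here is a minimal Python sketch of one classic feature-based descriptor, the per-feature Fisher discriminant ratio for a two-class dataset. The abstract does not name specific measures, so the choice of this one is an assumption made for illustration. The function name `max_fisher_ratio` is hypothetical, and the code is not the ECoL implementation, which is written in R and may normalize the value differently.

```python
import numpy as np


def max_fisher_ratio(X, y):
    """Largest per-feature Fisher discriminant ratio for two classes.

    Hypothetical helper for illustration only; not taken from ECoL.
    For each feature: (mean_a - mean_b)^2 / (var_a + var_b).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)
    if classes.size != 2:
        raise ValueError("this sketch handles exactly two classes")

    a = X[y == classes[0]]
    b = X[y == classes[1]]

    num = (a.mean(axis=0) - b.mean(axis=0)) ** 2
    den = a.var(axis=0) + b.var(axis=0)  # population variances (ddof=0)

    # Zero total variance: the ratio is 0 if the means coincide,
    # otherwise the feature separates the classes perfectly (infinity).
    fallback = np.where(num > 0, np.inf, 0.0)
    ratios = np.divide(num, den, out=fallback, where=den > 0)
    return float(ratios.max())


if __name__ == "__main__":
    # Two toy datasets: classes far apart vs. heavily overlapping.
    easy = max_fisher_ratio([[0, 0], [1, 0], [10, 0], [11, 0]], [0, 0, 1, 1])
    hard = max_fisher_ratio([[0], [10], [1], [11]], [0, 0, 1, 1])
    print(easy, hard)
```

A larger value suggests that at least one feature separates the two classes well on its own, while a value near zero suggests heavy overlap along every individual feature. It says nothing about classes that are separable only through combinations of features.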
