arXiv Analytics

arXiv:2006.14079 [cs.LG]

Ensuring Learning Guarantees on Concept Drift Detection with Statistical Learning Theory

Lucas Pagliosa, Rodrigo Mello

Published 2020-06-24, Version 1

Concept Drift (CD) detection aims to continuously identify changes in data stream behavior, supporting researchers in the study and modeling of real-world phenomena. Motivated by the lack of learning guarantees in current CD algorithms, we use Statistical Learning Theory (SLT) to formalize the requirements needed to ensure probabilistic learning bounds, so that detected drifts reflect actual changes in the data rather than chance. As discussed throughout this paper, relying on SLT bounds requires a set of mathematical assumptions to hold, and these assumptions are especially controversial in CD scenarios. To address this issue, we propose a methodology for satisfying those assumptions in CD scenarios and thereby ensuring learning guarantees. In addition, we assess a set of relevant, well-known CD algorithms from the literature in light of our methodology. As our main contribution, we expect this work to support researchers in designing and evaluating CD algorithms across different domains.

Comments: 10 pages (12 including references), 2 figures
Categories: cs.LG, stat.ML
Subjects: 03C98, I.2.1
Related articles:
arXiv:1910.01064 [cs.LG] (Published 2019-10-02)
Concept Drift Detection and Adaptation with Weak Supervision on Streaming Unlabeled Data
arXiv:2410.13714 [cs.LG] (Published 2024-10-17)
Generation through the lens of learning theory
arXiv:2011.03729 [cs.LG] (Published 2020-11-07)
Enhash: A Fast Streaming Algorithm For Concept Drift Detection