arXiv:2009.12966 [cs.LG]

Analysis of label noise in graph-based semi-supervised learning

Bruno Klaus de Aquino Afonso, Lilian Berton

Published 2020-09-27, Version 1

In machine learning, one must acquire labels to help supervise a model that will be able to generalize to unseen data. However, the labeling process can be tedious, long, costly, and error-prone. It is often the case that most of our data is unlabeled. Semi-supervised learning (SSL) alleviates this problem by making strong assumptions about the relation between the labels and the input data distribution. This paradigm has been successful in practice, but most SSL algorithms end up fully trusting the few available labels. In real life, both humans and automated systems are prone to mistakes; it is essential that our algorithms are able to work with labels that are both few and unreliable. Our work aims to perform an extensive empirical evaluation of existing graph-based semi-supervised algorithms, such as Gaussian Fields and Harmonic Functions, Local and Global Consistency, Laplacian Eigenmaps, and Graph Transduction Through Alternating Minimization. To do so, we compare the accuracy of classifiers while varying the amount of labeled data and label noise for many different samples. Our results show that, if the dataset is consistent with SSL assumptions, we are able to detect the noisiest instances, although this gets harder when the number of available labels decreases. Also, the Laplacian Eigenmaps algorithm performed better than label propagation when the data came from high-dimensional clusters.
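The kind of experiment the abstract describes (vary the number of labels and the label noise, then measure accuracy) can be sketched roughly as follows. This is a minimal illustration and not the authors' code: it uses the standard closed form of Local and Global Consistency, F = (I - alpha S)^(-1) Y, on a dense RBF graph. The synthetic two-blob dataset, the values of `alpha` and `sigma`, the uniform label-flipping noise model, and the "suspect" heuristic (flagging a labeled point when the score it receives from its neighbors disagrees with its given label) are all assumptions introduced here for illustration.

```python
import numpy as np

def make_blobs(n_per_class=100, dim=2, sep=6.0, seed=0):
    """Two Gaussian clusters (hypothetical toy data, not from the paper)."""
    rng = np.random.default_rng(seed)
    X0 = rng.normal(0.0, 1.0, size=(n_per_class, dim))
    X1 = rng.normal(0.0, 1.0, size=(n_per_class, dim))
    X1[:, 0] += sep  # shift the second cluster along the first axis
    return np.vstack([X0, X1]), np.repeat([0, 1], n_per_class)

def lgc_scores(X, y_observed, labeled_idx, n_classes=2, alpha=0.9, sigma=1.0):
    """Local and Global Consistency scores F = (I - alpha*S)^(-1) Y."""
    # Dense RBF affinity matrix with a zero diagonal.
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    # Symmetric normalization S = D^(-1/2) W D^(-1/2).
    d = W.sum(axis=1)
    S = W / np.sqrt(np.outer(d, d))
    # One-hot rows for labeled points, zero rows for unlabeled points.
    Y = np.zeros((len(X), n_classes))
    Y[labeled_idx, y_observed[labeled_idx]] = 1.0
    F = np.linalg.solve(np.eye(len(X)) - alpha * S, Y)
    return F, Y

def run_trial(n_labeled=20, noise_rate=0.2, seed=0):
    """One trial: returns (accuracy on unlabeled points, flipped, suspects)."""
    rng = np.random.default_rng(seed)
    X, y_true = make_blobs(seed=seed)
    labeled_idx = rng.choice(len(X), size=n_labeled, replace=False)
    # Assumed noise model: flip a fixed fraction of the observed labels.
    n_flip = int(round(noise_rate * n_labeled))
    flipped = rng.choice(labeled_idx, size=n_flip, replace=False)
    y_obs = y_true.copy()
    y_obs[flipped] = 1 - y_obs[flipped]
    F, Y = lgc_scores(X, y_obs, labeled_idx)
    pred = F.argmax(axis=1)
    unlabeled = np.setdiff1d(np.arange(len(X)), labeled_idx)
    acc = float((pred[unlabeled] == y_true[unlabeled]).mean())
    # Heuristic: F - Y equals alpha*S*F, the score received from neighbors.
    received = (F - Y)[labeled_idx].argmax(axis=1)
    suspects = labeled_idx[received != y_obs[labeled_idx]]
    return acc, flipped, suspects
```

Calling `run_trial` over a grid of `n_labeled` and `noise_rate` values, and over several seeds, gives an accuracy table of the general shape the abstract refers to; comparing `flipped` with `suspects` gives a rough sense of how many corrupted labels the heuristic recovers in each setting.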

DOI: 10.1145/3341105.3374013

Categories: cs.LG, stat.ML

Keywords: label noise, graph-based semi-supervised learning, input data distribution

Tags: journal article

Related articles:

arXiv:1905.10769 [cs.LG] (Published 2019-05-26)

A Flexible Generative Framework for Graph-based Semi-supervised Learning

Jiaqi Ma, Weijing Tang, Ji Zhu, Qiaozhu Mei

arXiv:1309.6818 [cs.LG] (Published 2013-09-26)

Boosting in the presence of label noise

Jakramate Bootkrajang, Ata Kaban

arXiv:2011.03687 [cs.LG] (Published 2020-11-07)