arXiv:2310.03833 [cs.LG]

Learning A Disentangling Representation For PU Learning

Omar Zamzam, Haleh Akrami, Mahdi Soltanolkotabi, Richard Leahy

Published 2023-10-05 (Version 1)

In this paper, we address the problem of learning a binary (positive vs. negative) classifier from positive and unlabeled data, a setting commonly referred to as PU learning. Although rudimentary techniques such as clustering, out-of-distribution detection, or positive density estimation can solve the problem in low-dimensional settings, their efficacy progressively deteriorates in higher dimensions due to the increasing complexity of the data distribution. We propose to learn a neural network-based data representation using a loss function that projects the unlabeled data into two (positive and negative) clusters that can be easily identified using simple clustering techniques, effectively emulating the phenomenon observed in low-dimensional settings. We adopt a vector quantization technique for the learned representations to amplify the separation between the learned unlabeled-data clusters. Experiments on simulated PU data demonstrate the improved performance of our proposed method compared to current state-of-the-art approaches. We also provide theoretical justification for our two-cluster-based approach and our algorithmic choices.
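The low-dimensional phenomenon the abstract alludes to can be illustrated with a toy sketch: when positives and negatives form well-separated clusters, simple 2-means clustering on the unlabeled set recovers the two classes, and the labeled positives tell us which cluster is the positive one. This is not the paper's neural representation-learning method; the data generation, cluster locations, and deterministic center initialization below are illustrative assumptions only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy PU setup: labeled positives and an unlabeled mix of
# positives (around +2) and negatives (around -2) in 2-D.
pos_labeled = rng.normal(loc=2.0, scale=0.5, size=(100, 2))
unl_pos = rng.normal(loc=2.0, scale=0.5, size=(150, 2))
unl_neg = rng.normal(loc=-2.0, scale=0.5, size=(150, 2))
unlabeled = np.vstack([unl_pos, unl_neg])
true_labels = np.array([1] * 150 + [0] * 150)  # held out, for evaluation only

# Simple 2-means on the unlabeled data, with deterministic
# initialization at the data extremes (an illustrative choice).
centers = np.array([unlabeled.max(axis=0), unlabeled.min(axis=0)])
for _ in range(50):
    dists = np.linalg.norm(unlabeled[:, None, :] - centers[None, :, :], axis=2)
    assign = dists.argmin(axis=1)
    centers = np.array([unlabeled[assign == k].mean(axis=0) for k in range(2)])

# The labeled positives disambiguate which cluster is "positive":
# pick the center closest to their mean.
pos_cluster = np.linalg.norm(centers - pos_labeled.mean(axis=0), axis=1).argmin()
pred = (assign == pos_cluster).astype(int)
accuracy = (pred == true_labels).mean()
```

In high dimensions this separation collapses, which is what motivates learning a representation in which the unlabeled data again falls into two easily clustered groups.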

Related articles
arXiv:1811.04820 [cs.LG] (Published 2018-11-12)
Learning From Positive and Unlabeled Data: A Survey
arXiv:1809.05710 [cs.LG] (Published 2018-09-15)
Alternate Estimation of a Classifier and the Class-Prior from Positive and Unlabeled Data
arXiv:1911.08696 [cs.LG] (Published 2019-11-20)
Where is the Bottleneck of Adversarial Learning with Unlabeled Data?