arXiv:2207.06224 [cs.CV]
Beyond Hard Labels: Investigating data label distributions
Vasco Grossmann, Lars Schmarje, Reinhard Koch
Published 2022-07-13, Version 1
High-quality data is a key aspect of modern machine learning. However, labels generated by humans suffer from issues such as label noise and class ambiguity. We raise the question of whether hard labels are sufficient to represent the underlying ground-truth distribution in the presence of this inherent imprecision. To answer it, we compare learning with hard and soft labels quantitatively and qualitatively on a synthetic and a real-world dataset. We show that using soft labels improves performance and yields a more regular structure of the internal feature space.
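The comparison hinges on how the training target is encoded: a hard label assigns all probability mass to a single class, whereas a soft label spreads it according to the (possibly ambiguous) annotator distribution. The following is a minimal sketch, not the authors' code, illustrating this difference with a standard cross-entropy objective in PyTorch; the annotator vote counts are hypothetical placeholders.

```python
# Minimal sketch (not the paper's implementation): hard vs. soft label targets
# for the same batch of logits, trained with cross-entropy.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
num_classes = 4
logits = torch.randn(8, num_classes)                 # model outputs for 8 samples

# Hard labels: a single class index per sample (e.g. annotator majority vote).
hard_targets = torch.randint(0, num_classes, (8,))
hard_loss = F.cross_entropy(logits, hard_targets)

# Soft labels: the empirical annotator distribution per sample
# (hypothetical vote counts, normalized to a probability distribution).
votes = torch.randint(0, 5, (8, num_classes)).float() + 1.0
soft_targets = votes / votes.sum(dim=1, keepdim=True)
soft_loss = -(soft_targets * F.log_softmax(logits, dim=1)).sum(dim=1).mean()

print(f"hard-label loss: {hard_loss.item():.4f}")
print(f"soft-label loss: {soft_loss.item():.4f}")
```

Under ambiguous annotations, the soft target keeps the disagreement visible to the model instead of collapsing it to a single class, which is the effect the abstract attributes to the improved performance and feature-space regularity.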