arXiv:2303.08360 [cs.CV]

Knowledge Distillation from Single to Multi Labels: an Empirical Study

Youcai Zhang, Yuzhuo Qin, Hengwei Liu, Yanhao Zhang, Yaqian Li, Xiaodong Gu

Published 2023-03-15 (Version 1)

Knowledge distillation (KD) has been extensively studied in single-label image classification, but its efficacy for multi-label classification remains relatively unexplored. In this study, we first investigate the effectiveness of classical KD techniques, including logit-based and feature-based methods, for multi-label classification. Our findings indicate that the logit-based method is not well suited for multi-label classification, as the teacher fails to provide inter-category similarity information or a regularization effect on the student model's training. Moreover, we observe that feature-based methods struggle to convey compact information about multiple labels simultaneously. Given these limitations, we propose that suitable dark knowledge should incorporate class-wise information and be highly correlated with the final classification results. To address these issues, we introduce a novel distillation method based on Class Activation Maps (CAMs), which is both effective and straightforward to implement. Across a wide range of settings, CAMs-based distillation consistently outperforms other methods.
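To make the idea concrete, the following is a minimal sketch of what a CAM-based distillation loss could look like. It is an illustration of the general technique, not the paper's exact formulation: CAMs are formed by weighting feature maps with the classifier weights, and the student's per-class maps are matched to the teacher's after L2 normalization. All function names and the choice of an MSE matching loss are assumptions for illustration.

```python
import numpy as np

def class_activation_maps(feats, fc_weight):
    """Compute class activation maps.

    feats:     (B, C, H, W) backbone feature maps
    fc_weight: (K, C) classifier weights for K classes
    returns:   (B, K, H, W) one spatial map per class
    """
    return np.einsum("bchw,kc->bkhw", feats, fc_weight)

def cam_distill_loss(s_feats, s_fc_w, t_feats, t_fc_w):
    """MSE between L2-normalized per-class activation maps of
    student and teacher (an assumed matching loss, for illustration)."""
    s_cams = class_activation_maps(s_feats, s_fc_w)
    t_cams = class_activation_maps(t_feats, t_fc_w)

    def normalize(x):
        b, k, h, w = x.shape
        x = x.reshape(b, k, h * w)
        return x / (np.linalg.norm(x, axis=-1, keepdims=True) + 1e-8)

    return float(np.mean((normalize(s_cams) - normalize(t_cams)) ** 2))
```

Because each class gets its own spatial map, the transferred signal is class-wise and tied directly to the classification head, which is the property the abstract argues suitable dark knowledge should have in the multi-label setting.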

Related articles:
arXiv:2211.03946 [cs.CV] (Published 2022-11-08)
Understanding the Role of Mixup in Knowledge Distillation: An Empirical Study
arXiv:1907.09643 [cs.CV] (Published 2019-07-23)
Highlight Every Step: Knowledge Distillation via Collaborative Teaching
arXiv:1904.01802 [cs.CV] (Published 2019-04-03)
Correlation Congruence for Knowledge Distillation
Baoyun Peng et al.