arXiv:2101.05544 [cs.LG]

DICE: Diversity in Deep Ensembles via Conditional Redundancy Adversarial Estimation

Alexandre Rame, Matthieu Cord

Published 2021-01-14, Version 1

Deep ensembles perform better than a single network thanks to the diversity among their members. Recent approaches regularize predictions to increase diversity; however, they also drastically decrease individual members' performance. In this paper, we argue that learning strategies for deep ensembles need to tackle the trade-off between ensemble diversity and individual accuracies. Motivated by arguments from information theory and leveraging recent advances in neural estimation of conditional mutual information, we introduce a novel training criterion called DICE: it increases diversity by reducing spurious correlations among features. The main idea is that features extracted from pairs of members should only share information useful for target class prediction, without being conditionally redundant. Therefore, besides the classification loss with information bottleneck, we adversarially prevent features from being conditionally predictable from each other. We manage to reduce simultaneous errors while protecting class information. We obtain state-of-the-art accuracy results on CIFAR-10/100: for example, an ensemble of 5 networks trained with DICE matches an ensemble of 7 networks trained independently. We further analyze the consequences on calibration, uncertainty estimation, out-of-distribution detection and online co-distillation.
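The abstract's key quantity is the conditional mutual information between two members' features given the label: DICE penalizes it so that members share only class-relevant information. Neural estimators of this quantity typically rely on the Donsker–Varadhan lower bound, I(Z1; Z2 | Y) >= E_joint[T] - log E_marginal[exp(T)], where "marginal" samples are built by shuffling one member's features within each class. The following NumPy sketch (not the authors' code; the critic T is a fixed toy function standing in for a trained discriminator, and all variable names are illustrative) shows how the bound detects conditional redundancy between two deliberately redundant feature extractors:

```python
import numpy as np

rng = np.random.default_rng(0)

def dv_bound(t_joint, t_marginal):
    """Donsker-Varadhan lower bound on mutual information:
    I >= E_joint[T] - log E_marginal[exp(T)]."""
    return t_joint.mean() - np.log(np.exp(t_marginal).mean())

def critic(z1, z2):
    """Toy critic T(z1, z2): a dot product, standing in for the
    adversarially trained discriminator network in DICE."""
    return (z1 * z2).sum(axis=1)

n, d = 512, 8
y = rng.integers(0, 2, size=n)               # class labels
shared = rng.normal(size=(n, d))             # signal both members pick up
z1 = shared + 0.1 * rng.normal(size=(n, d))  # member-1 features
z2 = shared + 0.1 * rng.normal(size=(n, d))  # member-2 features: redundant with z1

# "Conditional marginal" samples: permute z2 *within each class*, so
# (z1, z2) become conditionally independent given y. This within-class
# shuffling is what makes the estimate conditional on the label.
perm = np.arange(n)
for c in (0, 1):
    idx = np.where(y == c)[0]
    perm[idx] = rng.permutation(idx)
z2_shuffled = z2[perm]

bound = dv_bound(critic(z1, z2), critic(z1, z2_shuffled))
print(round(float(bound), 3))  # positive: members are conditionally redundant
```

In DICE the critic is itself a network trained to maximize this bound, while the ensemble members are trained to minimize it (plus the classification and information-bottleneck terms), driving the conditional redundancy between member features toward zero.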

Comments: Published as a conference paper at ICLR 2021. 9 main pages, 13 figures, 12 tables
Categories: cs.LG, cs.CV, cs.IT, math.IT