arXiv:2005.10284 [cs.LG]

An Adversarial Approach for Explaining the Predictions of Deep Neural Networks

Arash Rahnama, Andrew Tseng

Published 2020-05-20 (Version 1)

Machine learning models have been successfully applied to a wide range of applications, including computer vision, natural language processing, and speech recognition. Successful implementations, however, usually rely on deep neural networks (DNNs), which are treated as opaque black-box systems because of their complexity and intricate internal mechanisms. In this work, we present a novel algorithm for explaining the predictions of a DNN using adversarial machine learning. Our approach identifies the relative importance of input features to a prediction from the behavior of an adversarial attack on the DNN. The algorithm has the advantage of being fast, consistent, and easy to implement and interpret. We present a detailed analysis demonstrating that, for a given DNN and task, the behavior of an adversarial attack stays consistent across input test data points, which establishes the generality of our approach and enables us to produce consistent and efficient explanations. We illustrate the effectiveness of our approach with experiments over a variety of DNNs, tasks, and datasets, and we compare our work with other well-known techniques in the current literature.
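The abstract does not spell out the authors' algorithm, but the core idea of reading feature importance off an adversarial attack can be sketched with a standard gradient-based attack direction (as in FGSM). The sketch below is an illustration under that assumption, not the paper's method: a toy linear "network" stands in for a trained DNN, and features are ranked by the magnitude of the cross-entropy loss gradient with respect to the input, i.e. the direction an adversarial perturbation would push each feature.

```python
import numpy as np

# Toy linear "network": logits = W @ x + b (a stand-in for a trained DNN).
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 5))   # 3 classes, 5 input features
b = np.zeros(3)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def adversarial_importance(x, true_class):
    """Rank input features by the magnitude of the cross-entropy loss
    gradient w.r.t. the input -- the direction an FGSM-style attack
    would perturb. (Illustrative proxy, not the paper's algorithm.)"""
    p = softmax(W @ x + b)
    # d(loss)/d(logits) for cross-entropy loss: p - one_hot(true_class)
    dlogits = p.copy()
    dlogits[true_class] -= 1.0
    dinput = W.T @ dlogits        # chain rule back to the input features
    return np.abs(dinput)         # larger magnitude = more influential

x = rng.normal(size=5)
scores = adversarial_importance(x, true_class=1)
ranking = np.argsort(-scores)     # most important feature first
print(ranking)
```

For a real DNN the input gradient would come from automatic differentiation rather than an explicit chain-rule step, but the explanation read off the attack direction is the same kind of per-feature saliency score.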

Related articles:
arXiv:2001.07769 [cs.LG] (Published 2020-01-21)
Massif: Interactive Interpretation of Adversarial Attacks on Deep Learning
Nilaksh Das et al.
arXiv:1811.11705 [cs.LG] (Published 2018-11-28)
An Adversarial Approach for Explainable AI in Intrusion Detection Systems
arXiv:1909.11835 [cs.LG] (Published 2019-09-26)
GAMIN: An Adversarial Approach to Black-Box Model Inversion