arXiv:2307.03305 [cs.LG]
A Vulnerability of Attribution Methods Using Pre-Softmax Scores
Published 2023-07-06, Version 1
We discuss a vulnerability involving a category of attribution methods used to explain the outputs of convolutional neural networks working as classifiers. It is known that networks of this type are vulnerable to adversarial attacks, in which imperceptible perturbations of the input may alter the outputs of the model. In contrast, here we focus on the effects that small modifications of the model may have on the attribution method without altering the model outputs.
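The core observation can be illustrated with a minimal sketch. The construction below is my illustrative choice, not necessarily the paper's: softmax is invariant to adding the same value to every logit, so appending an input-dependent term g(x) to all pre-softmax scores leaves every output probability unchanged while altering the gradients (and hence gradient-based attributions) of those scores.

```python
import numpy as np

# A tiny linear "classifier" z = W x (illustrative stand-in for a CNN).
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 5))   # 3 classes, 5 input features
v = rng.normal(size=5)        # attack direction (hypothetical)
x = rng.normal(size=5)        # an arbitrary input

def softmax(z):
    e = np.exp(z - z.max())   # shift for numerical stability
    return e / e.sum()

z_orig = W @ x                # original pre-softmax scores
z_mod = W @ x + (v @ x)       # every logit shifted by g(x) = v . x

# Model outputs are identical for every input...
assert np.allclose(softmax(z_orig), softmax(z_mod))

# ...but the gradient of a pre-softmax score w.r.t. the input changes:
grad_orig = W[0]              # d z_0 / d x for the linear model
grad_mod = W[0] + v           # d (z_0 + v . x) / d x
print(np.abs(grad_orig - grad_mod).max())  # nonzero: attributions differ
```

Post-softmax attributions would be immune to this particular shift, since the shifted term cancels in the softmax quotient.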
Comments: 7 pages, 5 figures
Related articles:
arXiv:2306.13197 [cs.LG] (Published 2023-06-22)
Pre or Post-Softmax Scores in Gradient-based Attribution Methods, What is Best?
arXiv:2103.13533 [cs.LG] (Published 2021-03-25)
Symmetry-Preserving Paths in Integrated Gradients
arXiv:2104.11695 [cs.LG] (Published 2021-04-23)
A Framework for Unsupervised Classification and Data Mining of Tweets about Cyber Vulnerabilities