arXiv:1911.11253 [cs.LG]

Playing it Safe: Adversarial Robustness with an Abstain Option

Published 2019-11-25, Version 1

We explore adversarial robustness in the setting in which it is acceptable for a classifier to abstain (that is, to output no class) on adversarial examples. Adversarial examples are slightly perturbed versions of normal inputs that cause a classifier to give incorrect output; they present security and safety challenges for machine learning systems. In many safety-critical applications, it is less costly for a classifier to abstain on adversarial examples than to give incorrect output for them. We first introduce a novel objective function for adversarial robustness with an abstain option, which characterizes an explicit tradeoff between robustness and accuracy. We then present a simple baseline in which an adversarially trained classifier abstains on all inputs within a certain distance of the decision boundary, and we evaluate this baseline theoretically and experimentally. Finally, we propose Combined Abstention Robustness Learning (CARL), a method for jointly learning a classifier and the region of the input space on which it should abstain. We explore different variations of the PGD and DeepFool adversarial attacks on CARL in the abstain setting. Evaluating against these attacks, we demonstrate that training with CARL results in a more accurate, robust, and efficient classifier than the baseline.
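The baseline idea of abstaining near the decision boundary can be illustrated with a small sketch. This is not the authors' implementation: it assumes a linear binary classifier, for which the L2 distance to the boundary has the closed form |w·x + b| / ||w||, whereas the abstract describes an adversarially trained classifier whose boundary distance would have to be estimated. The names `ABSTAIN`, `distance_to_boundary`, `predict_with_abstain`, and `margin` are hypothetical.

```python
import math

ABSTAIN = None  # hypothetical sentinel meaning "output no class"


def score(w, b, x):
    """Signed linear score w.x + b (assumed linear classifier)."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def distance_to_boundary(w, b, x):
    """L2 distance from x to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return abs(score(w, b, x)) / norm


def predict_with_abstain(w, b, x, margin):
    """Return class 0 or 1, or ABSTAIN if x is closer than `margin` to the boundary."""
    if distance_to_boundary(w, b, x) < margin:
        return ABSTAIN
    return 1 if score(w, b, x) > 0 else 0


# Illustrative weights and points (made up for this sketch).
w, b = [3.0, 4.0], -5.0
far_point = [3.0, 4.0]    # well inside the class-1 region
near_point = [1.0, 0.6]   # very close to the boundary

print(predict_with_abstain(w, b, far_point, margin=0.5))   # classified
print(predict_with_abstain(w, b, near_point, margin=0.5))  # abstains
```

In this sketch, a larger `margin` makes the classifier abstain on more inputs, including unperturbed ones that happen to lie near the boundary, which is one way to picture the tradeoff between robustness and accuracy that the abstract mentions.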

Categories: cs.LG, cs.AI, stat.ML

Keywords: adversarial robustness, abstain option, incorrect output, DeepFool adversarial attacks, adversarial examples

Related articles:

arXiv:2006.10885 [cs.LG] (Published 2020-06-18)

The Dilemma Between Dimensionality Reduction and Adversarial Robustness

Sheila Alemany, Niki Pissinou

arXiv:2102.11069 [cs.LG] (Published 2021-02-19)

A PAC-Bayes Analysis of Adversarial Robustness

Guillaume Vidot, Paul Viallard, Amaury Habrard, Emilie Morvant

arXiv:2108.10451 [cs.LG] (Published 2021-08-24)

Adversarial Robustness of Deep Learning: Theory, Algorithms, and Applications

Wenjie Ruan, Xinping Yi, Xiaowei Huang