arXiv:2308.03792 [cs.CV]

Multi-attacks: Many images $+$ the same adversarial attack $\to$ many target labels

Stanislav Fort

Published 2023-08-04 (Version 1)

We show that we can easily design a single adversarial perturbation $P$ that changes the classes of $n$ images $X_1, X_2, \dots, X_n$ from their original, unperturbed classes $c_1, c_2, \dots, c_n$ to desired (not necessarily all the same) target classes $c^*_1, c^*_2, \dots, c^*_n$, for up to hundreds of images and target classes at once. We call these \textit{multi-attacks}. By characterizing the maximum $n$ achievable under different conditions such as image resolution, we estimate the number of regions of high class confidence around a particular image in pixel space to be around $10^{\mathcal{O}(100)}$, posing a significant problem for exhaustive defense strategies. We show two immediate consequences of this: adversarial attacks whose resulting class depends on their intensity, and scale-independent adversarial examples. To demonstrate the redundancy and richness of class decision boundaries in pixel space, we search for two-dimensional sections of it that trace images and spell words using particular classes. We also show that ensembling reduces susceptibility to multi-attacks, and that classifiers trained on random labels are more susceptible. Our code is available on GitHub.
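The construction described above amounts to a joint optimization: a single perturbation $P$ is adjusted so that the classifier assigns each perturbed image $X_i + P$ its target class $c^*_i$, for instance by minimizing the targeted cross-entropy loss summed over all $n$ images. The following is a minimal PyTorch sketch of that idea; the function name, choice of Adam, the $L_\infty$ budget, and the hyperparameters are illustrative assumptions rather than the authors' exact implementation (see the linked GitHub repository for that).

import torch
import torch.nn.functional as F

def multi_attack(model, images, targets, eps=8/255, steps=500, lr=1e-2):
    # Optimize ONE shared perturbation P so that model(X_i + P) is classified
    # as targets[i] for every image simultaneously (a "multi-attack").
    # images: (n, C, H, W) tensor with values in [0, 1]; targets: (n,) tensor of class indices.
    # Hypothetical sketch of the abstract's construction, not the paper's exact code.
    model.eval()
    P = torch.zeros_like(images[:1], requires_grad=True)   # one perturbation, broadcast to all n images
    opt = torch.optim.Adam([P], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        logits = model((images + P).clamp(0.0, 1.0))        # the same P is added to every image
        loss = F.cross_entropy(logits, targets)             # targeted loss, averaged over the n images
        loss.backward()
        opt.step()
        with torch.no_grad():
            P.clamp_(-eps, eps)                              # keep the attack within a small L_inf budget
    return P.detach()

In this formulation the attack succeeds only when all $n$ targeted losses can be driven down with one shared $P$, so the largest achievable $n$ directly probes how many distinct high-confidence class regions surround the images in pixel space, which is what the $10^{\mathcal{O}(100)}$ estimate quantifies.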

Comments: Code at https://github.com/stanislavfort/multi-attacks
Categories: cs.CV, cs.CR, cs.LG
Related articles:
arXiv:2210.05968 [cs.CV] (Published 2022-10-12)
Boosting the Transferability of Adversarial Attacks with Reverse Adversarial Perturbation
arXiv:2103.09448 [cs.CV] (Published 2021-03-17)
Adversarial Attacks on Camera-LiDAR Models for 3D Car Detection
arXiv:2107.06501 [cs.CV] (Published 2021-07-14)
AdvFilter: Predictive Perturbation-aware Filtering against Adversarial Attack via Multi-domain Learning