arXiv:1705.06452 [stat.ML]
Delving into adversarial attacks on deep policies
Published 2017-05-18 (Version 1)
Adversarial examples have been shown to exist for a variety of deep learning architectures. Deep reinforcement learning has shown promising results in training agent policies directly on raw inputs such as image pixels. In this paper we present a novel study of adversarial attacks on deep reinforcement learning policies. We compare the effectiveness of attacks using adversarial examples against attacks using random noise. We present a novel method, based on the value function, for reducing the number of times adversarial examples need to be injected for a successful attack. We further explore how re-training on random noise and FGSM perturbations affects resilience against adversarial examples.
Comments: ICLR 2017 Workshop
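The abstract names two concrete ingredients: FGSM perturbations of the observation and a value-function signal for deciding when to inject them. The sketch below illustrates both, assuming a PyTorch policy network over discrete actions; the cross-entropy surrogate loss, the function names, and the fixed value threshold are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn.functional as F

def fgsm_perturb(policy, obs, epsilon=0.01):
    """One FGSM step against a policy network.

    `policy` is assumed to map an observation tensor of shape (1, ...)
    to action logits. The attack ascends the gradient of a surrogate
    loss: cross-entropy against the action the policy currently prefers,
    so the perturbation pushes the policy away from its own choice.
    """
    obs = obs.clone().detach().requires_grad_(True)
    logits = policy(obs)
    target = logits.argmax(dim=-1)            # the policy's preferred action
    loss = F.cross_entropy(logits, target)
    loss.backward()
    # Single signed-gradient step, clipped back to the valid pixel range.
    adv_obs = obs + epsilon * obs.grad.sign()
    return adv_obs.clamp(0.0, 1.0).detach()

def should_attack(value_fn, obs, threshold=0.5):
    """Gate injection on the critic's value estimate.

    The idea sketched here: perturb only in states the agent values
    highly, so fewer injections are needed for a successful attack.
    The scalar `threshold` is a hypothetical tuning knob.
    """
    with torch.no_grad():
        return value_fn(obs).item() > threshold
```

In an episode loop one would call `should_attack` on each observation and substitute `fgsm_perturb(policy, obs)` for the clean frame only when it returns true; this is an untargeted attack, the form these early attacks-on-RL papers typically study.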
Related articles:
arXiv:1805.10652 [stat.ML] (Published 2018-05-27)
Defending Against Adversarial Attacks by Leveraging an Entire GAN
arXiv:2206.03353 [stat.ML] (Published 2022-06-07)
Adaptive Regularization for Adversarial Training
arXiv:1906.00230 [stat.ML] (Published 2019-06-01)
Disentangling Improves VAEs' Robustness to Adversarial Attacks