arXiv Analytics

arXiv:1810.01279 [cs.LG]

Adv-BNN: Improved Adversarial Defense through Robust Bayesian Neural Network

Xuanqing Liu, Yao Li, Chongruo Wu, Cho-Jui Hsieh

Published 2018-10-01 (Version 1)

We present a new algorithm to train a robust neural network against adversarial attacks. Our algorithm is motivated by the following two ideas. First, although recent work has demonstrated that injecting randomness can improve the robustness of neural networks (Liu 2017), we observe that blindly adding noise to all the layers is not the optimal way to incorporate randomness. Instead, we model randomness under the framework of Bayesian Neural Networks (BNNs) to formally learn the posterior distribution of models in a scalable way. Second, we formulate a mini-max problem in the BNN to learn the best model distribution under adversarial attacks, leading to an adversarially trained Bayesian neural network. Experimental results demonstrate that the proposed algorithm achieves state-of-the-art performance under strong attacks. On CIFAR-10 with the VGG network, our model improves accuracy by 14% compared with adversarial training (Madry 2017) and random self-ensemble (Liu 2017) under PGD attack with $0.035$ distortion, and the gap becomes even larger on a subset of ImageNet.
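Concretely, the mini-max formulation can be read as a robust variant of the usual variational (ELBO) objective, with the log-likelihood evaluated at worst-case perturbed inputs (a paraphrase in our own notation, not quoted from the paper):

$$\max_{q}\; \mathbb{E}_{w\sim q}\Big[\sum_i \min_{\|\delta_i\|_\infty\le\gamma}\log p(y_i\mid x_i+\delta_i, w)\Big] - \mathrm{KL}(q\,\|\,p)$$

The following PyTorch sketch illustrates one training step of this loop: a mean-field Bayesian layer whose weights are resampled on every forward pass, an $\ell_\infty$ PGD inner adversary, and an outer ELBO-style update. The toy architecture, hyperparameters, and helper names are illustrative assumptions, not the authors' released code (see the repository linked under Comments).

import torch
import torch.nn as nn
import torch.nn.functional as F

class BayesLinear(nn.Module):
    """Mean-field Gaussian linear layer; weights are sampled each forward pass."""
    def __init__(self, d_in, d_out):
        super().__init__()
        self.mu = nn.Parameter(0.05 * torch.randn(d_out, d_in))
        self.rho = nn.Parameter(torch.full((d_out, d_in), -5.0))  # sigma = softplus(rho)
        self.bias = nn.Parameter(torch.zeros(d_out))

    def forward(self, x):
        sigma = F.softplus(self.rho)
        w = self.mu + sigma * torch.randn_like(sigma)  # reparameterization trick
        return F.linear(x, w, self.bias)

    def kl(self):
        # Closed-form KL(q || p) against a standard normal prior.
        sigma = F.softplus(self.rho)
        return 0.5 * (sigma.pow(2) + self.mu.pow(2) - 1 - 2 * sigma.log()).sum()

def pgd_attack(model, x, y, eps=0.035, steps=10):
    """Inner maximization: L_inf PGD ascent on the loss within an eps-ball."""
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        F.cross_entropy(model(x + delta), y).backward()
        with torch.no_grad():
            delta += (eps / 4) * delta.grad.sign()
            delta.clamp_(-eps, eps)
        delta.grad.zero_()
    return (x + delta).detach()

model = nn.Sequential(BayesLinear(784, 256), nn.ReLU(), BayesLinear(256, 10))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

x, y = torch.rand(32, 784), torch.randint(0, 10, (32,))  # stand-in batch
x_adv = pgd_attack(model, x, y)                           # inner loop: worst-case inputs
kl = sum(m.kl() for m in model if isinstance(m, BayesLinear))
loss = F.cross_entropy(model(x_adv), y) + kl / 50_000     # outer ELBO-style objective
opt.zero_grad(); loss.backward(); opt.step()              # zero_grad also clears attack grads

Note that because the weights are resampled on every forward pass, each PGD step attacks a fresh weight sample; averaging attack gradients over several samples (an expectation-over-transformation style adversary) would give a stronger inner loop.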

Comments: Code will be made available at https://github.com/xuanqing94/BayesianDefense
Categories: cs.LG, cs.AI, cs.CR, stat.ML
Related articles:
arXiv:1909.08072 [cs.LG] (Published 2019-09-17)
Adversarial Attacks and Defenses in Images, Graphs and Text: A Review
arXiv:1702.02284 [cs.LG] (Published 2017-02-08)
Adversarial Attacks on Neural Network Policies
arXiv:2007.06381 [cs.LG] (Published 2020-07-13)
A simple defense against adversarial attacks on heatmap explanations