ACHIEVING OPTIMAL ADVERSARIAL ACCURACY FOR ADVERSARIAL DEEP LEARNING USING STACKELBERG GAMES

  • Xiao-shan GAO,
  • Shuang LIU,
  • Lijia YU
  • Academy of Mathematics and Systems Science, Chinese Academy of Sciences, University of Chinese Academy of Sciences, Beijing 100190, China

Received date: 2022-07-17

Online published: 2022-12-16

Supported by

This work was partially supported by NSFC (12288201) and NKRDP grant (2018YFA0704705).

Abstract

The purpose of adversarial deep learning is to train DNNs that are robust against adversarial attacks, and this is one of the major research focuses of deep learning. Game theory has been used to answer some of the basic questions about adversarial deep learning, such as whether a classifier with optimal robustness exists and whether optimal adversarial samples exist for a given class of classifiers. In most previous works, adversarial deep learning was formulated as a simultaneous game, and the strategy spaces were assumed to be certain probability distributions so that a Nash equilibrium exists; this assumption does not hold in practical situations. In this paper, we answer these basic questions for the practical case where the classifiers are DNNs with a given structure, by formulating adversarial deep learning as a Stackelberg game. The existence of Stackelberg equilibria for these games is proven. Furthermore, it is shown that when the Carlini-Wagner margin loss is used, the equilibrium DNN has the largest adversarial accuracy among all DNNs with the same structure. The trade-off between robustness and accuracy in adversarial deep learning is also studied from a game-theoretic perspective.
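
To make the leader-follower structure concrete, here is a minimal training sketch of this kind of Stackelberg formulation, assuming PyTorch. The classifier (leader) commits to its parameters first; the attacker (follower) then best-responds by maximizing the Carlini-Wagner margin loss within an l∞ ball, approximated below by projected gradient ascent. The function names, the PGD approximation of the best response, and all hyperparameters are illustrative assumptions, not the paper's actual construction.

```python
import torch

def cw_margin_loss(logits, labels):
    # Carlini-Wagner margin: (largest wrong-class logit) - (true-class logit);
    # positive exactly when the input is misclassified.
    true_logit = logits.gather(1, labels.unsqueeze(1)).squeeze(1)
    masked = logits.clone()
    masked.scatter_(1, labels.unsqueeze(1), float('-inf'))
    wrong_logit = masked.max(dim=1).values
    return wrong_logit - true_logit

def follower_best_response(model, x, y, eps=8/255, steps=10, step_size=2/255):
    # Follower: approximate the optimal perturbation by projected gradient
    # ascent on the margin loss over an l-infinity ball of radius eps.
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        cw_margin_loss(model(x + delta), y).mean().backward()
        with torch.no_grad():
            delta += step_size * delta.grad.sign()
            delta.clamp_(-eps, eps)
        delta.grad = None
    return delta.detach()

def leader_step(model, optimizer, x, y):
    # Leader: update the weights against the follower's best response
    # (one step of the outer problem; zero_grad also discards gradients
    # accumulated on the weights during the inner attack).
    delta = follower_best_response(model, x, y)
    optimizer.zero_grad()
    cw_margin_loss(model(x + delta), y).mean().backward()
    optimizer.step()
```

The sequential order is the point: the leader's gradient is evaluated at the follower's (approximate) best response, rather than both players moving simultaneously as in a Nash formulation.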

Cite this article

Xiao-shan GAO, Shuang LIU, Lijia YU. ACHIEVING OPTIMAL ADVERSARIAL ACCURACY FOR ADVERSARIAL DEEP LEARNING USING STACKELBERG GAMES[J]. Acta Mathematica Scientia, Series B, 2022, 42(6): 2399-2418. DOI: 10.1007/s10473-022-0613-y

References

[1] Athalye A, Carlini N, Wagner D. Obfuscated gradients give a false sense of security: circumventing defenses to adversarial examples. Proc ICML, PMLR, 2018, 80: 274-283
[2] Athalye A, Engstrom L, Ilyas A, Kwok K. Synthesizing robust adversarial examples. Proc ICML, PMLR, 2018
[3] Azulay A, Weiss Y. Why do deep convolutional networks generalize so poorly to small image transformations? Journal of Machine Learning Research, 2019, 20: 1-25
[4] Bastounis A, Hansen A C, Vlačić V. The mathematics of adversarial attacks in AI - Why deep learning is unstable despite the existence of stable neural networks. arXiv:2109.06098, 2021
[5] Bose J, Gidel G, Berard H, Cianflone A, Vincent P, Lacoste-Julien S, Hamilton W. Adversarial example games. Proc NeurIPS, 2020
[6] Carlini N, Wagner D. Towards evaluating the robustness of neural networks. Proc of IEEE Symposium on Security and Privacy, IEEE Press, 2017: 39-57
[7] Carlini N, Wagner D. Adversarial examples are not easily detected: bypassing ten detection methods. Proc 10th ACM Workshop on Artificial Intelligence and Security, 2017: 3-14
[8] Chivukula A S, Yang X, Liu W, Zhu T, Zhou W. Game theoretical adversarial deep learning with variational adversaries. IEEE Trans Knowledge and Data Engineering, 2021, 33(11): 3568-3581
[9] Cohen J, Rosenfeld E, Kolter Z. Certified adversarial robustness via randomized smoothing. Proc ICML, PMLR, 2019: 1310-1320
[10] Colbrook M J, Antun V, Hansen A C. The difficulty of computing stable and accurate neural networks: on the barriers of deep learning and Smale's 18th problem. Proc of the National Academy of Sciences, 2022, 119(12): e2107151119
[11] Dalvi N, Domingos P, Mausam, Sanghai S, Verma D. Adversarial classification. Proc KDD'04. New York: ACM Press, 2004: 99-108
[12] van Damme E. Stability and Perfection of Nash Equilibria. Springer, 1987
[13] Fiez T, Chasnov B, Ratliff L J. Implicit learning dynamics in Stackelberg games: equilibria characterization, convergence analysis, and empirical study. Proc ICML, PMLR, 2020
[14] Fudenberg D, Tirole J. Game Theory. Cambridge, MA: MIT Press, 1991
[15] Glicksberg I L. A further generalization of the Kakutani fixed point theorem, with application to Nash equilibrium points. Proc AMS, 1952, 3(1): 170-174
[16] Gidel G, Balduzzi D, Czarnecki W M, Garnelo M, Bachrach Y. Minimax theorem for latent games or: how I learned to stop worrying about mixed-Nash and love neural nets. arXiv:2002.05820v1, 2020
[17] Koh P W, Liang P. Understanding black-box predictions via influence functions. Proc ICML, PMLR, 2017: 1885-1894
[18] Hsieh Y P, Liu C, Cevher V. Finding mixed Nash equilibria of generative adversarial networks. Proc ICML, PMLR, 2019
[19] Jin C, Netrapalli P, Jordan M I. What is local optimality in nonconvex-nonconcave minimax optimization? Proc ICML, PMLR, 2020
[20] Kamhoua C A, Kiekintveld C D, Fang F, Zhu Q, eds. Game Theory and Machine Learning for Cyber Security. IEEE Press and Wiley, 2021
[21] Kurakin A, Goodfellow I, Bengio S. Adversarial examples in the physical world. arXiv:1607.02533, 2016
[22] LeCun Y, Bengio Y, Hinton G. Deep learning. Nature, 2015, 521(7553): 436-444
[23] Liu Y, Wei L, Luo B, Xu Q. Fault injection attack on deep neural network. Proc of the IEEE/ACM International Conference on Computer-Aided Design, 2017: 131-138
[24] Madry A, Makelov A, Schmidt L, Tsipras D, Vladu A. Towards deep learning models resistant to adversarial attacks. arXiv:1706.06083, 2017
[25] Meunier L, Scetbon M, Pinot R, Atif J, Chevaleyre Y. Mixed Nash equilibria in the adversarial examples game. Proc ICML, PMLR, 2021, 139
[26] Montúfar G, Pascanu R, Cho K, Bengio Y. On the number of linear regions of deep neural networks. Proc NIPS’2014, 2014
[27] Moosavi-Dezfooli S M, Fawzi A, Fawzi O, Frossard P. Universal adversarial perturbations. Proc CVPR, 2017: 86-94
[28] Neyshabur B, Tomioka R, Srebro N. Norm-based capacity control in neural networks. Proc COLT’15, 2015: 1376-1401
[29] Pal A, Vidal R. A game theoretic analysis of additive adversarial attacks and defenses. Proc NeurIPS, 2020
[30] Papernot N, McDaniel P, Jha S, Fredrikson M, Celik Z B, Swami A. The limitations of deep learning in adversarial settings. Proc of IEEE European Symposium on Security and Privacy. IEEE Press, 2016: 372-387
[31] Papernot N, McDaniel P, Goodfellow I, Jha S, Celik Z B, Swami A. Practical black-box attacks against machine learning. Proc of the ACM Asia Conference on Computer and Communications Security. ACM Press, 2017: 506-519
[32] Pinot R, Ettedgui R, Rizk G, Chevaleyre Y, Atif J. Randomization matters: how to defend against strong adversarial attacks. Proc ICML, PMLR, 2020
[33] Pydi M S, Jog V. Adversarial risk via optimal transport and optimal couplings. Proc ICML, PMLR, 2020
[34] Oliehoek F A, Savani R, Gallego J, van der Pol E, Groß R. Beyond local Nash equilibria for adversarial networks. Comm Comput Inform Sci, 2019, 1021: 73-89
[35] Ren J, Zhang D, Wang Y, Chen L, Zhou Z, Chen Y, Cheng X, Wang X, Zhou M, Shi J, Zhang Q. A unified game-theoretic interpretation of adversarial robustness. arXiv:2103.07364v2, 2021
[36] Shafahi A, Huang W R, Najibi M, Suciu O, Studer C, Dumitras T, Goldstein T. Poison frogs! targeted clean-label poisoning attacks on neural networks. Proc NeurIPS, 2018: 6103-6113
[37] Shafahi A, Huang W R, Studer C, Feizi S, Goldstein T. Are adversarial examples inevitable? arXiv:1809.02104, 2018
[38] Shoham Y, Leyton-Brown K. Multiagent Systems: Algorithmic, Game-Theoretic, and Logical Foundations. Cambridge University Press, 2008
[39] Simaan M, Cruz Jr J B. On the Stackelberg strategy in nonzero-sum games. Journal of Optimization Theory and Applications, 1973, 11: 533-555
[40] Szegedy C, Zaremba W, Sutskever I, Bruna J, Erhan D, Goodfellow I J, Fergus R. Intriguing properties of neural networks. arXiv:1312.6199, 2013
[41] Tsai Y L, Hsu C Y, Yu C M, Chen P Y. Formalizing generalization and robustness of neural networks to weight perturbations. arXiv:2103.02200, 2021
[42] Tsipras D, Santurkar S, Engstrom L, Turner A, Madry A. Robustness may be at odds with accuracy. Proc ICML, PMLR, 2019
[43] Wu W T, Jiang J H. Essential equilibrium points of n-person noncooperative games. Scientia Sinica, 1962, 11(10): 1307-1322
[44] Xu H, Ma Y, Liu H C, Deb D, Liu H, Tang J L, Jain A K. Adversarial attacks and defenses in images, graphs and text: a review. International Journal of Automation and Computing, 2020, 17(2): 151-178
[45] Yang Y Y, Rashtchian C, Zhang H, Salakhutdinov R, Chaudhuri K. A closer look at accuracy vs robustness. Proc NeurIPS, 2020
[46] Yu L, Gao X S. Improve the robustness and accuracy of deep neural network with L2,∞ normalization. Accepted by Journal of Systems Science and Complexity, 2022. arXiv:2010.04912
[47] Yu L, Wang Y, Gao X S. Adversarial parameter attack on deep neural networks. arXiv:2203.10502, 2022
[48] Yu L, Gao X S. Robust and information-theoretically safe bias classifier against adversarial attacks. arXiv:2111.04404, 2021
[49] Zhang H, Yu Y, Jiao J, Xing E P, Ghaoui L E, Jordan M I. Theoretically principled trade-off between robustness and accuracy. Proc ICML, PMLR, 2019
[50] Zhou Y, Kantarcioglu M, Xi B. A survey of game theoretic approach for adversarial machine learning. WIREs Data Mining Knowl Discov, 2019, 9(3): e1259