Constrained Gradient Descent: A Powerful and Principled Evasion Attack Against Neural Networks

Weiran Lin*, Keane Lucas*, Lujo Bauer*, Michael K. Reiter*, Mahmood Sharif*

*Corresponding author for this work

Research output: Contribution to journal, Conference article, peer-review

Abstract

We propose new, more efficient targeted white-box attacks against deep neural networks. Our attacks better align with the attacker's goal: (1) tricking a model into assigning higher probability to the target class than to any other class, while (2) staying within an ε-distance of the attacked input. First, we demonstrate a loss function that explicitly encodes (1) and show that Auto-PGD finds more attacks with it. Second, we propose a new attack method, Constrained Gradient Descent (CGD), using a refinement of our loss function that captures both (1) and (2). CGD seeks to satisfy both attacker objectives (misclassification and bounded ℓp-norm) in a principled manner, as part of the optimization, instead of via ad hoc post-processing techniques such as projection or clipping. We show that CGD is more successful on CIFAR10 (0.9-4.2%) and ImageNet (8.6-13.6%) than state-of-the-art attacks while consuming less time (11.4-18.8%). Statistical tests confirm that our attack outperforms others against leading defenses on different datasets and values of ε.
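
The two attacker objectives named in the abstract can be illustrated with a small sketch. This is not the paper's loss function, whose exact form the abstract does not give. It is a generic margin-style check for objective (1) and an ℓ∞ constraint-violation measure for objective (2), written in plain Python. The function names `margin_loss`, `linf_violation`, and `attack_succeeded` are hypothetical, and the choice of the ℓ∞ norm is an assumption made for the example.

```python
def margin_loss(logits, target):
    """Objective (1), assumed margin form: negative only when the
    target class scores strictly higher than every other class."""
    best_other = max(z for i, z in enumerate(logits) if i != target)
    return best_other - logits[target]


def linf_violation(x_adv, x_orig, eps):
    """Objective (2), assumed L-infinity form: total amount by which
    coordinates of x_adv leave the eps-ball around x_orig (0 if inside)."""
    return sum(max(0.0, abs(a - o) - eps) for a, o in zip(x_adv, x_orig))


def attack_succeeded(logits, target, x_adv, x_orig, eps):
    """An attack counts as successful only if both objectives hold."""
    return (margin_loss(logits, target) < 0
            and linf_violation(x_adv, x_orig, eps) == 0)


# Hypothetical toy values: three classes, two input coordinates.
logits = [1.0, 3.0, 2.0]
print(margin_loss(logits, target=1))                  # target already wins
print(linf_violation([1.0, 0.5], [0.5, 0.5], 0.25))   # first coordinate is out of bounds
```

An optimizer that minimizes some combination of these two quantities would pursue both objectives during the search itself, which is the general idea the abstract contrasts with projecting or clipping after each step. How CGD actually combines them is specified in the paper, not here.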

Original language: English
Pages (from-to): 13405-13430
Number of pages: 26
Journal: Proceedings of Machine Learning Research
Volume: 162
State: Published - 2022
Event: 39th International Conference on Machine Learning, ICML 2022 - Baltimore, United States
Duration: 17 Jul 2022 to 23 Jul 2022

Funding

Funders (with funder number where given):
Blavatnik Family Foundation
Department of Defense
Maof prize for excellent young faculty
NSF
National Security Agency: H9823018D0008
National Science Foundation: 2113345, 2112562, 1801391
U.S. Department of Defense: FA8702-15-D-0002
