Improving Transferability of Generated Universal Adversarial Perturbations for Image Classification and Segmentation

Atiye Sadat Hashemi, Andreas Bär, Saeed Mozaffari, Tim Fingscheidt

June 2022

Abstract

Although deep neural networks (DNNs) are high-performance methods for various complex tasks, e.g., environment perception in automated vehicles (AVs), they are vulnerable to adversarial perturbations. Recent works have demonstrated the existence of universal adversarial perturbations (UAPs), which, when added to most images, destroy the output of the respective perception function. Existing attack methods often show a low success rate when attacking target models that differ from the one on which the attack was optimized. To address this weak transferability, we propose a novel learning criterion that combines a low-level feature loss, addressing the similarity of feature representations in the first layer of various model architectures, with a cross-entropy loss. Experimental results on the ImageNet and Cityscapes datasets show that our method effectively generates universal adversarial perturbations achieving state-of-the-art fooling rates across different models, tasks, and datasets. Due to their effectiveness, we propose the use of these generated UAPs in the robustness evaluation of DNN-based environment perception functions for AVs.

Type

Book section

Publication

In Fingscheidt, T., Gottschalk, H., Houben, S. (eds) Deep Neural Networks and Data for Automated Driving

Co-Author

Andreas Bär

PhD Student @