A Neuro-Inspired Autoencoding Defense Against Adversarial Attacks
Can Bakiskan, Metehan Cekic, Ahmet Sezer, Upamanyu Madhow
SPS
Deep Neural Networks (DNNs) are vulnerable to adversarial attacks: carefully constructed perturbations to an image can seriously impair classification accuracy while remaining imperceptible to humans. The most effective current defense is to train the network on adversarially perturbed examples. In this paper, we investigate a radically different, neuro-inspired defense mechanism that aims to reject adversarial perturbations before they reach a classifier DNN: an encoder with characteristics commonly observed in biological vision, followed by a decoder that restores the image dimensions, so the cascade can be prepended to standard CNN architectures. Unlike adversarial training, all training is based on clean images. Our experiments on CIFAR-10 and a subset of ImageNet show performance competitive with state-of-the-art adversarial training, and point to the promise of bottom-up neuro-inspired techniques for the design of robust neural networks.
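The abstract describes a purification cascade: an encoder, a decoder that restores the input dimensions, and a standard classifier downstream. The following is a minimal NumPy sketch of that wiring only; the linear weights, ReLU nonlinearity, and all dimensions here are hypothetical stand-ins and do not reflect the paper's actual neuro-inspired encoder design or training procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (hypothetical; the paper works on CIFAR-10 / ImageNet images).
D_IN, D_CODE, N_CLASSES = 32, 8, 10

# Hypothetical untrained weights standing in for the neuro-inspired encoder,
# the dimension-restoring decoder, and a standard downstream classifier.
W_enc = rng.standard_normal((D_IN, D_CODE)) * 0.1
W_dec = rng.standard_normal((D_CODE, D_IN)) * 0.1
W_clf = rng.standard_normal((D_IN, N_CLASSES)) * 0.1

def purify(x):
    """Autoencode the input: encode, then decode back to the input dimensions."""
    code = np.maximum(x @ W_enc, 0.0)  # encoder with a ReLU nonlinearity (assumed)
    return code @ W_dec                # decoder restores the original shape

def classify(x):
    """Cascade: the purified input feeds a standard (here, linear) classifier."""
    logits = purify(x) @ W_clf
    return int(np.argmax(logits))

x = rng.standard_normal(D_IN)
assert purify(x).shape == x.shape      # decoder output matches input dimensions
label = classify(x)
```

Because the decoder restores the input dimensions, the purifier can be dropped in front of any existing classifier without modifying it, which is the architectural point the abstract makes.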