A PENALIZED MODIFIED HUBER REGULARIZATION TO IMPROVE ADVERSARIAL ROBUSTNESS
Modeste Atsague, Ashutosh Nirala, Olukorede Fakorede, Jin Tian
Adversarial training (AT) is a learning procedure that trains a deep neural network on adversarial examples to improve robustness. AT and its variants are widely considered the most empirically successful defenses against adversarial examples. Along the same line, this work proposes a new training objective, PMHR-AT (Penalized Modified Huber Regularization for Adversarial Training), to improve adversarial robustness. PMHR-AT minimizes both the natural and adversarial risks and introduces a modified Huber loss between the natural and adversarial logits as a regularizer, with the regularization strength adjusted according to the similarity between the predicted natural and adversarial class probabilities. Experimental results show that the proposed method outperforms existing methods under strong attacks and offers a better trade-off between natural accuracy and adversarial robustness.
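For concreteness, the following is a minimal PyTorch sketch of how such an objective could be assembled from the ingredients named in the abstract. The particular Huber form, the choice of cosine similarity as the probability-similarity measure, the weighting scheme, and the hyperparameters beta and delta are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def pmhr_at_loss(model, x_nat, x_adv, y, beta=6.0, delta=1.0):
    """Illustrative PMHR-AT-style objective (sketch, not the paper's exact loss).

    Combines natural and adversarial classification risks with a Huber-type
    penalty on the natural/adversarial logit discrepancy, weighted by how
    dissimilar the two predicted class distributions are.
    """
    logits_nat = model(x_nat)
    logits_adv = model(x_adv)

    # Natural and adversarial risks (cross-entropy on both inputs).
    risk = F.cross_entropy(logits_nat, y) + F.cross_entropy(logits_adv, y)

    # Huber-type penalty on the logit difference (assumed elementwise form).
    diff = logits_adv - logits_nat
    huber = torch.where(diff.abs() <= delta,
                        0.5 * diff.pow(2),
                        delta * (diff.abs() - 0.5 * delta)).mean()

    # Scale the penalty by the dissimilarity of the predicted class
    # probabilities; 1 - cosine similarity is an assumed choice here.
    p_nat = F.softmax(logits_nat, dim=1)
    p_adv = F.softmax(logits_adv, dim=1)
    weight = 1.0 - F.cosine_similarity(p_nat, p_adv, dim=1).mean()

    return risk + beta * weight * huber
```

In a standard AT loop, `x_adv` would be generated from `x_nat` by an inner-maximization step (e.g., PGD) before calling `pmhr_at_loss`, and the returned scalar would be backpropagated to update the model parameters.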