Persistent Watermark For Image Classification Neural Networks By Penetrating The Autoencoder
Fang-Qi Li, Shi-Lin Wang
SPS
Length: 00:12:33
Deep neural networks for image processing, especially image classification, have become ubiquitous. To protect them as intellectual property and to standardize the commercialization of their services, watermarking schemes have been proposed to authenticate the authorship of models. Many black-box watermarking schemes insert a backdoor into the neural network by poisoning the training dataset. Their performance declines if an adversary who has stolen the model adds a noise reducer, in particular an autoencoder, to ruin the backdoor. To cope with this kind of piracy, we propose an enhanced watermarking scheme that uses triggers that penetrate the adversary's autoencoder. The penetrative triggers are generated from a collection of shadow models that approximate the adversary's autoencoder, which is assumed to be hidden from the genuine host of the model. The proposed scheme is shown to resist filtering by autoencoders and to significantly increase the robustness of copyright authentication.
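The idea of deriving a trigger from an ensemble of shadow models can be illustrated with a toy sketch. This is not the authors' procedure, which the abstract does not specify. It assumes, purely for illustration, that each shadow autoencoder is a linear orthogonal projection onto a random low-dimensional subspace, and it picks the unit-norm trigger that retains the most energy on average across the ensemble. The names `make_shadow_autoencoders`, `survival`, and `penetrative_trigger` are hypothetical.

```python
import numpy as np


def make_shadow_autoencoders(dim, rank, count, rng):
    """Build toy linear 'autoencoders' (an assumption of this sketch).

    Each one is an orthogonal projection onto a random rank-`rank` subspace,
    standing in for a denoiser that keeps only part of the input.
    """
    shadows = []
    for _ in range(count):
        # Reduced QR gives an orthonormal basis q of shape (dim, rank).
        q, _ = np.linalg.qr(rng.standard_normal((dim, rank)))
        shadows.append(q @ q.T)
    return shadows


def survival(trigger, autoencoders):
    """Mean fraction of the trigger's energy retained after filtering."""
    energy = float(trigger @ trigger)
    kept = np.mean([np.sum((a @ trigger) ** 2) for a in autoencoders])
    return float(kept / energy)


def penetrative_trigger(shadows):
    """Unit-norm trigger maximizing mean retained energy over the shadows.

    For linear filters this is the top eigenvector of sum_k A_k^T A_k.
    """
    gram = sum(a.T @ a for a in shadows)
    _, eigvecs = np.linalg.eigh(gram)  # eigenvalues in ascending order
    return eigvecs[:, -1]


rng = np.random.default_rng(0)
shadows = make_shadow_autoencoders(dim=64, rank=16, count=8, rng=rng)

trigger = penetrative_trigger(shadows)
baseline = rng.standard_normal(64)  # an arbitrary, non-optimized trigger

print("optimized trigger survival:", survival(trigger, shadows))
print("random trigger survival:   ", survival(baseline, shadows))
```

In this linear setting the optimized trigger retains at least as much energy as any other vector on the shadow ensemble. Whether that carries over to a hidden autoencoder depends on how well the shadows approximate it, and real autoencoders are nonlinear, so a practical version would presumably need iterative optimization rather than an eigendecomposition.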