Fragile Neural Network Watermarking with Trigger Image Set

  • Renjie Zhu
  • Ping Wei
  • Sheng Li
  • Zhaoxia Yin
  • Xinpeng Zhang*
  • Zhenxing Qian

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

28 Scopus citations

Abstract

Recent studies show that deep neural networks are vulnerable to data poisoning and backdoor attacks, both of which involve malicious fine-tuning of deep models. In this paper, we propose a black-box fragile neural network watermarking method for detecting malicious fine-tuning. The watermarking process consists of three steps. First, a set of trigger images is constructed from a user-specific secret key. Then, a well-trained DNN model is fine-tuned to classify both the normal images in the training set and the trigger images in the trigger set, using a two-stage alternate training procedure. The fragile watermark is embedded in this way while preserving the model's original classification ability. The watermarked model is sensitive to malicious fine-tuning and will produce unstable classification results on the trigger images. Finally, the integrity of the network model can be verified by analyzing the output of the watermarked model with the trigger image set as input. Experiments on three benchmark datasets demonstrate that the proposed watermarking method is effective in detecting malicious fine-tuning.
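The verification pipeline described in the abstract can be sketched in outline: derive a deterministic trigger set from the secret key, then check whether the model's predictions on those triggers still match the labels assigned at embedding time. The sketch below is a minimal illustration under assumed details (SHA-256 key-to-seed derivation, a 10-class task, uniform-noise trigger images); the paper's actual trigger construction and training procedure are not reproduced here.

```python
import hashlib
import numpy as np

def make_trigger_set(secret_key: str, n_triggers: int = 20, shape=(32, 32, 3)):
    """Derive a deterministic set of pseudo-random trigger images and
    assigned labels from a user-specific secret key (illustrative only)."""
    # Hash the key into a reproducible 64-bit seed.
    seed = int.from_bytes(hashlib.sha256(secret_key.encode()).digest()[:8], "big")
    rng = np.random.default_rng(seed)
    images = rng.random((n_triggers, *shape), dtype=np.float32)
    labels = rng.integers(0, 10, size=n_triggers)  # assumed 10-class task
    return images, labels

def verify_integrity(model_predict, images, expected_labels) -> bool:
    """A fragile watermark tolerates no deviation: flag tampering if any
    trigger prediction differs from its embedded label."""
    preds = np.asarray([model_predict(img) for img in images])
    return bool(np.all(preds == np.asarray(expected_labels)))
```

Because the trigger set is regenerated from the key at verification time, only the key (not the images themselves) needs to be stored; any fine-tuning that perturbs the model's behavior on even one trigger image breaks the check.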

Original language: English
Title of host publication: Knowledge Science, Engineering and Management - 14th International Conference, KSEM 2021, Proceedings
Editors: Han Qiu, Cheng Zhang, Zongming Fei, Meikang Qiu, Sun-Yuan Kung
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 280-293
Number of pages: 14
ISBN (Print): 9783030821357
DOIs
State: Published - 2021
Externally published: Yes
Event: 14th International Conference on Knowledge Science, Engineering and Management, KSEM 2021 - Tokyo, Japan
Duration: 14 Aug 2021 - 16 Aug 2021

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12815 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 14th International Conference on Knowledge Science, Engineering and Management, KSEM 2021
Country/Territory: Japan
City: Tokyo
Period: 14/08/21 - 16/08/21

Keywords

  • Backdoor defense
  • Data poisoning
  • Fragile watermarking
  • Malicious tuning detection
  • Model integrity protection
  • Neural network
