
Tracking the Leaker: An Encodable Watermarking Method for Dataset Intellectual Property Protection

  • Yifan Shang
  • Mingfu Xue*
  • Leo Yu Zhang
  • Yushu Zhang
  • Weiqiang Liu

*Corresponding author for this work

  • Nanjing University of Aeronautics and Astronautics
  • Griffith University Queensland

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Many enterprises now provide machine learning cloud services. However, a service provider may use user-uploaded data to retrain models without authorization, or illicitly collect user data to develop commercial models. This study introduces a traceable dataset watermarking technique for assessing the trustworthiness of third-party providers of machine learning cloud services. If data is leaked, the leak can be traced back to the third party suspected of causing it. Specifically, we propose a method that uses the clean-label backdoor attack framework to infer whether a third-party model was trained on user data. A watermark, associated with the encoding and designed as a trigger, is injected into the dataset through a trained autoencoder. Experimental evaluation on three datasets demonstrates the effectiveness of the proposed method, which achieves over 93% accuracy on average under normal conditions. A series of pruning and fine-tuning attacks was carried out against the method; the results indicate that these attacks have minimal impact, confirming the method's robustness.
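The idea of an encodable, clean-label trigger can be illustrated with a small sketch. It is not the authors' implementation. The abstract says the watermark is injected by a trained autoencoder, whereas the sketch below substitutes a fixed additive bit pattern so that it stays self-contained. Every name here (`id_to_bits`, `embed`, `watermark_dataset`, `trace_leaker`), the 8-bit code width, and the perturbation strength are hypothetical choices made for illustration only.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Sample = Tuple[List[float], int]  # (flattened pixels in [0, 1], class label)


def id_to_bits(provider_id: int, n_bits: int = 8) -> List[int]:
    """Encode a provider ID as a fixed-width bit list, most significant bit first."""
    if not 0 <= provider_id < 2 ** n_bits:
        raise ValueError("provider_id does not fit in n_bits")
    return [(provider_id >> (n_bits - 1 - i)) & 1 for i in range(n_bits)]


def bits_to_id(bits: Sequence[int]) -> int:
    """Inverse of id_to_bits."""
    value = 0
    for b in bits:
        value = (value << 1) | b
    return value


def embed(pixels: Sequence[float], bits: Sequence[int], strength: float = 0.05) -> List[float]:
    """Stand-in for the autoencoder (an assumption): shift the first len(bits)
    pixels up for a 1 bit and down for a 0 bit, clipping to [0, 1]."""
    out = list(pixels)
    for i, b in enumerate(bits):
        delta = strength if b else -strength
        out[i] = min(1.0, max(0.0, out[i] + delta))
    return out


def watermark_dataset(data: Sequence[Sample], provider_id: int, target_class: int,
                      n_bits: int = 8, strength: float = 0.05) -> List[Sample]:
    """Clean-label embedding: only target-class samples are perturbed,
    and no label is ever changed."""
    bits = id_to_bits(provider_id, n_bits)
    return [
        (embed(x, bits, strength), y) if y == target_class else (list(x), y)
        for x, y in data
    ]


def trace_leaker(model: Callable[[List[float]], int], probes: Sequence[Sequence[float]],
                 candidate_ids: Sequence[int], target_class: int,
                 n_bits: int = 8, strength: float = 0.05) -> Tuple[int, Dict[int, float]]:
    """Stamp each candidate's trigger on non-target probes and return the ID
    whose trigger most often flips the model's prediction to target_class."""
    rates: Dict[int, float] = {}
    for pid in candidate_ids:
        bits = id_to_bits(pid, n_bits)
        hits = sum(1 for x in probes if model(embed(x, bits, strength)) == target_class)
        rates[pid] = hits / len(probes)
    return max(rates, key=rates.get), rates
```

In this sketch each provider receives a copy of the dataset watermarked with its own ID, and `model` is any callable that maps a sample to a predicted label. A model trained on one provider's copy would be expected to respond most strongly to that provider's trigger. In practice the decision would need a threshold or a statistical test rather than a plain maximum.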

Original language: English
Title of host publication: Proceedings of ACM Turing Award Celebration Conference - CHINA 2024, TURC 2024
Publisher: Association for Computing Machinery
Pages: 114-119
Number of pages: 6
ISBN (Electronic): 9798400710117
State: Published - 5 Jul 2024
Event: 2024 ACM Turing Award Celebration Conference China, TURC 2024 - Changsha, China
Duration: 5 Jul 2024 to 7 Jul 2024

Publication series

Name: ACM International Conference Proceeding Series

Conference

Conference: 2024 ACM Turing Award Celebration Conference China, TURC 2024
Country/Territory: China
City: Changsha
Period: 5/07/24 to 7/07/24

Keywords

  • Backdoor
  • Data Security
  • Dataset Watermarking
  • Deep Neural Networks
  • Intellectual Property Protection
