Huili Chen (University of California, San Diego), Bita Darvish Rouhani (Microsoft Research), and Farinaz Koushanfar (University of California, San Diego)
Automatic Speech Recognition (ASR) systems are widely deployed in various applications due to their superior performance. However, obtaining a highly accurate ASR model is non-trivial, since it requires a massive amount of proprietary training data and enormous computational resources. As such, pre-trained ASR models should be regarded as the intellectual property (IP) of the model designer and protected against copyright infringement. In this paper, we propose SpecMark, the first spectral watermarking framework that seamlessly embeds a watermark (WM) in the spectrum of the ASR model for ownership proof. SpecMark identifies the significant frequency components of the model parameters and encodes the owner’s WM in the corresponding spectral region before the model is shared with end users. The model builder can later extract the spectral WM to verify ownership of the marked ASR system. We evaluate SpecMark’s performance using the DeepSpeech model with three different speech datasets. Empirical results corroborate that SpecMark incurs negligible overhead and preserves the recognition accuracy of the original system. Furthermore, SpecMark withstands diverse model modifications, including parameter pruning and transfer learning.
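To make the idea of encoding a WM in the spectrum of model parameters concrete, the following toy sketch may help. It is not SpecMark's actual algorithm: the abstract does not state which transform, selection rule, or encoding is used, so everything here is an assumption chosen for illustration. The sketch assumes an orthonormal discrete cosine transform (DCT) over a flat weight vector, treats the largest-magnitude coefficients as the "significant" components, and encodes each WM bit as the parity of a quantized coefficient (a quantization-index-modulation style scheme). The names `embed`, `extract`, and `step` are hypothetical.

```python
import math
import random


def dct(x):
    """Orthonormal DCT-II of a list of floats (O(n^2), for illustration only)."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def idct(c):
    """Inverse of dct() above (orthonormal DCT-III)."""
    n = len(c)
    out = []
    for i in range(n):
        s = c[0] * math.sqrt(1.0 / n)
        s += sum(
            c[k] * math.sqrt(2.0 / n) * math.cos(math.pi * (i + 0.5) * k / n)
            for k in range(1, n)
        )
        out.append(s)
    return out


def embed(weights, bits, step=0.01):
    """Embed WM bits into the spectrum of `weights`.

    Returns the marked weights and the coefficient indices, which the
    owner would keep secret as the extraction key (an assumption here).
    """
    coeffs = dct(weights)
    # Assumption: "significant" means largest magnitude in the DCT domain.
    idx = sorted(range(len(coeffs)), key=lambda k: -abs(coeffs[k]))[: len(bits)]
    for k, b in zip(idx, bits):
        q = math.floor(coeffs[k] / step)
        if q % 2 != b:
            q += 1  # move to the neighbouring cell with the right parity
        # Place the coefficient at the cell centre for maximum noise margin.
        coeffs[k] = (q + 0.5) * step
    return idct(coeffs), idx


def extract(weights, idx, step=0.01):
    """Read the WM bits back as the parity of the quantized coefficients."""
    coeffs = dct(weights)
    return [math.floor(coeffs[k] / step) % 2 for k in idx]
```

In this sketch, `step` trades fidelity against robustness: each marked coefficient moves by at most 1.5 × `step`, and a bit survives as long as later modifications shift that coefficient by less than half a step. A real framework would additionally have to verify that recognition accuracy is preserved, which this toy example does not attempt.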