Toward Stealthy Backdoor Attacks Against Speech Recognition via Elements of Sound

Hanbo Cai, Pengcheng Zhang*, Hai Dong, Yan Xiao, Stefanos Koffas, Yiming Li

*Corresponding author for this work

Research output: Contribution to journalArticleScientificpeer-review

Abstract

Deep neural networks (DNNs) have been widely and successfully adopted and deployed in various applications of speech recognition. Recently, a few works revealed that these models are vulnerable to backdoor attacks, where the adversaries can implant malicious prediction behaviors into victim models by poisoning their training process. In this paper, we revisit poison-only backdoor attacks against speech recognition. We reveal that existing methods are not stealthy since their trigger patterns are perceptible to humans or machine detection. This limitation is mostly because their trigger patterns are simple noises or separable and distinctive clips. Motivated by these findings, we propose to exploit elements of sound ( e.g ., pitch and timbre) to design more stealthy yet effective poison-only backdoor attacks. Specifically, we insert a short-duration high-pitched signal as the trigger and increase the pitch of remaining audio clips to ‘mask’ it for designing stealthy pitch-based triggers. We manipulate timbre features of victim audio to design the stealthy timbre-based attack and design a voiceprint selection module to facilitate the multi-backdoor attack. Our attacks can generate more ‘natural’ poisoned samples and therefore are more stealthy. Extensive experiments are conducted on benchmark datasets, which verify the effectiveness of our attacks under different settings ( e.g ., all-to-one, all-to-all, clean-label, physical, and multi-backdoor settings) and their stealthiness. Our methods achieve attack success rates of over 95% in most cases and are nearly undetectable. The code for reproducing main experiments are available at https://github.com/HanboCai/BadSpeech_SoE .
Original languageEnglish
Pages (from-to)5852-5866
Number of pages15
JournalIEEE Transactions on Information Forensics and Security
Volume19
DOIs
Publication statusPublished - 2024

Bibliographical note

Green Open Access added to TU Delft Institutional Repository 'You share, we take care!' - Taverne project https://www.openaccess.nl/en/you-share-we-take-care
Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.

Keywords

  • Backdoor attack
  • backdoor learning
  • speech recognition
  • AI security
  • trustworthy ML

Fingerprint

Dive into the research topics of 'Toward Stealthy Backdoor Attacks Against Speech Recognition via Elements of Sound'. Together they form a unique fingerprint.

Cite this