Audio Adversarial Examples: Attacks Using Vocal Masks

2021-02-04 05:21:10

Lynnette Ng, Kai Yuan Tay, Wei Han Chua, Lucerne Loke, Danqi Ye, Melissa Chua

arXiv_AI

arXiv_AI Adversarial Speech

Abstract

We construct audio adversarial examples on automatic Speech-To-Text (STT) systems. Given any audio waveform, we produce another by overlaying an audio vocal mask generated from the original audio. We apply our audio adversarial attack to five state-of-the-art (SOTA) STT systems: DeepSpeech, Julius, Kaldi, wav2letter@anywhere and CMUSphinx. In addition, we engaged human annotators to transcribe the adversarial audio. Our experiments show that these adversarial examples fool SOTA STT systems, yet humans are able to consistently pick out the speech. The feasibility of this attack introduces a new domain to study machine and human perception of speech.
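
The overlay step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the vocal mask is derived from the original audio, so `placeholder_mask` below is a purely hypothetical stand-in (a time-reversed copy of the input), and the function names, the `mask_gain` parameter, and the 16 kHz sample rate are all assumptions made for illustration. Only the sample-wise mixing of a mask onto a waveform reflects what the abstract states.

```python
import math


def make_tone(freq_hz, duration_s, sample_rate=16000, amplitude=0.5):
    """Generate a sine wave as floats in [-1, 1]; a stand-in for real speech audio."""
    n = int(duration_s * sample_rate)
    return [
        amplitude * math.sin(2 * math.pi * freq_hz * i / sample_rate)
        for i in range(n)
    ]


def placeholder_mask(waveform):
    """HYPOTHETICAL mask: a time-reversed copy of the input.

    The abstract only says the mask is generated from the original audio;
    this is not the authors' mask-generation method.
    """
    return waveform[::-1]


def overlay(waveform, mask, mask_gain=0.5):
    """Mix the mask onto the waveform sample by sample, clipping to [-1, 1]."""
    if len(waveform) != len(mask):
        raise ValueError("waveform and mask must have the same length")
    mixed = [x + mask_gain * m for x, m in zip(waveform, mask)]
    return [max(-1.0, min(1.0, s)) for s in mixed]


original = make_tone(440.0, 0.01)
adversarial = overlay(original, placeholder_mask(original))
```

In a real experiment the resulting waveform would be written to an audio file and passed to each STT system and to human annotators; that evaluation step is outside this sketch.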

URL

https://arxiv.org/abs/2102.02417

PDF

https://arxiv.org/pdf/2102.02417.pdf