Abstract
Fake audio detection is expected to become an important research area for smart speakers such as Google Home and Amazon Echo, and for the chatbots developed for these platforms. This paper demonstrates the vulnerability of voice-driven interfaces to replay attacks and proposes a countermeasure to detect such attacks on these platforms. It presents a novel framework that models replay attack distortion and then uses a non-learning-based method for replay attack detection on smart speakers. The replay attack distortion is modeled as a higher-order nonlinearity in the replayed audio. Higher-order spectral analysis (HOSA) is used to capture characteristic distortions in the replayed audio. The effectiveness of the proposed countermeasure is evaluated on original speech as well as the corresponding replayed recordings. The replay attack recordings were successfully injected into the Google Home device via Amazon Alexa using the drop-in conferencing feature.
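To make the idea of detecting a higher-order nonlinearity with HOSA concrete, the following is a minimal sketch and not the paper's actual method. It assumes the squared bicoherence as the HOSA statistic (the abstract does not say which features are used) and assumes a hypothetical memoryless quadratic distortion, `x + distortion * x**2`, as a stand-in for the replay channel. The function names, tone frequencies, and noise level are all illustrative choices.

```python
import numpy as np

def bicoherence(segments):
    """Squared bicoherence estimated from equal-length segments (one per row)."""
    segments = np.asarray(segments, dtype=float)
    n_seg, n = segments.shape
    window = np.hanning(n)
    # Remove each segment's mean, taper it, and take the one-sided FFT.
    X = np.fft.rfft((segments - segments.mean(axis=1, keepdims=True)) * window, axis=1)
    half = n // 4                              # keeps f1 + f2 inside the rfft range
    idx = np.arange(half)
    sum_idx = idx[:, None] + idx[None, :]      # bin index of f1 + f2
    prod = X[:, idx, None] * X[:, None, idx]   # X(f1) * X(f2) for every segment
    X_sum = X[:, sum_idx]                      # X(f1 + f2) for every segment
    num = np.abs(np.mean(prod * np.conj(X_sum), axis=0)) ** 2
    den = np.mean(np.abs(prod) ** 2, axis=0) * np.mean(np.abs(X_sum) ** 2, axis=0)
    return num / (den + 1e-12)                 # values lie in [0, 1]

def make_segments(n_seg=64, n=256, k1=20, k2=33, distortion=0.0, seed=0):
    """Two tones with random per-segment phases, optionally distorted (assumed model)."""
    rng = np.random.default_rng(seed)
    t = np.arange(n)
    phi1 = rng.uniform(0, 2 * np.pi, size=(n_seg, 1))
    phi2 = rng.uniform(0, 2 * np.pi, size=(n_seg, 1))
    x = np.cos(2 * np.pi * k1 * t / n + phi1) + np.cos(2 * np.pi * k2 * t / n + phi2)
    y = x + distortion * x ** 2                # hypothetical quadratic nonlinearity
    return y + 0.1 * rng.standard_normal((n_seg, n))

# Compare the bicoherence at the bin pair (k1, k2) for clean and distorted signals.
clean = bicoherence(make_segments(distortion=0.0))[20, 33]
distorted = bicoherence(make_segments(distortion=0.5))[20, 33]
print(f"clean: {clean:.3f}  distorted: {distorted:.3f}")
```

In this toy setting the quadratic term creates a component at the sum frequency whose phase is locked to the two tones, so the bicoherence at that bin pair should be high for the distorted signal and low for the undistorted one. A non-learning detector could then threshold such a statistic, although the threshold and the exact features would have to come from the paper itself.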
URL
https://arxiv.org/abs/1904.06591