Abstract
Acoustic sensing shows great potential in applications such as health monitoring, gesture interfaces, and imaging by leveraging the speakers and microphones on smart devices. However, ongoing research and development in acoustic sensing often overlooks one problem: when the same speaker is used concurrently for sensing and for traditional applications (such as playing music), the two can interfere with each other, which makes sensing impractical in the real world. Strong ultrasonic sensing signals mixed with music can overload the speaker's mixer. Current solutions to this overload are clipping and down-scaling, both of which degrade music playback quality as well as sensing range and accuracy. To address this challenge, we propose CoPlay, a deep-learning-based optimization algorithm that cognitively adapts the sensing signal. It can 1) maximize the sensing signal magnitude within the available bandwidth left by the concurrent music, to optimize sensing range and accuracy, and 2) minimize any consequential frequency distortion that can affect music playback. In this work, we design a deep learning model and test it on common types of sensing signals (sine wave or Frequency Modulated Continuous Wave, FMCW) as inputs, with various concurrent music and speech to which the model is agnostic. First, we evaluate model performance to show the quality of the generated signals. Then we conduct field studies of downstream acoustic sensing tasks in the real world. A study with 12 users showed that respiration monitoring and gesture recognition using our adapted signal achieve accuracy similar to that of no-concurrent-music scenarios, while clipping and down-scaling yield worse accuracy. A qualitative study also indicates that music playback quality is not degraded, unlike with traditional clipping or down-scaling.
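The overload problem and the two baseline remedies named in the abstract can be illustrated with a small sketch. This is not the CoPlay model. It only mixes a stand-in "music" tone with a sensing tone, shows that the sum exceeds full scale, and compares clipping with down-scaling. The sample rate, frequencies, amplitudes, and the helper names `tone` and `tone_amplitude` are assumptions chosen for illustration.

```python
import math

FS = 48_000        # sample rate in Hz (assumed)
N = 4_800          # 0.1 s of audio
F_MUSIC = 440      # stand-in for music: one audible tone (assumed)
F_SENSE = 18_000   # near-ultrasonic sensing tone (assumed)


def tone(freq, amp):
    """Return N samples of a sine wave with the given frequency and amplitude."""
    return [amp * math.sin(2 * math.pi * freq * n / FS) for n in range(N)]


def tone_amplitude(signal, freq):
    """Estimate the amplitude of one frequency component by projecting
    the signal onto a cosine and a sine at that frequency."""
    re = sum(s * math.cos(2 * math.pi * freq * n / FS) for n, s in enumerate(signal))
    im = sum(s * math.sin(2 * math.pi * freq * n / FS) for n, s in enumerate(signal))
    return 2 * math.hypot(re, im) / len(signal)


music = tone(F_MUSIC, 0.8)
sensing = tone(F_SENSE, 0.8)

# Naive mixing: the sum exceeds the full-scale range [-1, 1].
mixed = [m + s for m, s in zip(music, sensing)]
peak = max(abs(x) for x in mixed)

# Baseline 1: clipping. Samples stay in range, but the waveform is distorted.
clipped = [max(-1.0, min(1.0, x)) for x in mixed]
clip_error_rms = math.sqrt(sum((c - x) ** 2 for c, x in zip(clipped, mixed)) / N)

# Baseline 2: down-scaling. No distortion, but both music and sensing get quieter.
scaled = [x / peak for x in mixed]

print("peak of naive mix:", peak)
print("RMS error introduced by clipping:", clip_error_rms)
print("sensing amplitude before scaling:", tone_amplitude(mixed, F_SENSE))
print("sensing amplitude after scaling:", tone_amplitude(scaled, F_SENSE))
print("music amplitude after scaling:", tone_amplitude(scaled, F_MUSIC))
```

In this toy setting, clipping changes the waveform itself (a nonzero error against the intended mix), while down-scaling preserves the waveform but shrinks both the sensing tone and the music by the same factor. These are the two trade-offs the abstract attributes to the existing solutions.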
URL
https://arxiv.org/abs/2403.10796