Abstract
Millimeter-wave radar has recently gained traction as a promising modality for pervasive, privacy-preserving gesture recognition. However, the lack of rich, fine-grained radar datasets hinders the development of deep learning models that generalize across user postures (e.g., standing, sitting), positions, and scenes. To remedy this, we propose a software pipeline that exploits abundant 2D videos to generate realistic radar data; the main challenge is simulating the diverse, fine-grained reflection properties of user gestures. To this end, we design G3R with three key components: (i) a gesture reflection point generator that expands the arm's skeleton points to form human reflection points; (ii) a signal simulation model that simulates the multipath reflection and attenuation of radar signals to output the human intensity map; and (iii) an encoder-decoder model that combines a sampling module and a fitting module to address the differences in the number and distribution of points between generated and real-world radar data, producing realistic radar data. We implement and evaluate G3R using 2D videos from public data sources and self-collected real-world radar data, demonstrating its superiority over other state-of-the-art approaches for gesture recognition.
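As a rough illustration of components (i) and (ii), the sketch below expands a few arm joints into a denser set of reflection points and assigns each point a distance-based intensity. It is not the G3R implementation: the function names, the uniform jitter, the `radius` and `points_per_bone` parameters, and the free-space 1/r^4 falloff (taken from the standard two-way radar equation) are all assumptions made for illustration. Multipath reflection and the encoder-decoder stage are omitted.

```python
import numpy as np


def expand_skeleton(joints, points_per_bone=8, radius=0.04, seed=0):
    """Hypothetical stand-in for a reflection point generator.

    joints: (J, 3) array of consecutive arm joints in meters,
            e.g. shoulder, elbow, wrist.
    Returns ((J - 1) * points_per_bone, 3) jittered points.
    """
    rng = np.random.default_rng(seed)
    joints = np.asarray(joints, dtype=float)
    # Interpolation weights along each bone, shape (P, 1) for broadcasting.
    t = np.linspace(0.0, 1.0, points_per_bone)[:, None]
    segments = [a + t * (b - a) for a, b in zip(joints[:-1], joints[1:])]
    points = np.concatenate(segments, axis=0)
    # Assumption: uniform jitter in a small cube approximates limb thickness.
    jitter = rng.uniform(-radius, radius, size=points.shape)
    return points + jitter


def reflection_intensity(points, radar_pos=(0.0, 0.0, 0.0)):
    """Relative received power per point, assuming free-space 1/r^4 decay."""
    points = np.asarray(points, dtype=float)
    r = np.linalg.norm(points - np.asarray(radar_pos, dtype=float), axis=1)
    # Clamp the range to avoid division by zero for points at the radar.
    return 1.0 / np.maximum(r, 1e-6) ** 4


# Example: shoulder, elbow, wrist roughly 1 m in front of the radar.
arm = [[0.0, 1.0, 0.0], [0.3, 1.0, 0.0], [0.6, 1.0, 0.0]]
cloud = expand_skeleton(arm)
intensity = reflection_intensity(cloud)
```

Each row of `cloud` paired with its entry in `intensity` gives a simple (x, y, z, intensity) point, which is the kind of input a later sampling and fitting stage would have to reconcile with real radar point clouds.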
URL
https://arxiv.org/abs/2404.14934