Abstract
Hand gesture detection is a well-explored area of computer vision with applications in many forms of human-computer interaction. In this work, we propose a technique for simultaneous hand gesture classification, handedness detection, and hand keypoint localization using thermal data captured by an infrared camera. Our method uses a novel deep multi-task learning architecture consisting of shared encoder-decoder layers followed by three branches, one dedicated to each task. We performed extensive experimental validation of our model on an in-house dataset containing data from 24 users. The results show accuracy above 98 percent for gesture classification, handedness detection, and fingertip localization, and above 91 percent for wrist point localization.
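As a rough illustration of the "shared layers followed by three task branches" idea, the following minimal sketch runs one forward pass through a shared feature extractor and three heads. It is not the authors' model: a single dense layer stands in for the encoder-decoder, and the class name `MultiTaskSketch`, the input size, the number of gesture classes, and the number of keypoints are all hypothetical values chosen for the example.

```python
import math
import random


def dense(n_in, n_out, rng):
    """Return a randomly initialised weight matrix with n_out rows of n_in weights."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]


def matvec(weights, x):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def relu(x):
    return [max(0.0, v) for v in x]


def softmax(x):
    """Numerically stable softmax over a list of scores."""
    m = max(x)
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


class MultiTaskSketch:
    """Toy multi-task model: one shared layer, three task-specific heads.

    Assumption: the shared dense layer is only a stand-in for the shared
    encoder-decoder mentioned in the abstract; all sizes are illustrative.
    """

    def __init__(self, n_pixels, n_gestures, n_keypoints, hidden=16, seed=0):
        rng = random.Random(seed)
        self.shared = dense(n_pixels, hidden, rng)               # shared trunk
        self.gesture_head = dense(hidden, n_gestures, rng)       # branch 1
        self.handedness_head = dense(hidden, 2, rng)             # branch 2 (left/right)
        self.keypoint_head = dense(hidden, 2 * n_keypoints, rng) # branch 3 (x, y pairs)

    def forward(self, pixels):
        # Features are computed once and reused by every branch.
        features = relu(matvec(self.shared, pixels))
        gesture_probs = softmax(matvec(self.gesture_head, features))
        hand_probs = softmax(matvec(self.handedness_head, features))
        flat = matvec(self.keypoint_head, features)
        keypoints = [(flat[i], flat[i + 1]) for i in range(0, len(flat), 2)]
        return gesture_probs, hand_probs, keypoints


# Hypothetical sizes: a flattened 8x8 thermal frame, 10 gestures, 7 keypoints.
model = MultiTaskSketch(n_pixels=64, n_gestures=10, n_keypoints=7)
frame = [0.5] * 64
gesture_probs, hand_probs, keypoints = model.forward(frame)
```

The point of the sketch is that the three outputs share one feature computation, so a single pass over a thermal frame yields a gesture distribution, a left/right distribution, and a set of keypoint coordinates. How the paper weights or trains the three branches is not stated in the abstract and is not modelled here.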
URL
https://arxiv.org/abs/2303.01547