Fingertip Detection and Tracking for Recognition of Air-Writing in Videos

2018-09-09 18:10:59
Sohom Mukherjee, Arif Ahmed, Debi Prosad Dogra, Samarjit Kar, Partha Pratim Roy

Abstract

Air-writing is the process of writing characters or words in free space using finger or hand movements, without the aid of any hand-held device. In this work, we address the problem of mid-air finger writing using web-cam video as input. In spite of recent advances in object detection and tracking, accurate and robust detection and tracking of the fingertip remains a challenging task, primarily due to the small size of the fingertip. Moreover, the initialization and termination of mid-air finger writing are also challenging due to the absence of any standard delimiting criterion. To solve these problems, we propose a new writing hand pose detection algorithm for initialization of air-writing. It uses the Faster R-CNN framework for accurate hand detection, followed by hand segmentation, and finally counts the number of raised fingers based on geometrical properties of the hand. Further, we propose a robust fingertip detection and tracking approach using a new signature function called distance-weighted curvature entropy. Finally, a fingertip velocity-based termination criterion is used as a delimiter to mark the completion of the air-writing gesture. Experiments show the superiority of the proposed fingertip detection and tracking algorithm over state-of-the-art approaches, giving a mean precision of 73.1% while achieving real-time performance at 18.5 fps, a condition which is of vital importance to air-writing. Character recognition experiments give a mean accuracy of 96.11% using the proposed air-writing system, a result which is comparable to that of existing handwritten character recognition systems.
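
The abstract mentions a fingertip velocity-based termination criterion but gives no details. The following is only a minimal sketch of how such a delimiter might work, not the authors' method: it assumes the tracker yields one (x, y) fingertip position per frame, and it treats the gesture as finished once the fingertip speed stays below a threshold for several consecutive frames. The function name, `speed_threshold`, and `hold_frames` are hypothetical and chosen for illustration.

```python
import math


def find_termination_frame(points, fps, speed_threshold=20.0, hold_frames=5):
    """Return the index of the frame that completes the first run of
    `hold_frames` consecutive slow steps, or None if no such run exists.

    points: list of (x, y) fingertip positions in pixels, one per frame.
    fps: frame rate of the video, used to convert to pixels per second.
    speed_threshold, hold_frames: illustrative values, not from the paper.
    """
    slow_run = 0
    for i in range(1, len(points)):
        (x0, y0), (x1, y1) = points[i - 1], points[i]
        # Displacement between consecutive frames, scaled to pixels/second.
        speed = math.hypot(x1 - x0, y1 - y0) * fps
        if speed < speed_threshold:
            slow_run += 1
            if slow_run >= hold_frames:
                return i
        else:
            # Any fast step resets the count of consecutive slow steps.
            slow_run = 0
    return None


# A stroke that moves 10 px per frame and then comes to rest.
trajectory = [(0, 0), (10, 0), (20, 0), (30, 0), (30, 0), (30, 0), (30, 0)]
end_frame = find_termination_frame(trajectory, fps=18.5, hold_frames=3)
```

Requiring several consecutive slow frames, rather than a single one, is one way to avoid ending the gesture on the brief pauses that occur when a stroke changes direction.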

URL

https://arxiv.org/abs/1809.03016

PDF

https://arxiv.org/pdf/1809.03016.pdf

