
Visual motion analysis of the player's finger

2023-02-24 10:14:13
Marco Costanzo

Abstract

This work addresses the extraction, from a video sequence, of the motion of a keyboard player's fingers at each of their three joints. The problem is relevant for several reasons. In particular, the extracted finger movements can be used to compute keystroke efficiency and individual joint contributions, as shown by Werner Goebl and Caroline Palmer in the paper 'Temporal Control and Hand Movement Efficiency in Skilled Music Performance'. These measures are directly related to precision in timing and force measures. A very good approach to the hand gesture recognition problem is presented in the paper 'Real-Time Hand Gesture Recognition Using Finger Segmentation'. Detecting which keys are pressed on a keyboard can be a complex task, because shadows can degrade the quality of the result and may cause keys that are not pressed to be detected as pressed. Many of the existing approaches are based on frame subtraction, which detects the movement of a key caused by pressing it. Detecting the pressed keys could be useful for automatically evaluating a pianist's performance or for automatically transcribing the melody being played into sheet music.
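
The frame-subtraction idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's method: the function names, the per-key bounding boxes, and both thresholds (`pixel_threshold`, `area_threshold`) are assumptions chosen for illustration, and frames are represented as plain nested lists of grayscale intensities rather than real video frames.

```python
from typing import List, Sequence, Tuple

# A grayscale frame: rows of pixel intensities in the range 0..255.
Frame = Sequence[Sequence[int]]
# A key region as (top, left, bottom, right), bottom/right exclusive (hypothetical layout).
Box = Tuple[int, int, int, int]


def changed_fraction(prev: Frame, curr: Frame, box: Box,
                     pixel_threshold: int = 30) -> float:
    """Fraction of pixels inside `box` whose intensity changed by more than
    `pixel_threshold` between two consecutive frames."""
    top, left, bottom, right = box
    total = 0
    changed = 0
    for r in range(top, bottom):
        for c in range(left, right):
            total += 1
            if abs(curr[r][c] - prev[r][c]) > pixel_threshold:
                changed += 1
    return changed / total if total else 0.0


def detect_pressed_keys(prev: Frame, curr: Frame, key_boxes: Sequence[Box],
                        pixel_threshold: int = 30,
                        area_threshold: float = 0.5) -> List[int]:
    """Indices of key regions where more than `area_threshold` of the pixels
    changed, i.e. candidate key presses under this simple model."""
    return [
        i for i, box in enumerate(key_boxes)
        if changed_fraction(prev, curr, box, pixel_threshold) > area_threshold
    ]


if __name__ == "__main__":
    # Toy 4x6 frames with three keys, each two columns wide.
    prev = [[200] * 6 for _ in range(4)]
    curr = [row[:] for row in prev]
    for r in range(4):
        curr[r][2] = curr[r][3] = 90    # large change: key 1 moves
        curr[r][4] = curr[r][5] = 185   # small change: a faint shadow on key 2
    boxes = [(0, 0, 4, 2), (0, 2, 4, 4), (0, 4, 4, 6)]
    print(detect_pressed_keys(prev, curr, boxes))
```

In this toy example the faint shadow stays below the per-pixel threshold and is ignored, but a darker shadow would exceed it and produce exactly the kind of false detection the abstract describes. A fixed threshold is therefore only a starting point, not a solution to the shadow problem.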

URL

https://arxiv.org/abs/2303.12697

PDF

https://arxiv.org/pdf/2303.12697.pdf

