Electromyography-Based Gesture Recognition: Hierarchical Feature Extraction for Enhanced Spatial-Temporal Dynamics

2025-04-04 07:11:12
Jungpil Shin, Abu Saleh Musa Miah, Sota Konnai, Shu Hoshitaka, Pankoo Kim

Abstract

Hand gesture recognition from multichannel surface electromyography (sEMG) is challenging because predictions are unstable and time-varying features are enhanced inefficiently. To address the lack of effective time-varying signal features, we propose a lightweight, squeeze-and-excitation-based, multi-stream deep learning approach that extracts spatial-temporal dynamics and time-varying features for sEMG-based hand gesture recognition. Each branch of the proposed model is designed to extract hierarchical features, capturing both global and detailed spatial-temporal relationships. The first branch uses a bidirectional temporal convolutional network (Bi-TCN) to capture long-term temporal dependencies by modelling both past and future temporal contexts, providing a holistic view of gesture dynamics. The second branch combines a 1D convolutional layer, a separable CNN, and a Squeeze-and-Excitation (SE) block to extract spatial-temporal features efficiently while emphasizing the most critical feature channels. The third branch combines a Temporal Convolutional Network (TCN) with a bidirectional LSTM (BiLSTM) to capture bidirectional temporal relationships and time-varying patterns. The outputs of all branches are fused by concatenation to capture subtle variations in the data and then refined with a channel attention module that selectively focuses on the most informative features while improving computational efficiency. The proposed model was evaluated on the Ninapro DB2, DB4, and DB5 datasets, achieving accuracies of 96.41%, 92.40%, and 93.34%, respectively. These results demonstrate the system's ability to handle complex sEMG dynamics, with implications for prosthetic limb control, human-machine interfaces, and assistive technologies.
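
The fusion and channel-attention steps described in the abstract can be illustrated with a small, dependency-free sketch. This is not the authors' implementation: the function names (`fuse_branches`, `squeeze_excite`), the reduction ratio of 4, the random weights, and the use of plain Python lists in place of tensors are all assumptions made for illustration. It shows only the general squeeze-and-excitation pattern (pool each channel over time, pass the result through a small bottleneck, and rescale each channel by a gate between 0 and 1) applied to branch outputs concatenated along the channel axis.

```python
import math
import random


def fuse_branches(*branches):
    """Concatenate branch outputs along the channel axis.

    Each branch is a list of channels; each channel is a list of time steps.
    """
    n_steps = len(branches[0][0])
    fused = []
    for branch in branches:
        for channel in branch:
            if len(channel) != n_steps:
                raise ValueError("all channels must share the same number of time steps")
            fused.append(list(channel))
    return fused


def squeeze_excite(features, w1, w2):
    """Generic squeeze-and-excitation channel reweighting (illustrative only).

    w1 has shape (hidden, channels); w2 has shape (channels, hidden).
    Returns the rescaled features and the per-channel gates.
    """
    # Squeeze: global average pooling over time, one scalar per channel.
    squeezed = [sum(ch) / len(ch) for ch in features]
    # Excitation, step 1: bottleneck linear layer followed by ReLU.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    # Excitation, step 2: expand to one logit per channel, then sigmoid.
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every time step of a channel by that channel's gate.
    scaled = [[g * x for x in ch] for g, ch in zip(gates, features)]
    return scaled, gates


# Hypothetical stand-ins for the three branch outputs (4 channels x 8 steps each).
rng = random.Random(0)


def make_branch(n_channels, n_steps):
    return [[rng.random() for _ in range(n_steps)] for _ in range(n_channels)]


branch_outputs = [make_branch(4, 8) for _ in range(3)]
fused = fuse_branches(*branch_outputs)

n_channels = len(fused)
n_hidden = n_channels // 4  # assumed reduction ratio of 4
w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_channels)] for _ in range(n_hidden)]
w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_channels)]

refined, gates = squeeze_excite(fused, w1, w2)
print(len(refined), "channels after fusion and channel attention")
```

In a trained model the two weight matrices would be learned, so informative channels receive gates close to 1 and uninformative ones are suppressed; with the random weights used here the gates carry no meaning and only the data flow is shown.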

URL

https://arxiv.org/abs/2504.03221

PDF

https://arxiv.org/pdf/2504.03221.pdf

