Improving Tactile Gesture Recognition with Optical Flow

2025-08-06 11:33:21
Shaohong Zhong, Alessandro Albini, Giammarco Caroleo, Giorgio Cannata, Perla Maiolino

Abstract

Tactile gesture recognition systems play a crucial role in Human-Robot Interaction (HRI) by enabling intuitive communication between humans and robots. The literature mainly addresses this problem by applying machine learning techniques to classify sequences of tactile images encoding the pressure distribution generated when executing the gestures. However, some gestures can be hard to differentiate based on the information provided by tactile images alone. In this paper, we present a simple yet effective way to improve the accuracy of a gesture recognition classifier. Our approach focuses solely on processing the tactile images used as input by the classifier. In particular, we propose to explicitly highlight the dynamics of the contact in the tactile image by computing the dense optical flow. This additional information makes it easier to distinguish between gestures that produce similar tactile images but exhibit different contact dynamics. We validate the proposed approach in a tactile gesture recognition task, showing that a classifier trained on tactile images augmented with optical flow information achieved a 9% improvement in gesture classification accuracy compared to one trained on standard tactile images.
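The idea of augmenting tactile images with dense optical flow can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say which optical flow algorithm is used or how the flow is combined with the tactile image. Here a basic Horn-Schunck solver written in NumPy stands in for the flow computation, and the flow components are simply stacked with the pressure image as extra channels. The function names `dense_flow` and `augment_with_flow` and the parameters `alpha` and `n_iter` are hypothetical.

```python
import numpy as np


def _neighbor_mean(a):
    """Mean of the 4-connected neighbours, replicating values at the border."""
    p = np.pad(a, 1, mode="edge")
    return (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]) / 4.0


def dense_flow(prev, curr, alpha=1.0, n_iter=100):
    """Horn-Schunck dense optical flow between two 2D frames.

    Assumption: Horn-Schunck is a stand-in; the paper's flow method is not
    specified in the abstract. Returns (u, v), the per-pixel displacement
    along x (columns) and y (rows).
    """
    prev = prev.astype(np.float64)
    curr = curr.astype(np.float64)
    # Spatial gradients of the averaged frame; np.gradient returns d/drow first.
    iy, ix = np.gradient(0.5 * (prev + curr))
    it = curr - prev  # temporal derivative
    u = np.zeros_like(prev)
    v = np.zeros_like(prev)
    denom = alpha ** 2 + ix ** 2 + iy ** 2
    for _ in range(n_iter):
        u_bar = _neighbor_mean(u)
        v_bar = _neighbor_mean(v)
        # Residual of the brightness-constancy constraint at the smoothed flow.
        t = (ix * u_bar + iy * v_bar + it) / denom
        u = u_bar - ix * t
        v = v_bar - iy * t
    return u, v


def augment_with_flow(frames):
    """Turn a (T, H, W) tactile sequence into a (T-1, 3, H, W) array.

    Channel 0 is the pressure image, channels 1 and 2 are the flow (u, v)
    from the previous frame. Channel stacking is an assumed design choice.
    """
    out = []
    for prev, curr in zip(frames[:-1], frames[1:]):
        u, v = dense_flow(prev, curr)
        out.append(np.stack([curr, u, v]))
    return np.stack(out)


if __name__ == "__main__":
    # Synthetic "stroke": a pressure blob sliding one taxel per frame along +x.
    yy, xx = np.mgrid[0:16, 0:16]
    frames = np.stack(
        [np.exp(-((xx - (6 + t)) ** 2 + (yy - 8) ** 2) / 8.0) for t in range(4)]
    )
    print(augment_with_flow(frames).shape)
```

With this representation, a static press and a sliding stroke that leave similar pressure footprints differ in the flow channels: the static contact yields zero flow, while the stroke yields flow along its direction of motion.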

URL

https://arxiv.org/abs/2508.04338

PDF

https://arxiv.org/pdf/2508.04338.pdf

