Paper Reading AI Learner

Talking Tennis: Language Feedback from 3D Biomechanical Action Recognition

2025-10-04 19:55:30
Arushi Dashore, Aryan Anumala, Emily Hui, Olivia Yang

Abstract

Automated tennis stroke analysis has advanced significantly with the integration of biomechanical motion cues alongside deep learning techniques, enhancing stroke classification accuracy and player performance evaluation. Despite these advancements, existing systems often fail to connect biomechanical insights with actionable language feedback that is both accessible and meaningful to players and coaches. This research project addresses this gap by developing a novel framework that extracts key biomechanical features (such as joint angles, limb velocities, and kinetic chain patterns) from motion data using Convolutional Neural Network Long Short-Term Memory (CNN-LSTM)-based models. These features are analyzed for relationships influencing stroke effectiveness and injury risk, forming the basis for feedback generation using large language models (LLMs). Leveraging the THETIS dataset and feature extraction techniques, our approach aims to produce feedback that is technically accurate, biomechanically grounded, and actionable for end-users. The experimental setup evaluates this framework on classification performance and interpretability, bridging the gap between explainable AI and sports biomechanics.
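The joint angles mentioned as biomechanical features can be computed directly from 3D pose keypoints. A minimal sketch, assuming keypoints are given as (x, y, z) tuples (the function name and the example coordinates are illustrative, not from the paper):

```python
import math

def joint_angle(a, b, c):
    """Angle at joint b (in degrees) formed by segments b->a and b->c,
    with each 3D keypoint given as an (x, y, z) tuple."""
    v1 = tuple(ai - bi for ai, bi in zip(a, b))
    v2 = tuple(ci - bi for ci, bi in zip(c, b))
    dot = sum(x * y for x, y in zip(v1, v2))
    n1 = math.sqrt(sum(x * x for x in v1))
    n2 = math.sqrt(sum(x * x for x in v2))
    # Clamp to [-1, 1] to guard against floating-point drift before acos
    cos_t = max(-1.0, min(1.0, dot / (n1 * n2)))
    return math.degrees(math.acos(cos_t))

# Example: elbow angle from shoulder, elbow, and wrist keypoints
shoulder, elbow, wrist = (0.0, 1.4, 0.0), (0.0, 1.1, 0.2), (0.3, 1.1, 0.2)
print(round(joint_angle(shoulder, elbow, wrist), 1))  # 90.0 for these points
```

Per-frame angles like this, stacked over time, form the kind of kinematic sequence a CNN-LSTM classifier could consume.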


URL

https://arxiv.org/abs/2510.03921

PDF

https://arxiv.org/pdf/2510.03921.pdf

