Sport Task: Fine Grained Action Detection and Classification of Table Tennis Strokes from Videos for MediaEval 2022

2023-01-31 12:03:59
Pierre-Etienne Martin (MPI-EVA), Jordan Calandre (MIA), Boris Mansencal (LaBRI), Jenny Benois-Pineau (LaBRI), Renaud Péteri (MIA), Laurent Mascarilla (MIA), Julien Morlier

Abstract

Sports video analysis is a widespread research topic. Its applications are diverse, such as event detection during a match, video summarization, or fine-grained movement analysis of athletes. As part of the MediaEval 2022 benchmarking initiative, this task aims at detecting and classifying subtle movements in sports videos. We focus on recordings of table tennis matches. Running since 2019, this task provides a classification challenge from untrimmed videos recorded under natural conditions, with known temporal boundaries for each stroke. Since 2021, the task also provides a stroke detection challenge from unannotated, untrimmed videos. This year, the training, validation, and test sets are enhanced to ensure that all strokes are represented in each dataset. The dataset is now similar to the one used in [1, 2]. This research is intended to build tools for coaches and athletes who want to further evaluate their sports performance.

URL

https://arxiv.org/abs/2301.13576

PDF

https://arxiv.org/pdf/2301.13576.pdf

