Paper Reading AI Learner

Table tennis ball spin estimation with an event camera

2024-04-15 15:36:38
Thomas Gossard, Julian Krismer, Andreas Ziegler, Jonas Tebbe, Andreas Zell

Abstract

Spin plays a pivotal role in ball-based sports. Estimating spin is a key skill because of its impact on the ball's trajectory and bouncing behavior. Spin cannot be observed directly, making it inherently challenging to estimate. In table tennis, the combination of high velocity and spin renders traditional low-frame-rate cameras inadequate for observing the ball's logo quickly and accurately enough to estimate spin, owing to motion blur. Event cameras do not suffer as much from motion blur, thanks to their high temporal resolution. Moreover, the sparse nature of the event stream alleviates the communication bandwidth limitations many frame cameras face. To the best of our knowledge, we present the first method for table tennis spin estimation using an event camera. We use ordinal time surfaces to track the ball and then isolate the events generated by the logo on the ball. Optical flow is then estimated from the extracted events to infer the ball's spin. We achieved a spin magnitude mean error of $10.7 \pm 17.3$ rps and a spin axis mean error of $32.9 \pm 38.2^\circ$ in real time for a flying ball.
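The abstract's final step is inferring the ball's spin from the optical flow of the logo events. As a rough illustration of that idea only, and not the authors' implementation, below is a minimal Python sketch that recovers an angular velocity by least squares, assuming the 2-D flow has already been lifted to 3-D surface velocities on the ball; all function and variable names are hypothetical.

```python
import numpy as np

def skew(r):
    """Skew-symmetric matrix so that skew(r) @ w == np.cross(r, w)."""
    return np.array([[0.0, -r[2], r[1]],
                     [r[2], 0.0, -r[0]],
                     [-r[1], r[0], 0.0]])

def estimate_spin(points, velocities):
    """
    Least-squares spin estimate from surface velocities (illustrative only).

    points     : (N, 3) positions on the ball surface relative to the ball
                 centre (metres).
    velocities : (N, 3) linear velocities of those points (m/s), e.g. obtained
                 by lifting the 2-D optical flow of the logo events onto the
                 sphere (that lifting step is assumed, not shown).

    For a rigidly rotating ball, v_i = omega x r_i = -[r_i]_x omega, so omega
    can be recovered by stacking one 3x3 block per point and solving.
    """
    A = np.vstack([-skew(r) for r in points])      # (3N, 3)
    b = velocities.reshape(-1)                     # (3N,)
    omega, *_ = np.linalg.lstsq(A, b, rcond=None)  # rad/s
    return omega

# Toy check: a ball spinning at 50 rps about the z-axis.
true_omega = np.array([0.0, 0.0, 2 * np.pi * 50])
radius = 0.02  # 40 mm table tennis ball -> 20 mm radius
rng = np.random.default_rng(0)
pts = rng.normal(size=(50, 3))
pts = radius * pts / np.linalg.norm(pts, axis=1, keepdims=True)
vels = np.cross(true_omega, pts)
print(estimate_spin(pts, vels) / (2 * np.pi))  # ~[0, 0, 50] rps
```

In practice the measured flow is noisy and only the visible hemisphere of the ball contributes points, which is one reason a least-squares (or robust) fit over many logo events is preferable to using a single flow vector.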

Abstract (translated)

Spin plays a pivotal role in ball-based sports. Because of its impact on the ball's trajectory and bouncing behavior, estimating spin is a key skill. Spin cannot be observed directly, which makes it inherently challenging to estimate. In table tennis, the combination of high velocity and spin means that traditional low-frame-rate cameras cannot observe the ball's logo quickly and accurately enough because of motion blur. Event cameras, thanks to their high temporal resolution, suffer far less from motion blur. Moreover, the sparse nature of the event stream addresses the communication bandwidth limitations many frame cameras face. To the best of our knowledge, we present the first method for table tennis spin estimation using an event camera. We use ordinal time surfaces to track the ball and then isolate the events generated by the logo on the ball. Optical flow is then estimated from the extracted events to infer the ball's spin. We achieve a spin magnitude mean error of $10.7 \pm 17.3$ rps and a spin axis mean error of $32.9 \pm 38.2^\circ$ in real time for a flying ball.

URL

https://arxiv.org/abs/2404.09870

PDF

https://arxiv.org/pdf/2404.09870.pdf

