Signal Alignment for Humanoid Skeletons via the Globally Optimal Reparameterization Algorithm

2018-07-18 03:16:39
Thomas W. Mitchel, Sipu Ruan, Gregory S. Chirikjian

Abstract

The general ability to analyze and classify the 3D kinematics of the human form is an essential step in the development of socially adept humanoid robots. Machines can represent and characterize actions using a variety of signal types, such as RGB videos, infrared maps, and optical flow. In particular, skeleton sequences provide a natural 3D kinematic description of human motions and can be acquired in real time using RGB+D cameras. Moreover, skeleton sequences generalize to the motions of both humans and humanoid robots. The Globally Optimal Reparameterization Algorithm (GORA) is a recently proposed algorithm for signal alignment in which signals are reparameterized to a globally optimal universal standard timescale (UST). Here, we introduce a variant of GORA for humanoid action recognition with skeleton sequences, which we call GORA-S. We briefly review the algorithm's mathematical foundations and contextualize them in the problem of action recognition with skeleton sequences. Subsequently, we introduce GORA-S and discuss parameters and numerical techniques for its effective implementation. We then compare its performance with that of the DTW and FastDTW algorithms in terms of computational efficiency and accuracy in matching skeletons. Our results show that GORA-S attains a complexity that is significantly lower than that of any tested DTW method. In addition, it displays a favorable balance between speed and accuracy that remains invariant under changes in skeleton sampling frequency, lending it a degree of versatility that could make it well-suited for a variety of action recognition tasks.

URL

https://arxiv.org/abs/1807.07432

PDF

https://arxiv.org/pdf/1807.07432.pdf

