Paper Reading AI Learner

Design of conversational humanoid robot based on hardware independent gesture generation

2019-05-21 15:38:14
Katsushi Ikeuchi, David Baumert, Shunsuke Kudoh, Masaru Takizawa

Abstract

With an increasing need for elderly and disability care, there is a growing opportunity for intelligent mobile devices such as robots to provide care and support solutions. In order to assist and interact with humans naturally, a robot must possess effective conversational capabilities. Gestures accompanying spoken sentences are an important factor in human-to-human conversational communication. Humanoid robots must also use gestures if they are to be capable of the rich interactions implied and afforded by their humanlike appearance. However, present systems for gesture generation do not dynamically provide realistic physical gestures that are naturally understood by humans. A method for humanoid robots to generate gestures along with spoken sentences is proposed herein. We emphasize that our gesture-generating architecture can be applied to any type of humanoid robot through the use of Labanotation, an existing system for notating human dance movements. Labanotation's gesture symbols can be computationally transformed to be compatible across a range of robots with differing physical characteristics. This paper describes a solution as an integrated system for conversational robots whose speech and gestures can supplement each other in human-robot interaction.
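The core idea in the abstract — a hardware-independent gesture score in Labanotation, compiled per robot into joint targets — can be sketched roughly as follows. This is a minimal illustration, not the paper's actual pipeline: the score format, robot names, and the symbol-to-angle tables are all hypothetical assumptions.

```python
# Hypothetical sketch: one Labanotation score, compiled for two different robots.
# A Labanotation "cell" here is (body part, direction symbol, level symbol).
LABAN_SCORE = [
    ("right_arm", "forward", "middle"),
    ("right_arm", "side", "high"),
]

# Each robot supplies its own mapping from (direction, level) symbols to joint
# angles (degrees), so the same abstract score can drive differing hardware.
ROBOT_PROFILES = {
    "robot_a": {  # convention: shoulder pitch 0 when the arm hangs down
        ("forward", "middle"): {"shoulder_pitch": 90, "shoulder_roll": 0},
        ("side", "high"): {"shoulder_pitch": 45, "shoulder_roll": 90},
    },
    "robot_b": {  # a robot with an opposite joint-sign convention
        ("forward", "middle"): {"shoulder_pitch": -90, "shoulder_roll": 0},
        ("side", "high"): {"shoulder_pitch": -45, "shoulder_roll": -90},
    },
}

def compile_gesture(score, robot):
    """Translate a hardware-independent Labanotation score into
    robot-specific joint targets, one keyframe per score cell."""
    table = ROBOT_PROFILES[robot]
    return [(part, table[(direction, level)]) for part, direction, level in score]

print(compile_gesture(LABAN_SCORE, "robot_a"))
print(compile_gesture(LABAN_SCORE, "robot_b"))
```

The point of the intermediate notation is that only the per-robot lookup table changes between platforms; the gesture score itself stays fixed.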

URL

https://arxiv.org/abs/1905.08702

PDF

https://arxiv.org/pdf/1905.08702.pdf

