Paper Reading AI Learner

Multi-modal Feedback for Affordance-driven Interactive Reinforcement Learning

2018-07-26 07:48:33
Francisco Cruz, German I. Parisi, Stefan Wermter

Abstract

Interactive reinforcement learning (IRL) extends traditional reinforcement learning (RL) by allowing an agent to interact with parent-like trainers during a task. In this paper, we present an IRL approach using dynamic audio-visual input in terms of vocal commands and hand gestures as feedback. Our architecture integrates multi-modal information to provide robust commands from multiple sensory cues along with a confidence value indicating the trustworthiness of the feedback. The integration process also considers the case in which the two modalities convey incongruent information. Additionally, we modulate the influence of sensory-driven feedback in the IRL task using goal-oriented knowledge in terms of contextual affordances. We implement a neural network architecture to predict the effect of performed actions with different objects to avoid failed states, i.e., states from which it is not possible to accomplish the task. In our experimental setup, we explore the interplay of multi-modal feedback and task-specific affordances in a robot cleaning scenario. We compare the learning performance of the agent under four different conditions: traditional RL, multi-modal IRL, and each of these two setups with the use of contextual affordances. Our experiments show that the best performance is obtained by using audio-visual feedback with affordance-modulated IRL. The obtained results demonstrate the importance of multi-modal sensory processing integrated with goal-oriented knowledge in IRL tasks.
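The abstract describes three mechanisms: integrating two feedback modalities into one command with a confidence value, letting that confidence modulate whether the agent follows trainer advice, and using contextual affordances to rule out actions that lead to failed states. The sketch below illustrates how these pieces could fit around a standard Q-learning loop. All names (`ACTIONS`, `affordance_ok`, `integrate`, the toy failed-state rule) are illustrative assumptions, not the authors' implementation.

```python
import random

ACTIONS = ["grasp", "move", "drop", "clean"]

def integrate(speech_cmd, gesture_cmd, c_speech, c_gesture):
    """Fuse vocal and gesture feedback into one command plus a confidence.
    When the modalities are incongruent, keep the more confident one but
    shrink the confidence (a toy stand-in for the paper's integration)."""
    if speech_cmd == gesture_cmd:
        return speech_cmd, max(c_speech, c_gesture)
    cmd = speech_cmd if c_speech >= c_gesture else gesture_cmd
    return cmd, abs(c_speech - c_gesture)

def affordance_ok(state, action):
    """Contextual affordance: predict whether (state, action) would lead to a
    failed state. A hand-written rule stands in for the paper's network."""
    if state == "empty_hand" and action == "drop":
        return False  # dropping with nothing in hand cannot advance the task
    return True

def select_action(q, state, advice=None, confidence=0.0, epsilon=0.1):
    """Epsilon-greedy selection over affordance-allowed actions; trainer
    advice overrides it with probability equal to the fused confidence."""
    allowed = [a for a in ACTIONS if affordance_ok(state, a)]
    if advice in allowed and random.random() < confidence:
        return advice
    if random.random() < epsilon:
        return random.choice(allowed)
    return max(allowed, key=lambda a: q.get((state, a), 0.0))

def q_update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """Standard tabular Q-learning update."""
    best_next = max(q.get((next_state, a), 0.0) for a in ACTIONS)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]
```

Under this reading, the "four conditions" of the experiment correspond to toggling the advice path (`confidence = 0` gives plain RL) and the affordance filter (`affordance_ok` always `True` disables it) independently.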


URL

https://arxiv.org/abs/1807.09991

PDF

https://arxiv.org/pdf/1807.09991.pdf

