Paper Reading AI Learner

Inductive Learning of Robot Task Knowledge from Raw Data and Online Expert Feedback

2025-01-13 17:25:46
Daniele Meli, Paolo Fiorini

Abstract

The increasing level of autonomy of robots poses challenges of trust and social acceptance, especially in human-robot interaction scenarios. This requires an interpretable implementation of robotic cognitive capabilities, possibly based on formal methods such as logics for the definition of task specifications. However, prior knowledge is often unavailable in complex realistic scenarios. In this paper, we propose an offline algorithm based on inductive logic programming from noisy examples to extract task specifications (i.e., action preconditions, constraints, and effects) directly from raw data of a few heterogeneous (i.e., non-repetitive) robotic executions. Our algorithm leverages the output of any unsupervised action identification algorithm applied to video-kinematic recordings. Combining it with the definition of very basic, almost task-agnostic commonsense concepts about the environment, which contribute to the interpretability of our methodology, we are able to learn logical axioms encoding preconditions of actions, as well as their effects in the event calculus paradigm. Since the quality of learned specifications depends mainly on the accuracy of the action identification algorithm, we also propose an online framework for incremental refinement of task knowledge from user feedback, guaranteeing safe execution. Results on a standard manipulation task and on a benchmark for user training in the safety-critical surgical robotic scenario show the robustness, data- and time-efficiency of our methodology, with promising results towards scalability to more complex domains.
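To make the event-calculus framing of the abstract concrete, here is a minimal, hypothetical Python sketch. It is not the paper's implementation (which learns logical axioms via inductive logic programming); it only illustrates the kind of specification the paper learns: each action has preconditions (fluents that must hold) and effects expressed as fluents the action initiates or terminates, with fluents persisting by inertia between actions. The action and fluent names (`grasp(block)`, `free(gripper)`, etc.) are invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class Action:
    """An event-calculus-style action specification (illustrative only)."""
    name: str
    preconditions: set  # fluents that must hold for the action to be executable
    initiates: set      # fluents that become true after the action
    terminates: set     # fluents that become false after the action

class Trace:
    """Tracks which fluents hold along a sequence of actions (inertia)."""
    def __init__(self, initial_fluents):
        self.fluents = set(initial_fluents)
        self.history = []

    def execute(self, action):
        # Precondition check: every required fluent must currently hold.
        if not action.preconditions <= self.fluents:
            return False
        # Apply effects: terminated fluents cease, initiated fluents begin.
        self.fluents -= action.terminates
        self.fluents |= action.initiates
        self.history.append(action.name)
        return True

# Hypothetical learned axioms for a grasping step in a manipulation task:
grasp = Action("grasp(block)",
               preconditions={"free(gripper)", "reachable(block)"},
               initiates={"holding(block)"},
               terminates={"free(gripper)"})

trace = Trace({"free(gripper)", "reachable(block)"})
ok = trace.execute(grasp)      # preconditions hold, so the action succeeds
again = trace.execute(grasp)   # rejected: "free(gripper)" no longer holds
```

The second call fails the precondition check, which is exactly the kind of safety constraint the paper's online framework is meant to enforce and refine from user feedback.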

Abstract (translated)

The increasing autonomy of robots raises challenges of trust and social acceptance, especially in human-robot interaction scenarios. This calls for an interpretable implementation of robotic cognitive capabilities, possibly based on formal methods such as logics for defining task specifications. However, prior knowledge is often unavailable in complex realistic scenarios. In this paper, we propose an offline algorithm based on inductive logic programming that extracts task specifications (i.e., action preconditions, constraints, and effects) from noisy examples in the raw data of a few heterogeneous (i.e., non-repetitive) robotic executions. Our algorithm leverages the output of any unsupervised action identification algorithm applied to video-kinematic recordings. Combining it with the definition of basic, almost task-agnostic commonsense concepts about the environment, which contribute to the interpretability of our methodology, we are able to learn logical axioms encoding action preconditions, as well as their effects in the event calculus paradigm. Since the quality of the learned specifications depends mainly on the accuracy of the action identification algorithm, we also propose an online framework that incrementally refines task knowledge from user feedback, guaranteeing safe execution. Results on a standard manipulation task and on a benchmark for user training in the safety-critical surgical robotic scenario show the robustness, data-efficiency, and time-efficiency of our methodology, with promising results towards scalability to more complex domains.

URL

https://arxiv.org/abs/2501.07507

PDF

https://arxiv.org/pdf/2501.07507.pdf
