Learning Vision-based Robotic Manipulation Tasks Sequentially in Offline Reinforcement Learning Settings

2023-01-31 07:06:03
Sudhir Pratap Yadav, Rajendra Nagar, Suril V. Shah

Abstract

With the rise of deep reinforcement learning (RL) methods, many complex robotic manipulation tasks are being solved. However, harnessing the full power of deep learning requires large datasets. Online RL does not fit readily into this paradigm because agent-environment interaction is costly and time-consuming. Many offline-RL algorithms have therefore recently been proposed to learn robotic tasks. However, these methods mainly focus on single-task or multi-task learning, which requires retraining every time a new task needs to be learned. Continuously learning tasks without forgetting previous knowledge, combined with the power of offline deep RL, would allow us to scale the number of tasks by adding them one after another. In this paper, we investigate the effectiveness of regularisation-based methods like synaptic intelligence for sequentially learning image-based robotic manipulation tasks in an offline-RL setup. We evaluate the performance of this combined framework against common challenges of sequential learning: catastrophic forgetting and forward knowledge transfer. We performed experiments with different task combinations to analyze the effect of task ordering. We also investigated the effect of the number of object configurations and the density of robot trajectories. We found that learning tasks sequentially helps in the propagation of knowledge from previous tasks, thereby reducing the time required to learn a new task. Regularisation-based approaches for continuous learning, like the synaptic intelligence method, help mitigate catastrophic forgetting but show only limited transfer of knowledge from previous tasks.

URL

https://arxiv.org/abs/2301.13450

PDF

https://arxiv.org/pdf/2301.13450.pdf

