Paper Reading AI Learner

TRANSIC: Sim-to-Real Policy Transfer by Learning from Online Correction

2024-05-16 17:59:07
Yunfan Jiang, Chen Wang, Ruohan Zhang, Jiajun Wu, Li Fei-Fei

Abstract

Learning in simulation and transferring the learned policy to the real world has the potential to enable generalist robots. The key challenge of this approach is to address simulation-to-reality (sim-to-real) gaps. Previous methods often require domain-specific knowledge a priori. We argue that a straightforward way to obtain such knowledge is by asking humans to observe and assist robot policy execution in the real world. The robots can then learn from humans to close various sim-to-real gaps. We propose TRANSIC, a data-driven approach to enable successful sim-to-real transfer based on a human-in-the-loop framework. TRANSIC allows humans to augment simulation policies to overcome various unmodeled sim-to-real gaps holistically through intervention and online correction. Residual policies can be learned from human corrections and integrated with simulation policies for autonomous execution. We show that our approach can achieve successful sim-to-real transfer in complex and contact-rich manipulation tasks such as furniture assembly. Through synergistic integration of policies learned in simulation and from humans, TRANSIC is effective as a holistic approach to addressing various, often coexisting sim-to-real gaps. It displays attractive properties such as scaling with human effort. Videos and code are available at this https URL
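The abstract's idea of combining a simulation policy with a residual policy learned from human corrections can be illustrated with a minimal sketch. This is not the authors' implementation: the names `combine`, `fit_constant_residual`, and `gate` are hypothetical, and the "learned" residual here is just the mean of the human-minus-base action differences, where a real system would presumably fit a model conditioned on observations.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Action = List[float]
Policy = Callable[[Sequence[float]], Action]


def fit_constant_residual(corrections: List[Tuple[Action, Action]]) -> Policy:
    """Toy residual 'learning' (assumption): average the difference between
    the human-corrected action and the base policy's action."""
    n = len(corrections)
    dim = len(corrections[0][0])
    mean_delta = [
        sum(human[i] - base[i] for base, human in corrections) / n
        for i in range(dim)
    ]
    # The returned residual policy ignores the observation in this sketch.
    return lambda obs: list(mean_delta)


def combine(
    base_policy: Policy,
    residual_policy: Policy,
    gate: Optional[Callable[[Sequence[float]], bool]] = None,
) -> Policy:
    """Return a policy that adds the residual to the base action.
    If a (hypothetical) gate is given, the residual is applied only
    when the gate returns True for the current observation."""

    def policy(obs: Sequence[float]) -> Action:
        base = base_policy(obs)
        if gate is not None and not gate(obs):
            return list(base)
        delta = residual_policy(obs)
        return [b + d for b, d in zip(base, delta)]

    return policy


# Usage: a base policy trained in simulation (stubbed here as a constant).
sim_policy: Policy = lambda obs: [1.0, 2.0]
# Pairs of (base action, human-corrected action) collected during real-world runs.
logged = [([1.0, 2.0], [1.5, 2.0]), ([1.0, 2.0], [1.5, 1.0])]
residual = fit_constant_residual(logged)
deployed = combine(sim_policy, residual)
```

The additive form keeps the simulation policy untouched, so the correction data only has to cover the cases where the base policy's behavior needed adjusting.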

URL

https://arxiv.org/abs/2405.10315

PDF

https://arxiv.org/pdf/2405.10315.pdf

