Paper Reading AI Learner

Behavior Imitation for Manipulator Control and Grasping with Deep Reinforcement Learning

2024-05-02 13:43:22
Liu Qiyuan

Abstract

Existing Motion Imitation models typically require expert data captured with MoCap devices, but the vast amount of training data they need is difficult to acquire and demands substantial investments of money, manpower, and time. This project combines 3D human pose estimation with reinforcement learning, proposing a novel model that reduces Motion Imitation to a joint-angle prediction problem in reinforcement learning. This significantly lowers the reliance on large training datasets: the agent can learn an imitation policy from just a few seconds of video and exhibits strong generalization, quickly applying the learned policy to imitate human arm motions in unfamiliar videos. The model first extracts the skeletal motion of the human arms from a given video using 3D human pose estimation. The extracted arm motions are then morphologically retargeted onto a robotic manipulator, and the retargeted motions are used to generate reference motions. Finally, these reference motions are used to formulate a reinforcement learning problem in which the agent learns a policy for imitating human arm motions. The model excels at imitation tasks and transfers robustly, accurately imitating human arm motions from other, unfamiliar videos. Overall, this project provides a lightweight, convenient, efficient, and accurate Motion Imitation model that achieves notably strong performance while simplifying the otherwise complex Motion Imitation process.
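The pipeline described above (pose extraction, morphological retargeting, reference motions, then an RL imitation objective) can be sketched in miniature as follows. This is a minimal illustration, not the paper's actual formulation: the joint-limit clipping stand-in for retargeting, the exponential joint-angle tracking reward, and all function names and parameters are hypothetical assumptions.

```python
import numpy as np

def retarget_arm_to_manipulator(arm_joint_angles, joint_limits):
    """Hypothetical retargeting step: map human-arm joint angles onto the
    manipulator by clipping to its joint limits. Real morphological
    retargeting also accounts for link lengths and joint conventions."""
    low, high = joint_limits
    return np.clip(arm_joint_angles, low, high)

def imitation_reward(agent_angles, reference_angles, scale=2.0):
    """Illustrative imitation reward: 1 when the agent's joint angles match
    the reference exactly, decaying toward 0 as the squared error grows."""
    err = np.sum((agent_angles - reference_angles) ** 2)
    return float(np.exp(-scale * err))

# Toy rollout: a 3-DoF arm tracking a short reference trajectory
# (generated here from a straight line in joint space, not from video).
reference = np.linspace([0.0, 0.0, 0.0], [0.5, -0.3, 0.2], num=5)
limits = (np.full(3, -np.pi), np.full(3, np.pi))
agent_trajectory = reference + 0.05  # agent lags slightly behind the reference
rewards = [imitation_reward(retarget_arm_to_manipulator(q, limits), q_ref)
           for q, q_ref in zip(agent_trajectory, reference)]
print(round(min(rewards), 3))  # perfect tracking would yield 1.0
```

In the actual model, `reference` would come from 3D pose estimation on video frames, and a policy trained against a reward of this general shape would predict the joint-angle targets at each step.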

URL

https://arxiv.org/abs/2405.01284

PDF

https://arxiv.org/pdf/2405.01284.pdf
