Agile and versatile bipedal robot tracking control through reinforcement learning

2024-04-12 05:25:03
Jiayi Li, Linqi Ye, Yi Cheng, Houde Liu, Bin Liang

Abstract

The remarkable athletic intelligence displayed by humans in complex dynamic movements such as dancing and gymnastics suggests that the balance mechanism in biological beings is decoupled from specific movement patterns. This decoupling allows for the execution of both learned and unlearned movements under certain constraints while maintaining balance through minor whole-body coordination. To replicate this balance ability and body agility, this paper proposes a versatile controller for bipedal robots. The controller, which combines a model-based inverse kinematics (IK) solver with reinforcement learning, tracks ankle and body trajectories across a wide range of gaits using a single small-scale neural network. We consider a single step as the smallest control unit and design a universally applicable control input form suitable for any single-step variation. Highly flexible gait control can be achieved by combining these minimal control units with a high-level policy through our extensible control interface. To enhance the trajectory-tracking capability of our controller, we utilize a three-stage training curriculum. After training, the robot can move freely between target footholds at varying distances and heights. The robot can also maintain static balance without repeated stepping to adjust posture. Finally, we evaluate the tracking accuracy of our controller on various bipedal tasks and verify the effectiveness of our control framework in simulation.
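
The idea of treating a single step as the smallest control unit can be illustrated with a short sketch. The abstract does not specify the actual control input form, so everything below is an assumption made for illustration: the `StepCommand` fields, the smoothstep-plus-parabola swing profile in `ankle_reference`, and the `plan` helper standing in for a high-level policy are all hypothetical names, not the authors' interface. The sketch also covers the footholds of one swing foot only, and it leaves out the IK solver and the learned policy that would track the reference.

```python
from dataclasses import dataclass
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class StepCommand:
    """Hypothetical single-step control unit: one swing from start to target."""
    start: Vec3       # swing-ankle position at lift-off (m)
    target: Vec3      # target foothold (m)
    clearance: float  # extra apex height added mid-swing (m)
    duration: float   # step duration (s)


def ankle_reference(cmd: StepCommand, t: float) -> Vec3:
    """Reference ankle position at time t within the step (assumed profile)."""
    s = min(max(t / cmd.duration, 0.0), 1.0)  # normalized phase in [0, 1]
    blend = 3 * s**2 - 2 * s**3               # smoothstep: zero velocity at both ends
    x, y, z = (a + (b - a) * blend for a, b in zip(cmd.start, cmd.target))
    z += cmd.clearance * 4 * s * (1 - s)      # parabolic bump, zero at s=0 and s=1
    return (x, y, z)


def plan(footholds: List[Vec3], clearance: float = 0.05,
         duration: float = 0.4) -> List[StepCommand]:
    """Stand-in for a high-level policy: chain consecutive footholds into steps."""
    return [StepCommand(a, b, clearance, duration)
            for a, b in zip(footholds, footholds[1:])]
```

Under these assumptions, `plan([(0, 0, 0), (0.3, 0, 0.0), (0.6, 0, 0.1)])` yields two step commands with different lengths and heights, and sampling `ankle_reference` over each step gives the ankle trajectory a tracking controller would be asked to follow. Changing a step's distance, height, or timing only changes the command values, not the command structure.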

URL

https://arxiv.org/abs/2404.08246

PDF

https://arxiv.org/pdf/2404.08246.pdf

