Learning Robot Soccer from Egocentric Vision with Deep Reinforcement Learning

2024-05-03 18:41:13
Dhruva Tirumala, Markus Wulfmeier, Ben Moran, Sandy Huang, Jan Humplik, Guy Lever, Tuomas Haarnoja, Leonard Hasenclever, Arunkumar Byravan, Nathan Batchelor, Neil Sreendra, Kushal Patel, Marlon Gwira, Francesco Nori, Martin Riedmiller, Nicolas Heess

Abstract

We apply multi-agent deep reinforcement learning (RL) to train end-to-end robot soccer policies with fully onboard computation and sensing via egocentric RGB vision. This setting reflects many challenges of real-world robotics, including active perception, agile full-body control, and long-horizon planning in a dynamic, partially observable, multi-agent domain. We rely on large-scale, simulation-based data generation to obtain complex behaviors from egocentric vision that can be successfully transferred to physical robots using low-cost sensors. To achieve adequate visual realism, our simulation combines rigid-body physics with learned, realistic rendering via multiple Neural Radiance Fields (NeRFs). We combine teacher-based multi-agent RL and cross-experiment data reuse to enable the discovery of sophisticated soccer strategies. We analyze active-perception behaviors, including object tracking and ball seeking, that emerge when simply optimizing perception-agnostic soccer play. The agents display levels of performance and agility equivalent to those of policies with access to privileged, ground-truth state. To our knowledge, this paper constitutes a first demonstration of end-to-end training for multi-agent robot soccer, mapping raw pixel observations to joint-level actions, that can be deployed in the real world. Videos of the game-play and analyses can be seen on our website at this https URL.

URL

https://arxiv.org/abs/2405.02425

PDF

https://arxiv.org/pdf/2405.02425.pdf

