Paper Reading AI Learner

Planning for Complex Non-prehensile Manipulation Among Movable Objects by Interleaving Multi-Agent Pathfinding and Physics-Based Simulation

2023-03-23 15:29:27
Dhruv Mauria Saxena, Maxim Likhachev

Abstract

Real-world manipulation problems in heavy clutter require robots to reason about potential contacts with objects in the environment. We focus on pick-and-place style tasks to retrieve a target object from a shelf where some "movable" objects must be rearranged in order to solve the task. In particular, our motivation is to allow the robot to reason over and consider non-prehensile rearrangement actions that lead to complex robot-object and object-object interactions, where multiple objects might be moved by the robot simultaneously, and objects might tilt, lean on each other, or topple. To support this, we query a physics-based simulator to forward simulate these interaction dynamics, which makes action evaluation during planning computationally very expensive. To make the planner tractable, we establish a connection between the domain of Manipulation Among Movable Objects and Multi-Agent Pathfinding that lets us decompose the problem into two phases that our M4M algorithm iterates over. First, we solve a multi-agent planning problem that reasons about the configurations of movable objects but does not forward simulate a physics model. Next, an arm motion planning problem is solved that uses a physics-based simulator but does not search over possible configurations of movable objects. We run simulated and real-world experiments with the PR2 robot and compare against relevant baseline algorithms. Our results highlight that M4M generates complex 3D interactions, and solves at least twice as many problems as the baselines with competitive performance.
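The abstract describes an iteration between two phases: a physics-free multi-agent rearrangement query, and a physics-validated arm motion query. A minimal sketch of that loop structure, assuming hypothetical stand-in functions (`solve_mapf`, `plan_arm_motion`, and a toy simulator) that are illustrative only and not the paper's actual API:

```python
# Hypothetical sketch of the two-phase M4M loop described in the abstract.
# All names, signatures, and data structures here are assumptions for
# illustration; the paper's real planner operates on robot/object states.

def solve_mapf(objects):
    """Phase 1 (assumed): propose new configurations for movable objects
    WITHOUT simulating physics. Here: trivially shift every object."""
    return {name: (x + 1.0, y) for name, (x, y) in objects.items()}

def plan_arm_motion(objects, rearrangement, simulate):
    """Phase 2 (assumed): query a physics-based simulator to check whether
    some arm motion realizes the proposed rearrangement. Returns the
    simulated outcome, or None if the simulator rejects the plan."""
    return simulate(objects, rearrangement)

def m4m(objects, simulate, max_iters=10):
    """Iterate the two phases until the simulator validates a plan."""
    for _ in range(max_iters):
        rearrangement = solve_mapf(objects)              # no physics here
        outcome = plan_arm_motion(objects, rearrangement, simulate)
        if outcome is not None:                          # simulator agrees
            return outcome
        # In the real algorithm, information from the failed simulation
        # would constrain the next multi-agent query; omitted here.
    return None

# Toy "simulator" that accepts any proposed rearrangement.
def toy_simulate(objects, rearrangement):
    return rearrangement

plan = m4m({"can": (0.0, 0.0), "box": (0.0, 1.0)}, toy_simulate)
```

The key design point the abstract emphasizes is that the expensive physics simulation is confined to Phase 2, while the combinatorial search over object configurations happens in cheap, physics-free Phase 1.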

URL

https://arxiv.org/abs/2303.13352

PDF

https://arxiv.org/pdf/2303.13352.pdf

