Planning for Manipulation among Movable Objects: Deciding Which Objects Go Where, in What Order, and How

2023-03-23 15:55:19
Dhruv Saxena, Maxim Likhachev

Abstract

We are interested in pick-and-place style robot manipulation tasks in cluttered and confined 3D workspaces among movable objects that may be rearranged by the robot and may slide, tilt, lean or topple. A recently proposed algorithm, M4M, determines which objects need to be moved and where by solving a Multi-Agent Pathfinding (MAPF) abstraction of this problem. It then utilises a nonprehensile push planner to compute actions for how the robot might realise these rearrangements and a rigid body physics simulator to check whether the actions satisfy physics constraints encoded in the problem. However, M4M greedily commits to valid pushes found during planning, and does not reason about orderings over pushes if multiple objects need to be rearranged. Furthermore, M4M does not reason about other possible MAPF solutions that lead to different rearrangements and pushes. In this paper, we extend M4M and present Enhanced-M4M (E-M4M), a systematic graph search-based solver that searches over orderings of pushes for movable objects that need to be rearranged and over different possible rearrangements of the scene. We introduce several algorithmic optimisations to circumvent the increased computational complexity, discuss the space of problems solvable by E-M4M, and show experimentally, both on the real robot and in simulation, that it significantly outperforms the original M4M algorithm, as well as other state-of-the-art alternatives, when dealing with complex scenes.
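
The difference between greedily committing to pushes and searching over push orderings can be illustrated with a toy sketch. This is not the E-M4M implementation: it contains no MAPF solver, push planner or physics simulator. The names `find_push_ordering`, `toy_push_is_valid` and `BLOCKED_BY` are hypothetical, and the validity check is an assumed stand-in for whatever push planning and simulation would decide in a real system. The sketch only shows a breadth-first graph search whose states are the sets of objects already pushed.

```python
from collections import deque


def find_push_ordering(objects, push_is_valid):
    """Breadth-first search over orderings of pushes (toy illustration).

    objects: names of the objects that must all be pushed (hypothetical).
    push_is_valid(done, obj): assumed stand-in for a push planner plus a
        physics check; returns True if obj can be pushed once the objects
        in the frozenset `done` have already been pushed.
    Returns one valid ordering as a list, or None if none exists.
    """
    objects = frozenset(objects)
    start = frozenset()
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        done, order = queue.popleft()
        if done == objects:
            return order
        # Try every object that has not been pushed yet, in a fixed order.
        for obj in sorted(objects - done):
            if not push_is_valid(done, obj):
                continue
            nxt = done | {obj}
            if nxt not in visited:
                visited.add(nxt)
                queue.append((nxt, order + [obj]))
    return None


# Toy constraint (an assumption for illustration): an object can only be
# pushed after every object blocking it has been pushed out of the way.
BLOCKED_BY = {"A": {"B"}, "B": {"C"}, "C": set()}


def toy_push_is_valid(done, obj):
    return BLOCKED_BY[obj] <= done


ordering = find_push_ordering(BLOCKED_BY, toy_push_is_valid)
print(ordering)
```

In this toy scene, a planner that tried to push A first would fail, while the search finds an ordering in which the blocking objects are moved first. Because states are sets of pushed objects, the search also reports failure (returns `None`) when the constraints are cyclic and no ordering exists.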

URL

https://arxiv.org/abs/2303.13385

PDF

https://arxiv.org/pdf/2303.13385.pdf

