Physics-Aware Iterative Learning and Prediction of Saliency Map for Bimanual Grasp Planning

2024-04-13 09:39:20
Shiyao Wang, Xiuping Liu, Charlie C. L. Wang, Jian Liu

Abstract

Learning the skill of human bimanual grasping can extend the capabilities of robotic systems when grasping large or heavy objects. However, it requires a much larger search space for grasp points than single-hand grasping, as well as numerous bimanual grasping annotations for network learning, making both data-driven and analytical grasping methods inefficient and insufficient. We propose a framework for bimanual grasp saliency learning that aims to predict the contact points for bimanual grasping based on existing human single-handed grasping data. We learn saliency corresponding vectors from a minimal set of bimanual contact annotations; these vectors establish correspondences between the grasp positions of the two hands, eliminating the need to train on a large-scale bimanual grasp dataset. The existing single-handed grasp saliency value serves as the initial value for bimanual grasp saliency, and we learn a saliency adjusted score that is added to the initial value to obtain the final bimanual grasp saliency value, which allows preferred bimanual grasp positions to be predicted from single-handed grasp saliency. We also introduce a physics-balance loss function and a physics-aware refinement module that enforce physical grasp balance and enhance generalization to unknown objects. Comprehensive experiments in simulation and comparisons on dexterous grippers demonstrate that our method can achieve balanced bimanual grasping effectively.
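As a rough illustration of two ideas in the abstract, the sketch below adds a per-point adjustment score to an initial single-handed saliency value, and measures how far a two-contact grasp is from static balance. It is not the paper's implementation: the function names, the clipping to [0, 1], and the specific residual (net force plus net torque about the center of mass) are assumptions made for this example, and the actual physics-balance loss is not specified in the abstract.

```python
import numpy as np


def bimanual_saliency(single_hand_saliency, adjustment):
    """Final saliency = initial single-handed saliency + adjustment score.

    Clipping the result to [0, 1] is an assumption of this sketch.
    """
    s = np.asarray(single_hand_saliency, dtype=float)
    a = np.asarray(adjustment, dtype=float)
    return np.clip(s + a, 0.0, 1.0)


def balance_residual(contacts, forces, com, weight):
    """Hypothetical balance measure for a grasp (0 means static equilibrium).

    contacts: (N, 3) contact positions
    forces:   (N, 3) forces applied by the hands at those contacts
    com:      (3,) object center of mass
    weight:   scalar object weight, acting along -z at the center of mass
    """
    contacts = np.asarray(contacts, dtype=float)
    forces = np.asarray(forces, dtype=float)
    com = np.asarray(com, dtype=float)
    gravity = np.array([0.0, 0.0, -float(weight)])
    # Net force on the object: hand forces plus gravity.
    net_force = forces.sum(axis=0) + gravity
    # Net torque about the center of mass; gravity contributes none there.
    net_torque = np.cross(contacts - com, forces).sum(axis=0)
    return float(np.linalg.norm(net_force) + np.linalg.norm(net_torque))


if __name__ == "__main__":
    print(bimanual_saliency([0.2, 0.9], [0.3, 0.4]))
    # Two hands placed symmetrically about the center of mass.
    symmetric = balance_residual(
        contacts=[[-1, 0, 0], [1, 0, 0]],
        forces=[[0, 0, 5], [0, 0, 5]],
        com=[0, 0, 0],
        weight=10,
    )
    # Same forces, but both hands on one side of the center of mass.
    offset = balance_residual(
        contacts=[[0, 0, 0], [1, 0, 0]],
        forces=[[0, 0, 5], [0, 0, 5]],
        com=[0, 0, 0],
        weight=10,
    )
    print(symmetric, offset)
```

In this toy setup the symmetric grasp has zero residual, while shifting one hand toward the center of mass leaves an unbalanced torque. A training loss built on a residual of this kind would penalize contact pairs that cannot hold the object in equilibrium.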

URL

https://arxiv.org/abs/2404.08944

PDF

https://arxiv.org/pdf/2404.08944.pdf

