Integrated Object Deformation and Contact Patch Estimation from Visuo-Tactile Feedback

2023-05-23 18:53:24
Mark Van der Merwe, Youngsun Wi, Dmitry Berenson, Nima Fazeli

Abstract

Reasoning over the interplay between object deformation and force transmission through contact is central to the manipulation of compliant objects. In this paper, we propose the Neural Deforming Contact Field (NDCF), a representation that jointly models object deformations and contact patches from visuo-tactile feedback using implicit representations. Representing the object geometry and contact with the environment implicitly allows a single model to predict contact patches of varying complexity. Additionally, learning geometry and contact simultaneously allows us to enforce physical priors, such as ensuring contacts lie on the surface of the object. We propose a neural network architecture to learn an NDCF, and train it using simulated data. We then demonstrate that the learned NDCF transfers directly to the real world without the need for fine-tuning. We benchmark our proposed approach against a baseline representing geometry and contact patches with point clouds. We find that NDCF performs better on simulated data and in transfer to the real world.
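To make the idea of a joint implicit geometry-and-contact representation concrete, here is a minimal sketch. It is not the authors' architecture: the learned networks are replaced by hand-written analytic stand-ins (a unit-sphere signed distance function and a sigmoid contact field around a hypothetical plane), deformation is not modeled, and every function name and parameter is an assumption made for illustration. It shows two things mentioned in the abstract: querying geometry and contact at arbitrary points, and a penalty that discourages contact predicted away from the object surface.

```python
import numpy as np

def sdf_sphere(points, radius=1.0):
    """Stand-in geometry field: signed distance to a sphere centred at the origin."""
    return np.linalg.norm(points, axis=-1) - radius

def contact_prob(points, plane_height=-0.9, sharpness=20.0):
    """Stand-in contact field: probability rises as a point drops below z = plane_height."""
    return 1.0 / (1.0 + np.exp(sharpness * (points[..., 2] - plane_height)))

def surface_contact_penalty(sdf_values, contact_values):
    """Illustrative prior: penalise contact predicted far from the zero level set."""
    return float(np.mean(contact_values * np.abs(sdf_values)))

def extract_contact_patch(points, sdf_values, contact_values,
                          surface_tol=0.02, contact_thresh=0.5):
    """Keep query points that are both near the surface and likely in contact."""
    mask = (np.abs(sdf_values) < surface_tol) & (contact_values > contact_thresh)
    return points[mask]

# Query both fields at random points around the object.
rng = np.random.default_rng(0)
queries = rng.uniform(-1.2, 1.2, size=(20000, 3))
sdf_values = sdf_sphere(queries)
contact_values = contact_prob(queries)

patch = extract_contact_patch(queries, sdf_values, contact_values)
penalty = surface_contact_penalty(sdf_values, contact_values)
```

Because both fields are functions of a query point, the extracted patch can take any shape the fields imply, which is the flexibility the abstract attributes to implicit representations. In a learned model the penalty would be one term in a training loss rather than a quantity computed once.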

URL

https://arxiv.org/abs/2305.14470

PDF

https://arxiv.org/pdf/2305.14470.pdf

