Paper Reading AI Learner

EyeSight Hand: Design of a Fully-Actuated Dexterous Robot Hand with Integrated Vision-Based Tactile Sensors and Compliant Actuation

2024-08-12 16:24:21
Branden Romero, Hao-Shu Fang, Pulkit Agrawal, Edward Adelson

Abstract

In this work, we introduce the EyeSight Hand, a novel 7 degrees of freedom (DoF) humanoid hand featuring integrated vision-based tactile sensors tailored for enhanced whole-hand manipulation. Additionally, we introduce an actuation scheme centered around quasi-direct drive actuation to achieve human-like strength and speed while ensuring robustness for large-scale data collection. We evaluate the EyeSight Hand on three challenging tasks: bottle opening, plasticine cutting, and plate pick and place, which require a blend of complex manipulation, tool use, and precise force application. Imitation learning models trained on these tasks, with a novel vision dropout strategy, showcase the benefits of tactile feedback in enhancing task success rates. Our results reveal that the integration of tactile sensing dramatically improves task performance, underscoring the critical role of tactile information in dexterous manipulation.

Abstract (translated)

在这项工作中,我们引入了EyeSight Hand,这是一种新型的具有7个自由度(DoF)的人形手,其集成式视觉传感器专为增强整体手操作而设计。此外,我们还引入了一种以准直驱动作为核心的施力方案,以实现类似于人类的力量和速度,同时确保在大规模数据收集过程中的稳健性。我们对EyeSight Hand在三个具有挑战性的任务上的表现进行了评估:开瓶子、塑料刀切割和托盘装配,这些任务需要复杂的操作、工具使用和精确的力量施加。在这些任务上,通过训练模仿学习模型并采用新的视觉 dropout策略,展示了触觉反馈在提高任务成功率方面的优势。我们的结果表明,集成式触觉传感显著提高了任务性能,突显了触觉信息在灵巧操作中的关键作用。

URL

https://arxiv.org/abs/2408.06265

PDF

https://arxiv.org/pdf/2408.06265.pdf


Tags
3D Action Action_Localization Action_Recognition Activity Adversarial Agent Attention Autonomous Bert Boundary_Detection Caption Chat Classification CNN Compressive_Sensing Contour Contrastive_Learning Deep_Learning Denoising Detection Dialog Diffusion Drone Dynamic_Memory_Network Edge_Detection Embedding Embodied Emotion Enhancement Face Face_Detection Face_Recognition Facial_Landmark Few-Shot Gait_Recognition GAN Gaze_Estimation Gesture Gradient_Descent Handwriting Human_Parsing Image_Caption Image_Classification Image_Compression Image_Enhancement Image_Generation Image_Matting Image_Retrieval Inference Inpainting Intelligent_Chip Knowledge Knowledge_Graph Language_Model LLM Matching Medical Memory_Networks Multi_Modal Multi_Task NAS NMT Object_Detection Object_Tracking OCR Ontology Optical_Character Optical_Flow Optimization Person_Re-identification Point_Cloud Portrait_Generation Pose Pose_Estimation Prediction QA Quantitative Quantitative_Finance Quantization Re-identification Recognition Recommendation Reconstruction Regularization Reinforcement_Learning Relation Relation_Extraction Represenation Represenation_Learning Restoration Review RNN Robot Salient Scene_Classification Scene_Generation Scene_Parsing Scene_Text Segmentation Self-Supervised Semantic_Instance_Segmentation Semantic_Segmentation Semi_Global Semi_Supervised Sence_graph Sentiment Sentiment_Classification Sketch SLAM Sparse Speech Speech_Recognition Style_Transfer Summarization Super_Resolution Surveillance Survey Text_Classification Text_Generation Time_Series Tracking Transfer_Learning Transformer Unsupervised Video_Caption Video_Classification Video_Indexing Video_Prediction Video_Retrieval Visual_Relation VQA Weakly_Supervised Zero-Shot