DexMV: Imitation Learning for Dexterous Manipulation from Human Videos

2021-08-12 17:51:18

Yuzhe Qin, Yueh-Hua Wu, Shaowei Liu, Hanwen Jiang, Ruihan Yang, Yang Fu, Xiaolong Wang

arXiv_CV


Abstract

While we have made significant progress on understanding hand-object interactions in computer vision, it is still very challenging for robots to perform complex dexterous manipulation. In this paper, we propose a new platform and pipeline, DexMV (Dex Manipulation from Videos), for imitation learning to bridge the gap between computer vision and robot learning. We design a platform with: (i) a simulation system for complex dexterous manipulation tasks with a multi-finger robot hand and (ii) a computer vision system to record large-scale demonstrations of a human hand conducting the same tasks. In our new pipeline, we extract 3D hand and object poses from the videos, and convert them to robot demonstrations via motion retargeting. We then apply and compare multiple imitation learning algorithms with the demonstrations. We show that the demonstrations can indeed improve robot learning by a large margin and solve the complex tasks which reinforcement learning alone cannot solve. Project page with video: linked from the arXiv abstract page.
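
To make the step "convert them to robot demonstrations via motion retargeting" concrete, here is a toy Python sketch. It is not the paper's method: the abstract does not specify the retargeting algorithm, so this sketch assumes the simplest possible stand-in, clamping estimated human joint angles to the robot's joint limits and using the per-frame joint deltas as actions. The names `HandFrame`, `retarget_frame`, and `frames_to_demonstration` are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class HandFrame:
    """Estimated human hand joint angles (radians) for one video frame.

    Hypothetical container; a real pipeline would also carry object poses.
    """
    joint_angles: List[float]


def retarget_frame(frame: HandFrame,
                   joint_limits: Sequence[Tuple[float, float]]) -> List[float]:
    """Toy retargeting (an assumption, not the paper's algorithm):
    clamp each human joint angle into the matching robot joint range."""
    if len(frame.joint_angles) != len(joint_limits):
        raise ValueError("frame and joint_limits must have the same length")
    return [min(max(angle, low), high)
            for angle, (low, high) in zip(frame.joint_angles, joint_limits)]


def frames_to_demonstration(frames: Sequence[HandFrame],
                            joint_limits: Sequence[Tuple[float, float]]
                            ) -> List[Tuple[List[float], List[float]]]:
    """Turn a frame sequence into (state, action) pairs.

    State: retargeted robot joint angles at frame t.
    Action: joint-angle change needed to reach frame t + 1.
    """
    robot_states = [retarget_frame(f, joint_limits) for f in frames]
    demonstration = []
    for current, following in zip(robot_states, robot_states[1:]):
        action = [n - c for c, n in zip(current, following)]
        demonstration.append((current, action))
    return demonstration


# Example: three frames of a two-joint hand, with made-up joint limits.
frames = [HandFrame([0.0, 0.5]), HandFrame([0.5, 2.0]), HandFrame([1.0, 2.5])]
limits = [(0.0, 1.0), (0.0, 1.5)]
demo = frames_to_demonstration(frames, limits)
```

The resulting (state, action) pairs are the kind of data an imitation learning algorithm would consume. A realistic retargeting step would instead account for the differing kinematics of the human and robot hands.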

URL

https://arxiv.org/abs/2108.05877

PDF

https://arxiv.org/pdf/2108.05877.pdf