GIFT: Generalizable Interaction-aware Functional Tool Affordances without Labels

2021-06-28 20:43:35

Dylan Turpin, Liquan Wang, Stavros Tsogkas, Sven Dickinson, Animesh Garg

arXiv_RO

Abstract

Tool use requires reasoning about the fit between an object's affordances and the demands of a task. Visual affordance learning can benefit from goal-directed interaction experience, but current techniques rely on human labels or expert demonstrations to generate this data. In this paper, we describe a method that grounds affordances in physical interactions instead, thus removing the need for human labels or expert policies. We use an efficient sampling-based method to generate successful trajectories that provide contact data, which are then used to reveal affordance representations. Our framework, GIFT, operates in two phases: first, we discover visual affordances from goal-directed interaction with a set of procedurally generated tools; second, we train a model to predict new instances of the discovered affordances on novel tools in a self-supervised fashion. In our experiments, we show that GIFT can leverage a sparse keypoint representation to predict grasp and interaction points to accommodate multiple tasks, such as hooking, reaching, and hammering. GIFT outperforms baselines on all tasks and matches a human oracle on two of three tasks using novel tools.

URL

https://arxiv.org/abs/2106.14973

PDF

https://arxiv.org/pdf/2106.14973.pdf