Abstract
Task-Oriented Grasping (TOG) presents a significant challenge, requiring a nuanced understanding of task semantics, object affordances, and the functional constraints dictating how an object should be grasped for a specific task. To address these challenges, we introduce GRIM (Grasp Re-alignment via Iterative Matching), a novel training-free framework for task-oriented grasping. GRIM first performs a coarse alignment, scoring similarity with a combination of geometric cues and DINO features reduced by principal component analysis (PCA). It then transfers the full grasp pose associated with the retrieved memory instance to the aligned scene object and refines that pose against a set of task-agnostic, geometrically stable grasps generated for the scene object, prioritizing task compatibility. In contrast to existing learning-based methods, GRIM generalizes strongly, achieving robust performance with only a small number of conditioning examples.
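The retrieve, transfer, and refine steps described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. Every function name here (`pca_reduce`, `retrieve`, `transfer_grasp`, `pose_distance`, `refine`) is hypothetical, and so are the cosine-similarity score, the nearest-pose refinement rule, and the `rot_weight` trade-off, which are assumptions made for illustration. Feature vectors are passed in as plain arrays in place of real DINO features, and the geometric cues used in the scoring are omitted.

```python
import numpy as np

def pca_reduce(features, k):
    """Project row-vector features onto their top-k principal components."""
    centered = features - features.mean(axis=0, keepdims=True)
    # Rows of vt are the principal directions of the centered data.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:k].T

def cosine_similarity(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def retrieve(memory_feats, scene_feat, k=2):
    """Return the index of the memory instance most similar to the scene object.

    Assumption: similarity is cosine similarity in a shared PCA space.
    """
    stacked = np.vstack([memory_feats, scene_feat[None, :]])
    reduced = pca_reduce(stacked, k)
    scores = [cosine_similarity(row, reduced[-1]) for row in reduced[:-1]]
    return int(np.argmax(scores)), scores

def transfer_grasp(T_align, grasp_memory):
    """Map a 4x4 grasp pose from the memory object's frame to the scene
    using a 4x4 alignment transform (assumed to come from the alignment step)."""
    return T_align @ grasp_memory

def pose_distance(g1, g2, rot_weight=0.1):
    """Translation distance plus weighted rotation angle between two 4x4 poses.

    rot_weight is a hypothetical trade-off between distance units and radians.
    """
    dt = np.linalg.norm(g1[:3, 3] - g2[:3, 3])
    R = g1[:3, :3].T @ g2[:3, :3]
    cos_angle = np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0)
    return float(dt + rot_weight * np.arccos(cos_angle))

def refine(transferred, stable_grasps):
    """Assumption: pick the stable grasp closest to the transferred pose."""
    dists = [pose_distance(transferred, g) for g in stable_grasps]
    return int(np.argmin(dists))

def translation(x, y, z):
    T = np.eye(4)
    T[:3, 3] = [x, y, z]
    return T
```

In this sketch, `retrieve` picks a memory instance, `transfer_grasp` moves that instance's stored grasp into the scene, and `refine` snaps it to the nearest candidate from a separately generated set of stable grasps. A task-compatibility term, which the abstract says is prioritized, would need to be added to `pose_distance` or applied as a filter.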
URL
https://arxiv.org/abs/2506.15607