Abstract
Dexterous manipulation with contact-rich interactions is crucial for advanced robotics. While recent diffusion-based planning approaches show promise for simpler manipulation tasks, they often produce physically unrealistic "ghost states" (e.g., the object moves on its own without hand contact) or fail to adapt when handling complex sequential interactions. In this work, we introduce DexDiffuser, an interaction-aware diffusion planning framework for adaptive dexterous manipulation. DexDiffuser models joint state-action dynamics through a dual-phase diffusion process consisting of pre-interaction contact alignment and post-contact goal-directed control, enabling goal-adaptive, generalizable dexterous manipulation. Additionally, we incorporate dynamics model-based dual guidance and leverage large language models to generate guidance functions automatically, which improves generalization to physical interactions and lets diverse goals be specified through language cues. Experiments on physical interaction tasks such as door opening, pen and block re-orientation, and hammer striking demonstrate DexDiffuser's effectiveness on goals outside the training distribution, achieving more than twice the average success rate of existing methods (59.2% vs. 29.5%). Our framework achieves 70.0% success on 30-degree door opening, 40.0% and 36.7% on pen and block half-side re-orientation respectively, and 46.7% on driving a nail halfway with the hammer, highlighting its robustness and flexibility in contact-rich manipulation.
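To make the dual-phase idea concrete, the following is a minimal toy sketch, not the DexDiffuser implementation. It assumes a one-dimensional hand and object, replaces the trained diffusion denoiser with a plain annealed, gradient-guided sampling loop, and uses hypothetical names throughout (`guided_rollout`, `contact_eps`, `step_size`). Before contact, guidance moves only the hand toward the object, so the object cannot drift on its own; after contact, guidance moves the hand and object together toward the goal.

```python
import numpy as np

def guided_rollout(goal, steps=200, step_size=0.1, contact_eps=0.05, seed=0):
    """Toy two-phase guided sampling for a 1-D hand and object (illustrative only)."""
    rng = np.random.default_rng(seed)
    hand, obj = rng.normal(size=2)      # start from a random "noisy" state
    contact_step = None                 # update index at which contact is first detected
    traj = [(hand, obj)]
    for k in range(steps):
        # Noise scale anneals linearly toward zero, loosely mimicking a reverse process.
        sigma = 0.05 * (steps - k) / steps
        noise = sigma * np.sqrt(step_size) * rng.normal()
        if contact_step is None and abs(hand - obj) < contact_eps:
            contact_step = k
        if contact_step is None:
            # Phase 1 (pre-contact alignment): gradient of 0.5 * (hand - obj)^2
            # with respect to the hand only; the object is left untouched.
            hand = hand - step_size * (hand - obj) + noise
        else:
            # Phase 2 (post-contact control): gradient of 0.5 * (obj - goal)^2,
            # applied as one shared displacement so hand and object move together.
            delta = -step_size * (obj - goal) + noise
            hand, obj = hand + delta, obj + delta
        traj.append((hand, obj))
    return np.array(traj), contact_step

if __name__ == "__main__":
    traj, contact_step = guided_rollout(goal=1.5)
    print("contact detected at update", contact_step)
    print("final hand, object:", traj[-1])
```

In this sketch the switch between phases is a hard threshold on the hand-object distance. That is an assumption made for brevity; the abstract does not specify how the two phases are separated or how the guidance functions are defined.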
URL
https://arxiv.org/abs/2411.18562