Abstract
Image-goal navigation (ImageNav) enables a robot to reach the location where a target image was captured, using visual cues for guidance. However, current methods either rely heavily on data-hungry, computationally expensive learning-based approaches or struggle in complex environments due to insufficient exploration strategies. To address these limitations, we propose BEINGS (Bayesian Embodied Image-goal Navigation using Gaussian Splatting), a novel method that formulates ImageNav as an optimal control problem within a model predictive control (MPC) framework. BEINGS leverages 3D Gaussian Splatting as a scene prior to predict future observations, enabling efficient, real-time navigation decisions grounded in the robot's sensory experiences. By integrating Bayesian updates, our method dynamically refines the robot's strategy without requiring extensive prior experience or data. Our algorithm is validated through extensive simulations and physical experiments, showcasing its potential for embodied robot systems in visually complex scenarios.
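The control loop the abstract describes — sampling candidate action sequences, predicting the resulting views from a 3D Gaussian Splatting scene prior, scoring them against the goal image, and reweighting candidates with a Bayesian update — can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: `render_prior`, `similarity`, and the additive transition model are hypothetical stand-ins for the actual 3DGS renderer, image-matching metric, and robot dynamics.

```python
import numpy as np

rng = np.random.default_rng(0)

def render_prior(pose):
    # Stand-in for rendering a predicted view from the 3D Gaussian
    # Splatting scene prior; here the "image" is a toy feature of the pose.
    return np.tanh(pose)

def similarity(img_a, img_b):
    # Hypothetical image-goal similarity: negative squared error.
    return -np.sum((img_a - img_b) ** 2)

def mpc_step(pose, goal_image, horizon=3, n_samples=64, prior_weights=None):
    """One receding-horizon MPC step: sample action sequences, roll them
    out against the scene prior, score terminal views against the goal
    image, and Bayesian-reweight the candidates."""
    dim = pose.shape[0]
    actions = rng.normal(scale=0.5, size=(n_samples, horizon, dim))
    scores = np.empty(n_samples)
    for i, seq in enumerate(actions):
        predicted_pose = pose + seq.sum(axis=0)   # toy transition model
        scores[i] = similarity(render_prior(predicted_pose), goal_image)
    # Bayesian update: posterior over candidates ∝ prior × likelihood,
    # with likelihood taken as exp(score) (shifted for numerical stability).
    if prior_weights is None:
        prior_weights = np.full(n_samples, 1.0 / n_samples)
    posterior = prior_weights * np.exp(scores - scores.max())
    posterior /= posterior.sum()
    best_seq = actions[np.argmax(posterior)]
    return best_seq[0], posterior  # execute only the first action

# Usage: drive a toy 2-D pose toward the pose that produced the goal image.
pose = np.zeros(2)
goal_image = render_prior(np.array([1.0, -0.5]))
action, posterior = mpc_step(pose, goal_image)
```

Executing only the first action of the best-scoring sequence and replanning at every step is what makes this a receding-horizon (MPC) scheme; the posterior can be carried forward as the next step's prior.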
URL
https://arxiv.org/abs/2409.10216