Abstract
Improving instance-specific image goal navigation (InstanceImageNav), in which a robot locates in a real-world environment the same object instance shown in a query image, is essential for robotic systems that help users find desired objects. The challenge lies in the domain gap between the low-quality images observed by the moving robot, which suffer from motion blur and low resolution, and the high-quality query images provided by the user. Such a domain gap can significantly reduce the task success rate, but it has not been the focus of previous work. To address this, we propose a novel method called Few-shot Cross-quality Instance-aware Adaptation (CrossIA), which employs contrastive learning with an instance classifier to align features between a large number of low-quality images and a few high-quality images. This approach effectively reduces the domain gap by bringing the latent representations of cross-quality images closer on an instance basis. Additionally, the system integrates object image collection with a pre-trained deblurring model to enhance the quality of the observed images. Our method fine-tunes the SimSiam model, pre-trained on ImageNet, using CrossIA. We evaluated the method on an InstanceImageNav task with 20 different types of instances, in which the robot identifies, in a real-world environment, the same instance as the one shown in a high-quality query image. Our experiments showed that our method improves the task success rate by up to three times compared to the baseline, a conventional approach based on SuperGlue. These findings highlight the potential of leveraging contrastive learning and image enhancement techniques to bridge the domain gap and improve object localization in robotic applications. The project website is this https URL.
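The abstract describes aligning low- and high-quality features per instance using contrastive learning together with an instance classifier. The following is a minimal NumPy sketch of how such an objective could be composed, not the authors' implementation. It assumes a SimSiam-style negative cosine similarity term between paired low- and high-quality embeddings (row i of each array belongs to the same instance) plus a cross-entropy term over instance labels. The function names, the equal weighting of the two directions, and the `ce_weight` parameter are hypothetical choices made for illustration.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row of x to (approximately) unit length."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def negative_cosine(p, z):
    """SimSiam-style term: negative mean cosine similarity between
    predictor outputs p and target embeddings z. In a real training
    loop, z would be detached (stop-gradient)."""
    return -np.mean(np.sum(l2_normalize(p) * l2_normalize(z), axis=1))


def cross_entropy(logits, labels):
    """Mean cross-entropy of instance logits against integer labels."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -np.mean(log_probs[np.arange(len(labels)), labels])


def cross_quality_loss(p_low, z_low, p_high, z_high,
                       logits_low, logits_high, labels, ce_weight=1.0):
    """Hypothetical combined objective (an assumption, not from the paper):
    a symmetric alignment term across image qualities plus an
    instance-classification term on both qualities."""
    align = 0.5 * negative_cosine(p_low, z_high) \
        + 0.5 * negative_cosine(p_high, z_low)
    classify = 0.5 * (cross_entropy(logits_low, labels)
                      + cross_entropy(logits_high, labels))
    return align + ce_weight * classify
```

In this sketch the alignment term approaches -1 as the low- and high-quality embeddings of each instance become parallel, while the classification term encourages embeddings of different instances to remain separable.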
Abstract (translated)
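Improving instance-specific image goal navigation (InstanceImageNav) is essential for robotic systems to assist users in finding desired objects in real-world environments. The challenge lies in the domain gap between the low-quality images observed by the moving robot and the high-quality images provided by the user. This domain gap can significantly reduce the task success rate, but it has not been the focus of previous work. To address this, we propose a new method called Few-shot Cross-quality Instance-aware Adaptation (CrossIA), which uses contrastive learning with an instance classifier to align the features of a large number of low-quality images and a few high-quality images. By bringing the latent representations of cross-quality images closer on an instance basis, this approach effectively reduces the domain gap. In addition, the system integrates a pre-trained deblurring model to improve the quality of the observed images. We fine-tune the SimSiam model using CrossIA. We evaluated the method on an InstanceImageNav task with 20 different types of instances, in which the robot identifies, in a real-world environment, the same instance as the one shown in a high-quality query image. Our experiments show that, compared with the conventional SuperGlue-based approach, our method improves the task success rate by up to three times. These findings highlight the potential of using contrastive learning and image enhancement techniques to bridge the domain gap and improve object localization in robotic applications. The project website is https://www.xxx.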
URL
https://arxiv.org/abs/2404.09645