Abstract
Searching for objects is a fundamental skill for robots. As such, we expect object search to eventually become an off-the-shelf capability for robots, similar to, e.g., object detection and SLAM. To date, however, no system for 3D object search exists that generalizes across real robots and environments. In this paper, building upon a recent theoretical framework that exploits the octree structure to represent belief in 3D, we present GenMOS (Generalized Multi-Object Search), the first general-purpose system for multi-object search (MOS) in a 3D region that is robot-independent and environment-agnostic. GenMOS takes as input point cloud observations of the local region, object detection results, and the localization of the robot's view pose, and outputs a 6D viewpoint to move to through online planning. In particular, GenMOS uses point cloud observations in three ways: (1) to simulate occlusion; (2) to inform occupancy and initialize the octree belief; and (3) to sample a belief-dependent graph of view positions that avoids obstacles. We evaluate our system both in simulation and on two real robot platforms. Our system enables, for example, a Boston Dynamics Spot robot to find a toy cat hidden underneath a couch in under one minute. We further integrate 3D local search with 2D global search to handle larger areas, demonstrating the resulting system in a 25m$^2$ lobby area.
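The octree belief mentioned above keeps a probability distribution over 3D object positions at multiple resolutions: belief mass can stay coarse over large empty regions and be refined only where evidence concentrates. The following is a minimal sketch of that idea, with hypothetical names (`OctNode`, `subdivide`) that are not part of the GenMOS API:

```python
# Minimal sketch of an octree-based belief over 3D object positions.
# Each node covers a cubic region; subdividing splits its belief mass
# uniformly over eight child octants, conserving total probability.
# Names here are illustrative, not the actual GenMOS implementation.

class OctNode:
    def __init__(self, center, size, prob):
        self.center = center      # (x, y, z) center of this cube
        self.size = size          # edge length of the cube
        self.prob = prob          # belief mass assigned to this cube
        self.children = None      # None until subdivided

    def subdivide(self):
        """Split this node's belief mass uniformly over eight octants."""
        half = self.size / 2.0
        quarter = self.size / 4.0
        cx, cy, cz = self.center
        self.children = [
            OctNode((cx + dx * quarter, cy + dy * quarter, cz + dz * quarter),
                    half, self.prob / 8.0)
            for dx in (-1, 1) for dy in (-1, 1) for dz in (-1, 1)
        ]

    def total_mass(self):
        """Sum of belief mass in the subtree (conserved by subdivision)."""
        if self.children is None:
            return self.prob
        return sum(c.total_mass() for c in self.children)

# Start with all belief in one 8m cube, then refine two levels deep
# in a single octant, e.g. where a detection suggests the object may be.
root = OctNode((0.0, 0.0, 0.0), 8.0, 1.0)
root.subdivide()
root.children[0].subdivide()
```

Point cloud observations would then update this structure, e.g. by zeroing out mass in cells observed to be free and renormalizing, which is one way occupancy can inform the belief as the abstract describes.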
Abstract (translated)
寻找物体是机器人的一项基本技能。因此,我们期望物体搜索最终能成为机器人的现成能力,类似于物体检测和SLAM。然而,目前尚不存在能够在真实机器人和环境之间通用的3D物体搜索系统。在本文中,基于一个利用octree(八叉树)结构在3D中表示信念的最新理论框架,我们提出了GenMOS(Generalized Multi-Object Search,通用多物体搜索),这是第一个在3D区域中进行多物体搜索(MOS)、且与机器人无关、与环境无关的通用系统。GenMOS以局部区域的点云观测、物体检测结果和机器人视角姿态的定位作为输入,并通过在线规划输出一个要移动到的6D视角。特别地,GenMOS通过以下三种方式使用点云观测:(1)模拟遮挡;(2)推断占据信息并初始化octree信念;(3)采样一个依赖于信念、且能避开障碍物的视点位置图。我们在仿真和两个真实机器人平台上评估了我们的系统。例如,我们的系统使波士顿动力的Spot机器人能在不到一分钟的时间内找到藏在沙发下面的玩具猫。我们进一步将3D局部搜索与2D全局搜索相结合以处理更大的区域,并在一个25m$^2$的大厅区域中展示了最终系统。
URL
https://arxiv.org/abs/2303.03178