Abstract
Room layout estimation is a long-standing robotic vision task that benefits both environment sensing and motion planning. However, layout estimation from point clouds (PCs) still suffers from data scarcity because annotation is difficult. We therefore address the semi-supervised setting of this task, building on the idea of model exponential moving averaging. Adapting this scheme to the state-of-the-art (SOTA) solution for PC-based layout estimation is not straightforward, however. To this end, we define a quad set matching strategy and several consistency losses based on metrics tailored for layout quads. In addition, we propose a new online pseudo-label harvesting algorithm that decomposes the distribution of a hybrid distance measure between quads and the PC into two components. This technique requires no manual threshold selection and intuitively encourages quads to align with reliable layout points. Surprisingly, this framework also works in the fully-supervised setting, achieving a new SOTA on the ScanNet benchmark. Finally, we extend the semi-supervised setting to the realistic omni-supervised setting, demonstrating significantly improved performance on a newly annotated ARKitScenes testing set. Our code, data, and models are released in this repository.
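The model exponential moving averaging mentioned above can be illustrated with a minimal sketch. It assumes the common teacher-student form, in which a teacher's parameters track a running average of a student's parameters. The function name `ema_update`, the `decay` value, and the dictionary-of-lists parameter format are hypothetical choices for illustration and are not taken from the paper's released code.

```python
from typing import Dict, List

# Hypothetical parameter container: name -> flat list of weights.
Params = Dict[str, List[float]]


def ema_update(teacher: Params, student: Params, decay: float = 0.99) -> None:
    """Update the teacher in place: teacher <- decay * teacher + (1 - decay) * student."""
    if not 0.0 <= decay <= 1.0:
        raise ValueError("decay must lie in [0, 1]")
    for name, student_vals in student.items():
        teacher_vals = teacher[name]
        for i, s in enumerate(student_vals):
            teacher_vals[i] = decay * teacher_vals[i] + (1.0 - decay) * s


# Usage: after each student optimization step, pull the teacher toward the student.
teacher = {"w": [0.0, 0.0]}
student = {"w": [1.0, 2.0]}
ema_update(teacher, student, decay=0.5)
```

A decay close to 1 makes the teacher change slowly, which is the usual reason such a teacher is used as a source of more stable predictions on unlabeled data.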
URL
https://arxiv.org/abs/2301.13865