Abstract
Traditionally, 3D indoor datasets have prioritized scale over ground-truth accuracy in order to improve generalization. However, using these datasets to evaluate dense geometry tasks, such as depth rendering, can be problematic: the meshes in such datasets are often incomplete and may yield inaccurate ground truth for fine details. In this paper, we propose SCRREAM, a dataset annotation framework that annotates fully dense meshes of the objects in a scene and registers camera poses on the real image sequence, producing accurate ground truth for both sparse and dense 3D tasks. We detail the dataset annotation pipeline and showcase four possible dataset variants that can be obtained from our framework, with example scenes for indoor reconstruction and SLAM, scene editing & object removal, human reconstruction, and 6D pose estimation. Recent pipelines for indoor reconstruction and SLAM serve as new benchmarks. In contrast to previous indoor datasets, our design allows dense geometry tasks to be evaluated on eleven sample scenes against accurately rendered ground-truth depth maps.
URL
https://arxiv.org/abs/2410.22715