Abstract
We introduce SharpNet, a method that predicts an accurate depth map for an input color image, with particular attention to the reconstruction of occluding contours. Occluding contours are an important cue for object recognition and for realistic integration of virtual objects in Augmented Reality, but they are also notoriously difficult to reconstruct accurately. For example, they are a challenge for stereo-based reconstruction methods, as points around an occluding contour are visible in only one image. Inspired by recent methods that introduce normal estimation to improve depth prediction, we introduce a novel term that constrains depth and occluding contour predictions. Since ground truth depth is difficult to obtain with pixel-perfect accuracy along occluding contours, we use synthetic images for training, followed by fine-tuning on real data. We demonstrate our approach on the challenging NYUv2-Depth dataset, and show that our method outperforms the state of the art along occluding contours, while performing on par with the best recent methods on the rest of the image. Its accuracy along occluding contours is actually better than the "ground truth" acquired by a depth camera based on structured light. We show this by introducing a new benchmark based on NYUv2-Depth for evaluating occluding contours in monocular reconstruction, which is our second contribution.
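The abstract does not spell out the consistency term that couples depth and occluding contour predictions. As a rough illustration of the general idea (a flat depth map should be penalized where a contour is predicted, and large depth jumps penalized where none is), one might write something like the following sketch. The function name, the exponential form, and the weighting are assumptions for illustration, not the paper's actual loss:

```python
import numpy as np

def depth_contour_consensus(depth, contour_prob, eps=1e-6):
    """Illustrative consensus penalty, NOT the exact term from the paper:
    encourage large depth gradients where occluding contours are
    predicted, and smooth depth where they are not."""
    # Finite-difference depth gradients along image rows and columns.
    gy, gx = np.gradient(depth)
    grad_mag = np.sqrt(gx ** 2 + gy ** 2 + eps)
    # High contour probability + small depth gradient -> penalized.
    edge_term = contour_prob * np.exp(-grad_mag)
    # Low contour probability + large depth gradient -> penalized.
    smooth_term = (1.0 - contour_prob) * grad_mag
    return float(np.mean(edge_term + smooth_term))

# A perfectly flat depth map with no predicted contours costs almost nothing,
# while the same flat depth map with contours everywhere is heavily penalized.
flat_depth = np.zeros((8, 8))
no_contours = np.zeros((8, 8))
all_contours = np.ones((8, 8))
low = depth_contour_consensus(flat_depth, no_contours)
high = depth_contour_consensus(flat_depth, all_contours)
```

In a real training setup this scalar would be one differentiable term in the total loss, combined with standard depth and contour supervision.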
URL
https://arxiv.org/abs/1905.08598