Abstract
We propose a novel representation for dense pixel-wise estimation tasks using CNNs that boosts accuracy and reduces training time, by explicitly exploiting joint coarse-and-fine reasoning. The coarse reasoning is performed over a discrete classification space to obtain a general rough solution, while the fine details of the solution are obtained over a continuous regression space. In our approach both components are jointly estimated, which proved to be beneficial for improving estimation accuracy. Additionally, we propose a new network architecture, which combines coarse and fine components by treating the fine estimation as a refinement built on top of the coarse solution, and therefore adding details to the general prediction. We apply our approach to the challenging problem of optical flow estimation and empirically validate it against state-of-the-art CNN-based solutions trained from scratch and tested on large optical flow datasets.
Abstract (translated)
我们提出了一种使用CNN的密集像素估计任务的新颖表示,通过明确地利用联合粗 - 细推理来提高准确性并缩短训练时间。在离散分类空间上执行粗略推理以获得一般粗略解,而在连续回归空间上获得解的精细细节。在我们的方法中,两个组件是联合估计的,这被证明有利于提高估计准确性。此外,我们提出了一种新的网络架构,它将粗略和精细组件结合起来,将精细估计视为在粗解决方案之上构建的细化,从而为一般预测添加细节。我们将我们的方法应用于光流估计的挑战性问题,并根据从头开始训练并在大型光流数据集上测试的最先进的基于CNN的解决方案进行经验验证。
URL
https://arxiv.org/abs/1808.07416