Abstract
Most classical denoising methods restore clean results by selecting and averaging pixels in the noisy input. Instead of relying on hand-crafted selection and averaging strategies, we propose to explicitly learn this process with deep neural networks. Specifically, we propose deformable 2D kernels for image denoising, where the sampling locations and kernel weights are both learned. The proposed kernel naturally adapts to image structures and can effectively reduce oversmoothing artifacts. Furthermore, we develop 3D deformable kernels for video denoising to sample pixels across the spatio-temporal space more efficiently. Our method is able to handle the misalignment caused by large motion in dynamic scenes. To better train our video denoising model, we introduce a trilinear sampler and a new regularization term. We demonstrate that the proposed method performs favorably against state-of-the-art image and video denoising approaches on both synthetic and real-world data.
URL
https://arxiv.org/abs/1904.06903