Abstract
Many video enhancement algorithms rely on optical flow to register frames in a video sequence. Precise flow estimation is, however, intractable, and optical flow itself is often a sub-optimal representation for particular video processing tasks. In this paper, we propose task-oriented flow (TOFlow), a motion representation learned in a self-supervised, task-specific manner. We design a neural network with a trainable motion estimation component and a video processing component, and train them jointly to learn the task-oriented flow. For evaluation, we build Vimeo-90K, a large-scale, high-quality video dataset for low-level video processing. TOFlow outperforms traditional optical flow on standard benchmarks as well as our Vimeo-90K dataset in three video processing tasks: frame interpolation, video denoising/deblocking, and video super-resolution.
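The pipeline the abstract outlines — estimate motion between frames, warp neighboring frames into alignment, then process the aligned frames — can be sketched as below. This is an illustrative NumPy toy, not the paper's network: it uses nearest-neighbor warping and a hand-supplied flow field, whereas TOFlow's warp is differentiable and the flow estimator is itself a trained network. The function names are hypothetical.

```python
import numpy as np

def warp(frame, flow):
    """Backward-warp a frame: output(p) = frame(p + flow(p)).

    frame: (H, W) grayscale image; flow: (H, W, 2) displacements (dy, dx).
    Nearest-neighbor sampling for simplicity; a trainable pipeline would
    use a differentiable bilinear warp instead.
    """
    h, w = frame.shape
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    src_y = np.clip(np.round(ys + flow[..., 0]).astype(int), 0, h - 1)
    src_x = np.clip(np.round(xs + flow[..., 1]).astype(int), 0, w - 1)
    return frame[src_y, src_x]

def interpolate_midframe(f0, f1, flow01):
    """Toy frame interpolation: warp both neighbors to the midpoint and average.

    flow01 maps positions in f0 to positions in f1; the mid-frame pixel at p
    samples f0 at p - flow01/2 and f1 at p + flow01/2. In TOFlow the whole
    pipeline is differentiable, so the flow estimator is trained end-to-end
    against the interpolation loss rather than against ground-truth flow.
    """
    half = 0.5 * flow01
    w0 = warp(f0, -half)  # pull f0 forward half a step
    w1 = warp(f1, half)   # pull f1 backward half a step
    return 0.5 * (w0 + w1)
```

For example, a single bright pixel that moves two columns between f0 and f1 lands halfway in between in the interpolated frame; where the supplied flow is wrong, the two warped copies disagree and the average ghosts — the artifact that motivates learning a task-oriented flow instead of a fixed one.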
URL
https://arxiv.org/abs/1711.09078