No Attention is Needed: Grouped Spatial-temporal Shift for Simple and Efficient Video Restorers

2022-06-22 02:16:47

Dasong Li, Xiaoyu Shi, Yi Zhang, Xiaogang Wang, Hongwei Qin, Hongsheng Li

arXiv_CV Attention Pose Denoising Optical_Flow Restoration

Abstract

Video restoration, which aims to recover clear frames from degraded videos, has been attracting increasing attention. It requires establishing temporal correspondences across multiple misaligned frames. To that end, existing deep methods generally adopt complicated network architectures, such as optical flow, deformable convolution, or cross-frame or cross-pixel self-attention layers, which incur a high computational cost. We argue that, with proper design, temporal information can be exploited much more efficiently and effectively in video restoration. In this study, we propose a simple, fast, yet effective framework for video restoration. The key to our framework is the grouped spatial-temporal shift, which is simple and lightweight but can implicitly establish inter-frame correspondences and achieve multi-frame aggregation. Coupled with basic 2D U-Nets for frame-wise encoding and decoding, this efficient spatial-temporal shift module can effectively tackle the challenges in video restoration. Extensive experiments show that our framework surpasses the previous state-of-the-art method with 43% of its computational cost on both video deblurring and video denoising.
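
To make the idea of a grouped spatial-temporal shift more concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a feature tensor of shape (T, C, H, W), splits the channels into groups, and moves each group by its own temporal and spatial offset with zero padding, so that each output position sees features from neighboring frames and neighboring pixels. The function names `shift_zero` and `grouped_spatial_temporal_shift`, the offset format `(dt, dy, dx)`, and the zero-padding choice are all assumptions made for illustration; the abstract does not specify these details.

```python
import numpy as np


def shift_zero(a, shift, axis):
    """Shift `a` along `axis` by `shift` positions, zero-filling vacated entries."""
    if shift == 0:
        return a.copy()
    out = np.zeros_like(a)
    src = [slice(None)] * a.ndim
    dst = [slice(None)] * a.ndim
    if shift > 0:
        src[axis] = slice(None, -shift)
        dst[axis] = slice(shift, None)
    else:
        src[axis] = slice(-shift, None)
        dst[axis] = slice(None, shift)
    out[tuple(dst)] = a[tuple(src)]
    return out


def grouped_spatial_temporal_shift(x, offsets):
    """Apply one (dt, dy, dx) offset per channel group (hypothetical layout).

    x:       array of shape (T, C, H, W), frame-wise features.
    offsets: list of (dt, dy, dx) tuples; channels are split into
             len(offsets) groups and group i is shifted by offsets[i].
    """
    num_channels = x.shape[1]
    groups = np.array_split(np.arange(num_channels), len(offsets))
    out = np.empty_like(x)
    for idx, (dt, dy, dx) in zip(groups, offsets):
        g = x[:, idx]                  # channels of this group, all frames
        g = shift_zero(g, dt, axis=0)  # temporal shift: borrow from other frames
        g = shift_zero(g, dy, axis=2)  # vertical spatial shift
        g = shift_zero(g, dx, axis=3)  # horizontal spatial shift
        out[:, idx] = g
    return out
```

In this sketch the shift itself has no learnable parameters; any aggregation of the shifted features would be left to ordinary convolution layers applied afterwards, which is one plausible reading of why such a module can be lightweight.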

URL

https://arxiv.org/abs/2206.10810

PDF

https://arxiv.org/pdf/2206.10810.pdf