Paper Reading AI Learner

Multi-Scale Implicit Transformer with Re-parameterize for Arbitrary-Scale Super-Resolution

2024-03-11 09:23:20
Jinchen Zhu, Mingjian Zhang, Ling Zheng, Shizhuang Weng

Abstract

Recently, methods based on implicit neural representations have shown excellent capabilities for arbitrary-scale super-resolution (ASSR). Although these methods represent image features by generating latent codes, the latent codes are difficult to adapt to different super-resolution magnification factors, which seriously limits their performance. To address this, we design the Multi-Scale Implicit Transformer (MSIT), consisting of a Multi-Scale Neural Operator (MSNO) and Multi-Scale Self-Attention (MSSA). MSNO obtains multi-scale latent codes through feature enhancement, multi-scale characteristics extraction, and multi-scale characteristics merging; MSSA then further enhances the multi-scale characteristics of the latent codes, resulting in better performance. Furthermore, to improve network performance, we propose the Re-Interaction Module (RIM), combined with a cumulative training strategy, to increase the diversity of information learned by the network. We systematically introduce multi-scale characteristics into ASSR for the first time. Extensive experiments validate the effectiveness of MSIT, and our method achieves state-of-the-art performance on arbitrary-scale super-resolution tasks.
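The abstract describes an architecture in which MSNO extracts latent codes at several scales (feature enhancement, multi-scale extraction, multi-scale merging) and MSSA then refines those codes before they are queried at arbitrary target resolutions. As a rough, non-authoritative sketch of that idea only, the PyTorch snippet below shows one plausible way multi-scale latent codes could be produced and queried at continuous coordinates; every module name, layer choice, and hyper-parameter here is an assumption for illustration and is not taken from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Illustrative sketch only: the actual MSNO/MSSA design is defined in the paper.
# All layer choices, channel counts, and scale factors below are assumptions.
class MultiScaleLatentEncoder(nn.Module):
    """Extract features at several scales and merge them into latent codes."""

    def __init__(self, in_ch=3, dim=64, scales=(1, 2, 4)):
        super().__init__()
        self.scales = scales
        self.enhance = nn.Conv2d(in_ch, dim, 3, padding=1)        # feature enhancement
        self.branches = nn.ModuleList(
            [nn.Conv2d(dim, dim, 3, padding=1) for _ in scales]   # per-scale extraction
        )
        self.merge = nn.Conv2d(dim * len(scales), dim, 1)         # multi-scale merging

    def forward(self, lr_image):
        feat = self.enhance(lr_image)
        h, w = feat.shape[-2:]
        multi = []
        for s, branch in zip(self.scales, self.branches):
            # Downsample, extract per-scale features, then upsample back to a common size.
            x = F.interpolate(feat, scale_factor=1 / s, mode="bilinear",
                              align_corners=False) if s > 1 else feat
            x = branch(x)
            if s > 1:
                x = F.interpolate(x, size=(h, w), mode="bilinear", align_corners=False)
            multi.append(x)
        return self.merge(torch.cat(multi, dim=1))  # merged multi-scale latent codes


def query_latent_codes(latent, coords):
    """Sample latent codes at continuous target coordinates in [-1, 1],
    the generic grid-sample trick used by many implicit-SR methods."""
    grid = coords.unsqueeze(1)                                    # (B, 1, N, 2)
    sampled = F.grid_sample(latent, grid, mode="bilinear", align_corners=False)
    return sampled.squeeze(2).permute(0, 2, 1)                    # (B, N, C)


if __name__ == "__main__":
    enc = MultiScaleLatentEncoder()
    lr = torch.randn(1, 3, 48, 48)
    latent = enc(lr)                                  # (1, 64, 48, 48)
    coords = torch.rand(1, 1024, 2) * 2 - 1           # arbitrary-scale query points
    print(query_latent_codes(latent, coords).shape)   # torch.Size([1, 1024, 64])
```

Here a plain grid-sample query stands in for what the paper implements with MSSA and RIM; the sketch is only meant to show where multi-scale latent codes would sit in an arbitrary-scale super-resolution pipeline.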

URL

https://arxiv.org/abs/2403.06536

PDF

https://arxiv.org/pdf/2403.06536.pdf
