Abstract
Recently, methods based on implicit neural representations have shown excellent capabilities for arbitrary-scale super-resolution (ASSR). Although these methods represent image features with latent codes, those latent codes are hard to adapt to different super-resolution magnification factors, which severely limits performance. To address this, we design the Multi-Scale Implicit Transformer (MSIT), which consists of a Multi-Scale Neural Operator (MSNO) and Multi-Scale Self-Attention (MSSA). MSNO obtains multi-scale latent codes through feature enhancement, multi-scale characteristic extraction, and multi-scale characteristic merging; MSSA further enhances the multi-scale characteristics of the latent codes, yielding better performance. In addition, we propose the Re-Interaction Module (RIM), combined with a cumulative training strategy, to increase the diversity of the information the network learns. To our knowledge, this is the first systematic introduction of multi-scale characteristics into ASSR. Extensive experiments validate the effectiveness of MSIT, and our method achieves state-of-the-art performance on arbitrary-scale super-resolution tasks.
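The abstract describes building multi-scale latent codes from image features and refining them with self-attention. The sketch below is a minimal, generic illustration of that idea, not the paper's actual MSNO/MSSA: multi-scale characteristics are approximated by average-pooling a feature map at several hypothetical window sizes, merging the branches by channel concatenation, and applying plain scaled dot-product self-attention (with identity projections, for brevity) over spatial positions.

```python
import numpy as np

def multi_scale_codes(feat, scales=(1, 2, 4)):
    """Illustrative multi-scale latent codes: pool the feature map at
    several window sizes, upsample back, and merge along channels.
    (Assumes height/width are divisible by every scale.)"""
    h, w, c = feat.shape
    branches = []
    for s in scales:
        # average-pool with an s x s window
        pooled = feat.reshape(h // s, s, w // s, s, c).mean(axis=(1, 3))
        # nearest-neighbour upsample back to (h, w)
        up = pooled.repeat(s, axis=0).repeat(s, axis=1)
        branches.append(up)
    # merge the scale branches by concatenation along the channel axis
    return np.concatenate(branches, axis=-1)

def self_attention(x):
    """Scaled dot-product self-attention over flattened positions,
    with identity query/key/value projections for simplicity."""
    n, d = x.shape
    scores = x @ x.T / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ x

rng = np.random.default_rng(0)
feat = rng.random((8, 8, 4))                 # toy feature map (H, W, C)
codes = multi_scale_codes(feat)              # three branches -> (8, 8, 12)
out = self_attention(codes.reshape(64, -1))  # attend over 64 positions
```

The three pooling windows play the role of the different magnification factors: coarser branches summarize larger neighborhoods, so the merged code carries context at several scales before attention mixes it across positions.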
URL
https://arxiv.org/abs/2403.06536