High-Similarity-Pass Attention for Single Image Super-Resolution

2023-05-25 06:24:14
Jian-Nan Su, Min Gan, Guang-Yong Chen, Wenzhong Guo, C. L. Philip Chen

Abstract

Recent developments in non-local attention (NLA) have renewed interest in self-similarity-based single image super-resolution (SISR). Researchers usually use NLA to exploit non-local self-similarity (NSS) in SISR and achieve satisfactory reconstruction results. However, a surprising phenomenon stimulated our interest in revisiting NLA: the reconstruction performance of the standard NLA is similar to that of NLA with randomly selected regions. In this paper, we first analyze the attention map of the standard NLA from different perspectives and find that the resulting probability distribution always has full support for every local feature. This implies a statistical waste, because values are assigned to irrelevant non-local features, especially in SISR, which needs to model long-range dependence over a large number of redundant non-local features. Based on these findings, we introduce a concise yet effective soft thresholding operation to obtain high-similarity-pass attention (HSPA), which yields a more compact and interpretable distribution. Furthermore, we derive some key properties of the soft thresholding operation that enable training HSPA in an end-to-end manner. HSPA can be integrated into existing deep SISR models as an efficient general building block. In addition, to demonstrate its effectiveness, we construct a deep high-similarity-pass attention network (HSPAN) by integrating a few HSPAs into a simple backbone. Extensive experimental results demonstrate that HSPAN outperforms state-of-the-art approaches in both quantitative and qualitative evaluations.
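
The contrast the abstract draws between full-support softmax attention and a soft-thresholded distribution can be illustrated with a small sketch. The abstract does not give the exact form of the soft thresholding operation, so the code below assumes a sparsemax-style threshold (subtract a threshold tau chosen so the clipped scores sum to 1); the function names and the example scores are hypothetical and are not taken from the paper.

```python
import math

def softmax(scores):
    """Standard softmax: every score receives a strictly positive weight."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def soft_threshold_attention(scores):
    """Sparsemax-style soft thresholding (an assumed stand-in operator).

    Finds tau such that sum(max(s - tau, 0)) == 1 and returns
    max(s - tau, 0) for each score, so low-similarity entries become 0.
    """
    ordered = sorted(scores, reverse=True)
    cumsum = 0.0
    tau = 0.0
    for k, zk in enumerate(ordered, start=1):
        cumsum += zk
        if 1.0 + k * zk > cumsum:
            tau = (cumsum - 1.0) / k  # threshold for a support of size k
        else:
            break  # remaining scores fall below the threshold
    return [max(s - tau, 0.0) for s in scores]

# Hypothetical similarity scores between one query and four non-local features.
scores = [2.0, 1.5, 0.1, -1.0]
dense = softmax(scores)                    # all four weights are positive
sparse = soft_threshold_attention(scores)  # low-similarity weights are exactly 0
print(dense)
print(sparse)
```

In this sketch only the two most similar features keep non-zero weight after thresholding, while softmax spreads some probability mass over all four. This is the kind of compactness the abstract attributes to HSPA, under the stated assumption about the operator.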

URL

https://arxiv.org/abs/2305.15768

PDF

https://arxiv.org/pdf/2305.15768.pdf

