Paper Reading AI Learner

Parameter-Free Channel Attention for Image Classification and Super-Resolution

2023-03-20 12:08:58
Yuxuan Shi, Lingxiao Yang, Wangpeng An, Xiantong Zhen, Liuqing Wang

Abstract

The channel attention mechanism is a useful technique widely employed in deep convolutional neural networks to boost the performance for image processing tasks, eg, image classification and image super-resolution. It is usually designed as a parameterized sub-network and embedded into the convolutional layers of the network to learn more powerful feature representations. However, current channel attention induces more parameters and therefore leads to higher computational costs. To deal with this issue, in this work, we propose a Parameter-Free Channel Attention (PFCA) module to boost the performance of popular image classification and image super-resolution networks, but completely sweep out the parameter growth of channel attention. Experiments on CIFAR-100, ImageNet, and DIV2K validate that our PFCA module improves the performance of ResNet on image classification and improves the performance of MSRResNet on image super-resolution tasks, respectively, while bringing little growth of parameters and FLOPs.

Abstract (translated)

通道注意力机制是深度学习卷积神经网络中广泛使用的一种有用技术,用于提高图像处理任务的性能,例如图像分类和图像超分辨率。通常被设计为一个参数化的子网络,并嵌入到网络的卷积层中,以学习更强大的特征表示。然而,当前的通道注意力机制导致更多的参数,因此导致更高的计算成本。为了解决这一问题,在本文中,我们提出了一个参数免费的通道注意力(PFCA)模块,以Boost popular图像分类和图像超分辨率网络的性能,但完全消除通道注意力参数的增长。在CIFAR-100、ImageNet和DIV2K等实验中,证明了我们的PFCA模块分别提高了ResNet在图像分类任务中的表现,并提高了MSRResNet在图像超分辨率任务中的表现,而参数和FLOPs几乎没有增长。

URL

https://arxiv.org/abs/2303.11055

PDF

https://arxiv.org/pdf/2303.11055.pdf


Tags
3D Action Action_Localization Action_Recognition Activity Adversarial Agent Attention Autonomous Bert Boundary_Detection Caption Chat Classification CNN Compressive_Sensing Contour Contrastive_Learning Deep_Learning Denoising Detection Dialog Diffusion Drone Dynamic_Memory_Network Edge_Detection Embedding Embodied Emotion Enhancement Face Face_Detection Face_Recognition Facial_Landmark Few-Shot Gait_Recognition GAN Gaze_Estimation Gesture Gradient_Descent Handwriting Human_Parsing Image_Caption Image_Classification Image_Compression Image_Enhancement Image_Generation Image_Matting Image_Retrieval Inference Inpainting Intelligent_Chip Knowledge Knowledge_Graph Language_Model LLM Matching Medical Memory_Networks Multi_Modal Multi_Task NAS NMT Object_Detection Object_Tracking OCR Ontology Optical_Character Optical_Flow Optimization Person_Re-identification Point_Cloud Portrait_Generation Pose Pose_Estimation Prediction QA Quantitative Quantitative_Finance Quantization Re-identification Recognition Recommendation Reconstruction Regularization Reinforcement_Learning Relation Relation_Extraction Represenation Represenation_Learning Restoration Review RNN Robot Salient Scene_Classification Scene_Generation Scene_Parsing Scene_Text Segmentation Self-Supervised Semantic_Instance_Segmentation Semantic_Segmentation Semi_Global Semi_Supervised Sence_graph Sentiment Sentiment_Classification Sketch SLAM Sparse Speech Speech_Recognition Style_Transfer Summarization Super_Resolution Surveillance Survey Text_Classification Text_Generation Tracking Transfer_Learning Transformer Unsupervised Video_Caption Video_Classification Video_Indexing Video_Prediction Video_Retrieval Visual_Relation VQA Weakly_Supervised Zero-Shot