Optimization-Inspired Cross-Attention Transformer for Compressive Sensing

2023-04-27 07:21:30
Jiechong Song, Chong Mou, Shiqi Wang, Siwei Ma, Jian Zhang

Abstract

By integrating optimization solvers with deep neural networks, deep unfolding networks (DUNs) offer good interpretability and high performance, and have attracted growing attention in compressive sensing (CS). However, existing DUNs often improve visual quality at the price of a large number of parameters, and they suffer from feature information loss during iteration. In this paper, we propose an Optimization-inspired Cross-attention Transformer (OCT) module as the iterative process, leading to a lightweight OCT-based Unfolding Framework (OCTUF) for image CS. Specifically, we design a novel Dual Cross Attention (Dual-CA) sub-module, which consists of an Inertia-Supplied Cross Attention (ISCA) block and a Projection-Guided Cross Attention (PGCA) block. The ISCA block introduces multi-channel inertia forces and strengthens the memory effect through a cross attention mechanism between adjacent iterations. The PGCA block achieves enhanced information interaction by introducing the inertia force into the gradient descent step through a cross attention block. Extensive CS experiments show that OCTUF achieves superior performance compared to state-of-the-art methods with lower training complexity. Code is available at this https URL.
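To make the two ingredients named in the abstract more concrete, the following is a minimal NumPy sketch under stated assumptions, not the authors' OCT module. It shows (a) a classical heavy-ball gradient step for the CS data-fidelity term 0.5 * ||Phi x - y||^2, where the inertia term is the difference between the two previous iterates, and (b) plain scaled dot-product cross attention with no learned projections. The function names, the step size `rho`, the inertia weight `alpha`, and the random Gaussian sampling matrix are all illustrative choices; the paper's ISCA and PGCA blocks are learned and operate on multi-channel features.

```python
import numpy as np


def inertial_gradient_step(x_prev, x_prev2, Phi, y, rho, alpha):
    """One heavy-ball step for min_x 0.5 * ||Phi x - y||^2 (illustrative only)."""
    grad = Phi.T @ (Phi @ x_prev - y)   # gradient of the data-fidelity term
    inertia = x_prev - x_prev2          # memory of the previous update direction
    return x_prev - rho * grad + alpha * inertia


def cross_attention(query_feats, context_feats):
    """Scaled dot-product cross attention without learned projections.

    query_feats:   (n_q, d) features that ask for information
    context_feats: (n_k, d) features that supply keys and values
    Returns the attended features (n_q, d) and the weights (n_q, n_k).
    """
    d = query_feats.shape[-1]
    scores = query_feats @ context_feats.T / np.sqrt(d)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ context_feats, weights


def demo(seed=0, n=64, m=32, steps=200, alpha=0.3):
    """Run the inertial iteration on a hypothetical random CS problem."""
    rng = np.random.default_rng(seed)
    Phi = rng.standard_normal((m, n)) / np.sqrt(m)  # assumed sampling matrix
    y = Phi @ rng.standard_normal(n)                # noiseless measurements
    rho = 1.0 / np.linalg.norm(Phi, 2) ** 2         # step size 1 / L
    x_prev2 = np.zeros(n)
    x_prev = np.zeros(n)
    residuals = []
    for _ in range(steps):
        x_next = inertial_gradient_step(x_prev, x_prev2, Phi, y, rho, alpha)
        x_prev2, x_prev = x_prev, x_next
        residuals.append(float(np.linalg.norm(Phi @ x_prev - y)))
    return residuals
```

Calling `demo()` returns the measurement residual after each iteration, which should shrink as the iterates fit the measurements. In an unfolding network, the fixed `rho` and `alpha` would typically be replaced by learned components, and `cross_attention` would take features from adjacent iterations as its query and context.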

URL

https://arxiv.org/abs/2304.13986

PDF

https://arxiv.org/pdf/2304.13986.pdf

