Paper Reading AI Learner

MTS-CSNet: Multiscale Tensor Factorization for Deep Compressive Sensing on RGB Images

2026-02-04 20:38:04
Mehmet Yamac, Lei Xu, Serkan Kiranyaz, Moncef Gabbouj

Abstract

Deep-learning-based compressive sensing (CS) methods typically learn sampling operators using convolutional or block-wise fully connected layers, which limit receptive fields and scale poorly for high-dimensional data. We propose MTS-CSNet, a CS framework based on Multiscale Tensor Summation (MTS) factorization, a structured operator for efficient multidimensional signal processing. MTS performs mode-wise linear transformations with multiscale summation, enabling large receptive fields and effective modeling of cross-dimensional correlations. In MTS-CSNet, MTS is first used as a learnable CS operator that performs linear dimensionality reduction in tensor space, with its adjoint defining the initial back-projection, and is then applied in the reconstruction stage to directly refine this estimate. This results in a simple feed-forward architecture without iterative or proximal optimization that remains parameter- and computation-efficient. Experiments on standard CS benchmarks show that MTS-CSNet achieves state-of-the-art reconstruction performance on RGB images, with notable PSNR gains and faster inference, even compared to recent diffusion-based CS methods, while using a significantly more compact feed-forward architecture.
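
The abstract does not give the exact form of the MTS operator, so the following is only a rough NumPy sketch of the two ideas it names: a sensing operator built as a sum of mode-wise linear maps, and an initial back-projection given by the adjoint of that operator. The function names (`mode_product`, `sense`, `back_project`), the tensor shapes, the number of summed terms, and the random (untrained) factor matrices are all assumptions made for illustration. The multiscale patching implied by "multiscale summation" is left out.

```python
import numpy as np

def mode_product(x, a, mode):
    """Apply matrix `a` (m x n) along axis `mode` of tensor `x` (size n on that axis)."""
    # tensordot puts the new axis first, so move it back to position `mode`.
    return np.moveaxis(np.tensordot(a, x, axes=(1, mode)), 0, mode)

def sense(x, factors):
    """Hypothetical sensing operator: sum over terms of mode-wise linear maps."""
    y = 0
    for term in factors:              # one list of per-mode matrices per summed term
        z = x
        for mode, a in enumerate(term):
            z = mode_product(z, a, mode)
        y = y + z
    return y

def back_project(y, factors):
    """Adjoint of `sense`: the same structure with every factor transposed."""
    x = 0
    for term in factors:
        z = y
        for mode, a in enumerate(term):
            z = mode_product(z, a.T, mode)
        x = x + z
    return x

# Assumed sizes: a 32x32 RGB patch compressed to 16x16x3 (measurement ratio 0.25).
rng = np.random.default_rng(0)
in_shape, out_shape, num_terms = (32, 32, 3), (16, 16, 3), 4
factors = [
    [rng.standard_normal((m, n)) / np.sqrt(n) for m, n in zip(out_shape, in_shape)]
    for _ in range(num_terms)
]

x = rng.standard_normal(in_shape)     # stand-in for an image patch
y = sense(x, factors)                 # compressed measurements
x0 = back_project(y, factors)         # initial estimate before any refinement

# Adjoint check: <sense(x), v> should equal <x, back_project(v)> for any v.
v = rng.standard_normal(out_shape)
lhs = np.sum(sense(x, factors) * v)
rhs = np.sum(x * back_project(v, factors))
```

In a learned setting the factor matrices would be trainable parameters rather than random draws. The sketch stores only small per-mode matrices instead of one dense matrix over the flattened patch, which is one plausible reading of the parameter efficiency the abstract claims.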


URL

https://arxiv.org/abs/2602.07056

PDF

https://arxiv.org/pdf/2602.07056.pdf

