In this work we study the behavior of the forward-backward (FB) algorithm when the proximity operator is replaced by a sub-iterative procedure that approximates a Gaussian denoiser, in a Plug-and-Play (PnP) fashion. In particular, we consider both analysis and synthesis Gaussian denoisers within a dictionary framework, obtained by unrolling dual-FB iterations or FB iterations, respectively. We analyze the associated minimization problems as well as the asymptotic behavior of the resulting FB-PnP iterations. In particular, we show that the synthesis Gaussian denoising problem can be viewed as a proximity operator. For each case, analysis and synthesis, we show that the FB-PnP algorithm solves the same problem whether we use only one or infinitely many sub-iterations to solve the denoising problem at each iteration. To this end, we show that each "one sub-iteration" strategy within the FB-PnP can be interpreted as a primal-dual algorithm when a warm-restart strategy is used. We further present similar results when a Moreau-Yosida smoothing of the global problem is used, for an arbitrary number of sub-iterations. Finally, we provide numerical simulations to illustrate our theoretical results, considering a toy compressive sensing example as well as an image restoration problem in a deep dictionary framework.
https://arxiv.org/abs/2411.13276
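The FB iterations discussed above alternate a gradient (forward) step on the smooth data-fidelity term with a proximity (backward) step on the non-smooth term. As a point of reference only, and not the paper's FB-PnP scheme, the following NumPy sketch runs classical FB iterations on the synthesis sparse problem min_x 0.5*||Ax - y||^2 + lam*||x||_1, whose proximity operator is soft-thresholding; the problem sizes, seed, and lam are illustrative assumptions:

```python
import numpy as np

def soft_threshold(v, t):
    # Proximity operator of t * ||.||_1 (element-wise soft-thresholding).
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def forward_backward(A, y, lam, n_iter=500):
    # Minimize 0.5 * ||A x - y||^2 + lam * ||x||_1 by alternating a gradient
    # (forward) step on the smooth term with a prox (backward) step.
    step = 1.0 / np.linalg.norm(A, 2) ** 2   # 1/L, L = Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        x = soft_threshold(x - step * A.T @ (A @ x - y), step * lam)
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((30, 60)) / np.sqrt(30)   # toy sensing matrix
x_true = np.zeros(60)
x_true[[3, 17, 41]] = [1.5, -2.0, 1.0]            # 3-sparse ground truth
y = A @ x_true
x_hat = forward_backward(A, y, lam=0.01)
```

With a small penalty and noiseless measurements, the iterates settle on the correct support; replacing `soft_threshold` by a learned denoiser is the PnP substitution the abstract studies.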
Compressive sensing (CS), acquiring and reconstructing signals below the Nyquist rate, has great potential in image and video acquisition to exploit data redundancy and greatly reduce the amount of sampled data. To further reduce the sampled data while preserving video quality, this paper explores the temporal redundancy in video CS and proposes a block-based adaptive compressive sensing framework with a sampling rate (SR) control strategy. To avoid redundant compression of non-moving regions, we first incorporate moving-block detection between consecutive frames and only transmit the measurements of moving blocks; the non-moving regions are reconstructed from the previous frame. In addition, we propose a block storage system and a dynamic threshold to achieve adaptive SR allocation to each frame based on the area of the moving regions, keeping the average SR within the target SR. Finally, to reduce blocking artifacts and improve reconstruction quality, we adopt a cooperative reconstruction of the moving and non-moving blocks that refers to the measurements of the non-moving blocks from the previous frame. Extensive experiments demonstrate that this work is able to control the SR and obtains better performance than existing works.
https://arxiv.org/abs/2411.10200
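The moving-block detection step can be sketched as follows: tile the frame into blocks and flag a block as moving when its mean absolute difference from the previous frame exceeds a threshold. The block size and threshold below are invented for illustration and are not the paper's settings:

```python
import numpy as np

def moving_blocks(prev, curr, block=8, thresh=5.0):
    # Flag a block as "moving" when its mean absolute difference between the
    # two frames exceeds a threshold; only those blocks would be re-sampled,
    # the rest are copied over from the previous reconstruction.
    h, w = curr.shape
    mask = np.zeros((h // block, w // block), dtype=bool)
    for i in range(h // block):
        for j in range(w // block):
            sl = np.s_[i * block:(i + 1) * block, j * block:(j + 1) * block]
            mask[i, j] = np.abs(curr[sl] - prev[sl]).mean() > thresh
    return mask

prev = np.zeros((16, 16))
curr = prev.copy()
curr[0:8, 8:16] = 100.0            # only the top-right block changed
mask = moving_blocks(prev, curr)
```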
Deep Neural Networks (DNNs) are well-known to act as over-parameterized deep image priors (DIP) that regularize various image inverse problems. Meanwhile, researchers have also proposed extremely compact, under-parameterized image priors (e.g., deep decoder) that are strikingly competent for image restoration too, despite a loss of accuracy. These two extremes push us to ask whether there exists a better solution in the middle: between over- and under-parameterized image priors, can one identify "intermediate" parameterized image priors that achieve better trade-offs between performance, efficiency, and even preserving strong transferability? Drawing inspiration from the lottery ticket hypothesis (LTH), we conjecture and study a novel "lottery image prior" (LIP) that exploits DNNs' inherent sparsity, stated as follows: given an over-parameterized DNN-based image prior, it will contain a sparse subnetwork that can be trained in isolation to match the original DNN's performance when applied as a prior to various image inverse problems. Our results validate the superiority of LIPs: we can successfully locate the LIP subnetworks from over-parameterized DIPs at substantial sparsity ranges. Those LIP subnetworks significantly outperform deep decoders under comparably compact model sizes (often fully preserving the effectiveness of their over-parameterized counterparts), and they also possess high transferability across different images as well as restoration task types. Besides, we also extend LIP to compressive sensing image reconstruction, where a pre-trained GAN generator is used as the prior (in contrast to the untrained DIP or deep decoder), and confirm its validity in this setting too. To the best of our knowledge, this is the first time that LTH has been demonstrated to be relevant in the context of inverse problems or image priors.
https://arxiv.org/abs/2410.24187
The ability to estimate 3D movements of users over edge computing-enabled networks, such as 5G/6G networks, is a key enabler for the new era of extended reality (XR) and Metaverse applications. Recent advancements in deep learning have shown advantages over optimization techniques for estimating 3D human poses given sparse measurements from sensor signals, i.e., inertial measurement unit (IMU) sensors attached to the XR devices. However, existing works lack applicability to wireless systems, where transmitting the IMU signals over noisy wireless networks poses significant challenges. Furthermore, the potential redundancy of the IMU signals has not been considered, resulting in highly redundant transmissions. In this work, we propose a novel approach for redundancy removal and lightweight transmission of IMU signals over noisy wireless environments. Our approach utilizes a random Gaussian matrix to transform the original signal into a lower-dimensional space. Leveraging compressive sensing theory, we prove that the designed Gaussian matrix can project the signal into a lower-dimensional space while preserving the Set-Restricted Eigenvalue condition, subject to a power transmission constraint. Furthermore, we develop a deep generative model at the receiver to recover the original IMU signals from the noisy compressed data, thus enabling the creation of 3D human body movements at the receiver for XR and Metaverse applications. Simulation results on a real-world IMU dataset show that our framework can achieve highly accurate 3D human poses using only $82\%$ of the measurements of the original signals. This is comparable to an optimization-based approach, i.e., Lasso, but is an order of magnitude faster.
https://arxiv.org/abs/2409.00087
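The projection step described above, multiplying the signal by a random Gaussian matrix, can be sketched in a few lines. The signal below is a made-up stand-in for an IMU trace, and the 1/sqrt(m) scaling (which preserves norms in expectation, the kind of geometry preservation that S-REC-style conditions formalize) is a standard textbook choice rather than the paper's exact construction:

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 256, 209                      # keep roughly 82% of the original samples
Phi = rng.standard_normal((m, n)) / np.sqrt(m)   # random Gaussian matrix

# A toy "IMU-like" signal: smooth motion plus a couple of sharp events.
t = np.linspace(0.0, 1.0, n)
x = np.sin(2 * np.pi * 3 * t)
x[40] += 1.0
x[120] += 1.0

y = Phi @ x                          # lower-dimensional vector to transmit

# With the 1/sqrt(m) scaling, E||Phi x||^2 = ||x||^2, so the norm of the
# projected signal concentrates around the norm of the original one.
ratio = np.linalg.norm(y) / np.linalg.norm(x)
```

At the receiver, the paper's generative model (not shown here) inverts this projection from the noisy `y`.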
Video Snapshot Compressive Imaging (SCI) uses a low-speed 2D camera to capture high-speed scenes as snapshot compressed measurements, followed by a reconstruction algorithm to retrieve the high-speed video frames. The fast evolving mobile devices and existing high-performance video SCI reconstruction algorithms motivate us to develop mobile reconstruction methods for real-world applications. Yet, it is still challenging to deploy previous reconstruction algorithms on mobile devices due to the complex inference process, let alone real-time mobile reconstruction. To the best of our knowledge, there is no video SCI reconstruction model designed to run on the mobile devices. Towards this end, in this paper, we present an effective approach for video SCI reconstruction, dubbed MobileSCI, which can run at real-time speed on the mobile devices for the first time. Specifically, we first build a U-shaped 2D convolution-based architecture, which is much more efficient and mobile-friendly than previous state-of-the-art reconstruction methods. Besides, an efficient feature mixing block, based on the channel splitting and shuffling mechanisms, is introduced as a novel bottleneck block of our proposed MobileSCI to alleviate the computational burden. Finally, a customized knowledge distillation strategy is utilized to further improve the reconstruction quality. Extensive results on both simulated and real data show that our proposed MobileSCI can achieve superior reconstruction quality with high efficiency on the mobile devices. Particularly, we can reconstruct a 256×256×8 snapshot compressed measurement with real-time performance (about 35 FPS) on an iPhone 15. Code is available at this https URL.
https://arxiv.org/abs/2408.07530
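The channel splitting and shuffling mechanism mentioned above is commonly implemented as a reshape-transpose-reshape (as popularized by ShuffleNet); whether MobileSCI uses exactly this variant is an assumption here. A minimal NumPy version on a small tensor:

```python
import numpy as np

def channel_shuffle(x, groups):
    # Reshape (N, C, H, W) -> (N, G, C/G, H, W), swap the two group axes,
    # and flatten back, so information mixes across the split channel groups.
    n, c, h, w = x.shape
    return (x.reshape(n, groups, c // groups, h, w)
             .swapaxes(1, 2)
             .reshape(n, c, h, w))

# Label each channel by its index to make the permutation visible.
x = np.arange(2 * 8 * 1 * 1).reshape(2, 8, 1, 1)
y = channel_shuffle(x, groups=2)
```

With 8 channels in 2 groups, channels [0..7] come out interleaved as [0, 4, 1, 5, 2, 6, 3, 7].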
High-speed railway (HSR) communications are pivotal for ensuring rail safety, operations, maintenance, and delivering passenger information services. The high speed of trains creates rapidly time-varying wireless channels, increases the signaling overhead, and reduces the system throughput, making it difficult to meet the growing and stringent needs of HSR applications. In this article, we explore artificial intelligence (AI)-based beam-level and cell-level mobility management suitable for HSR communications, including the use cases, inputs, outputs, and key performance indicators (KPIs) of the AI models. In particular, in comparison to traditional down-sampled spatial beam measurements, we show that compressed spatial multi-beam measurements via compressive sensing lead to improved spatial-temporal beam prediction. Moreover, we demonstrate the performance gains of AI-assisted cell handover over traditional mobile handover mechanisms. In addition, we observe that the proposed approaches to reducing the measurement overhead achieve radio link failure performance comparable to that of the traditional approach, which requires all the beam measurements of all cells, while saving 50% of the beam measurement overhead.
https://arxiv.org/abs/2407.04336
Inverse imaging problems (IIPs) arise in various applications, with the main objective of reconstructing an image from its compressed measurements. This problem is often ill-posed, being under-determined with multiple equally consistent solutions. The best solution inherently depends on prior knowledge or assumptions, such as the sparsity of the image. Furthermore, the reconstruction process for most IIPs relies significantly on the imaging (i.e., forward model) parameters, which might not be fully known, or the measurement device may undergo calibration drifts. These uncertainties in the forward model create substantial challenges: inaccurate reconstructions usually occur when the postulated parameters of the forward model do not fully match the actual ones. In this work, we tackle accurate reconstruction in the context where a set of possible forward-model parameters exists. Here, we propose a novel Moment-Aggregation (MA) framework that is compatible with the popular IIP solution based on a neural network prior. Specifically, our method can reconstruct the signal by considering all candidate parameters of the forward model simultaneously during the update of the neural network. We theoretically demonstrate the convergence of the MA framework, which has a complexity similar to that of reconstruction under known forward-model parameters. Proof-of-concept experiments demonstrate that the proposed MA achieves reconstruction performance comparable to the forward model with the known precise parameters across both compressive sensing and phase retrieval applications, with a PSNR gap of 0.17 to 1.94 over various datasets, including MNIST, X-ray, Glas, and MoNuSeg. This highlights our method's significant potential for reconstruction under an uncertain forward model.
https://arxiv.org/abs/2405.02944
Hyperspectral Imaging (HSI) is used in a wide range of applications such as remote sensing, yet the transmission of the HS images over communication data links becomes challenging due to the large number of spectral bands that the HS images contain together with the limited data bandwidth available in real applications. Compressive Sensing reduces the images by randomly subsampling the spectral bands of each spatial pixel and then performs the image reconstruction of all the bands using recovery algorithms that impose sparsity in a certain transform domain. Since the image pixels are not strictly sparse, this work studies a data sparsification pre-processing stage prior to compression to ensure the sparsity of the pixels. The sparsified images are compressed $2.5\times$ and then recovered using the Generalized Orthogonal Matching Pursuit (gOMP) algorithm, characterized by high accuracy, low computational requirements and fast convergence. The experiments are performed on five conventional hyperspectral images, studying the effect of different sparsification levels on the quality of both the uncompressed and the recovered images. It is concluded that the gOMP algorithm reconstructs the hyperspectral images with higher accuracy and faster convergence when the pixels are highly sparsified, albeit at the expense of a reduced quality of the recovered images with respect to the originals.
https://arxiv.org/abs/2401.14786
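For reference, gOMP generalizes OMP by selecting several atoms per iteration before a joint least-squares refit. The following sketch is a textbook-style toy implementation under assumed sizes and a noiseless setting, not the authors' code:

```python
import numpy as np

def gomp(A, y, k_select=3, n_iter=5):
    # Generalized OMP sketch: pick k_select atoms per iteration (k_select=1
    # recovers plain OMP), then re-fit all selected atoms by least squares.
    residual, support = y.copy(), []
    for _ in range(n_iter):
        corr = np.abs(A.T @ residual)
        if support:
            corr[support] = 0.0          # never re-select an atom
        support += list(np.argsort(corr)[-k_select:])
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
        if np.linalg.norm(residual) < 1e-10:
            break
    x = np.zeros(A.shape[1])
    x[support] = coef
    return x

rng = np.random.default_rng(2)
A = rng.standard_normal((50, 100))
A /= np.linalg.norm(A, axis=0)           # unit-norm atoms
x_true = np.zeros(100)
x_true[[5, 30, 77]] = [2.0, -1.0, 1.5]   # 3-sparse ground truth
y = A @ x_true
x_hat = gomp(A, y)
```

In this noiseless toy setting the true support is recovered exactly; the sparsity-level sensitivity the abstract mentions enters through the `k_select` and `n_iter` budget.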
Hyperspectral Imaging produces excessive amounts of data, leading to significant challenges in data processing, storage and transmission. Compressive Sensing has been used in the field of Hyperspectral Imaging as a technique to compress this large amount of data. This work addresses the recovery of hyperspectral images compressed $2.5\times$. A comparative study of the accuracy and performance of the convex FISTA/ADMM recovery algorithms, in addition to the greedy gOMP/BIHT/CoSaMP ones, is presented. The results indicate that all the algorithms successfully recover the compressed data, yet the gOMP algorithm achieves superior accuracy and faster recovery than the other algorithms, at the expense of a strong dependence on the unknown sparsity level of the data to recover.
https://arxiv.org/abs/2401.14762
Video Captioning (VC) is a challenging multi-modal task, since it requires describing the scene in language by understanding varied and complex videos. For machines, the traditional VC pipeline follows "imaging-compression-decoding-and-then-captioning", where compression is pivotal for storage and transmission. In such a pipeline, however, some potential shortcomings are inevitable: information redundancy, resulting in low efficiency, and information loss during the sampling process for captioning. To address these problems, in this paper we propose a novel VC pipeline that generates captions directly from the compressed measurement, which can be captured by a snapshot compressive sensing camera; we dub our model SnapCap. More specifically, benefiting from signal simulation, we can obtain abundant measurement-video-annotation data pairs for our model. Besides, to better extract language-related visual representations from the compressed measurement, we propose to distill knowledge from videos via a pre-trained CLIP, with its plentiful language-vision associations, to guide the learning of our SnapCap. To demonstrate the effectiveness of SnapCap, we conduct experiments on two widely used VC datasets. Both the qualitative and quantitative results verify the superiority of our pipeline over conventional VC pipelines. In particular, compared to "caption-after-reconstruction" methods, our SnapCap runs at least 3$\times$ faster and achieves better caption results.
https://arxiv.org/abs/2401.04903
Compressive sensing (CS) is a technique that enables the recovery of sparse signals using fewer measurements than traditional sampling methods. To address the computational challenges of CS reconstruction, our objective is to develop an interpretable and concise neural network model for reconstructing natural images with CS. We achieve this by mapping one step of the iterative shrinkage thresholding algorithm (ISTA) to a deep network block that represents a single ISTA iteration. To enhance the learning ability and incorporate structural diversity, we integrate aggregated residual transformations (ResNeXt) and squeeze-and-excitation (SE) mechanisms into the ISTA block. This block serves as a deep equilibrium layer, connected to a semi-tensor product network (STP-Net) for convenient sampling and an initial reconstruction. The resulting model, called MsDC-DEQ-Net, exhibits competitive performance compared to state-of-the-art network-based methods. It significantly reduces storage requirements compared to deep unrolling methods by using only one iteration block instead of multiple iterations. Unlike deep unrolling models, MsDC-DEQ-Net can be applied iteratively, gradually improving reconstruction accuracy while considering computation trade-offs. Additionally, the model benefits from multi-scale dilated convolutions, further enhancing performance.
https://arxiv.org/abs/2401.02884
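The deep-equilibrium idea mentioned above, reusing one iteration block until it reaches a fixed point rather than unrolling many distinct stages, can be mimicked with plain ISTA: iterate a single (here hand-crafted, not learned) block to convergence. This is a schematic analogy, not the MsDC-DEQ-Net architecture; the sizes, penalty, and tolerance are invented for the example:

```python
import numpy as np

def ista_block(x, A, y, lam, step):
    # One ISTA stage: gradient step on the data term, then soft-thresholding.
    v = x - step * A.T @ (A @ x - y)
    return np.sign(v) * np.maximum(np.abs(v) - step * lam, 0.0)

def deq_reconstruct(A, y, lam=0.05, tol=1e-9, max_iter=2000):
    # Deep-equilibrium flavour: iterate the *same* block until it reaches a
    # fixed point x* = block(x*), instead of unrolling K distinct stages.
    step = 1.0 / np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for i in range(1, max_iter + 1):
        x_next = ista_block(x, A, y, lam, step)
        if np.linalg.norm(x_next - x) < tol:
            return x_next, i
        x = x_next
    return x, max_iter

rng = np.random.default_rng(3)
A = rng.standard_normal((20, 50)) / np.sqrt(20)
x0 = np.zeros(50)
x0[7] = 2.0                          # a single active coefficient
y = A @ x0
x_star, n_used = deq_reconstruct(A, y)
```

Only one block's worth of "weights" (here, `A`, `lam`, `step`) is stored regardless of how many iterations are run, which is the storage advantage the abstract claims over unrolling.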
Incorporating prior information into inverse problems, e.g. via maximum-a-posteriori estimation, is an important technique for facilitating robust inverse problem solutions. In this paper, we devise two novel approaches for linear inverse problems that permit problem-specific statistical prior selections within the compound Gaussian (CG) class of distributions. The CG class subsumes many commonly used priors in signal and image reconstruction methods including those of sparsity-based approaches. The first method developed is an iterative algorithm, called generalized compound Gaussian least squares (G-CG-LS), that minimizes a regularized least squares objective function where the regularization enforces a CG prior. G-CG-LS is then unrolled, or unfolded, to furnish our second method, which is a novel deep regularized (DR) neural network, called DR-CG-Net, that learns the prior information. A detailed computational theory on convergence properties of G-CG-LS and thorough numerical experiments for DR-CG-Net are provided. Due to the comprehensive nature of the CG prior, these experiments show that our unrolled DR-CG-Net outperforms competitive prior art methods in tomographic imaging and compressive sensing, especially in challenging low-training scenarios.
https://arxiv.org/abs/2311.17248
The underwater Sound Speed Profile (SSP) distribution has a great influence on the propagation mode of acoustic signals, so fast and accurate estimation of the SSP is of great importance for building underwater observation systems. State-of-the-art SSP inversion methods include frameworks of matched field processing (MFP), compressive sensing (CS), and feedforward neural networks (FNN), among which the FNN shows better real-time performance while maintaining the same level of accuracy. However, the training of an FNN needs a large number of historical SSP samples, which is difficult to satisfy in many ocean areas. This situation is called few-shot learning. To tackle this issue, we propose a multi-task learning (MTL) model with partial parameter sharing among different training tasks. Through MTL, common features can be extracted, accelerating the learning process on given tasks and reducing the demand for reference samples, so as to enhance the generalization ability in few-shot learning. To verify the feasibility and effectiveness of MTL, a deep-ocean experiment was conducted in April 2023 in the South China Sea. The results show that MTL outperforms the state-of-the-art methods in terms of accuracy for SSP inversion, while inheriting the real-time advantage of the FNN during the inversion stage.
https://arxiv.org/abs/2310.11708
Sparse modeling is a direct manifestation of the parsimony principle, and sparse models are widespread in statistics, physics, information sciences, neuroscience, computational mathematics, and so on. In statistics, the many applications of sparse modeling span regression, classification tasks, graphical model selection, sparse M-estimators and sparse dimensionality reduction. It is also particularly effective in many statistical and machine learning areas where the primary goal is to discover predictive patterns from data that would enhance our understanding and control of underlying physical, biological, and other natural processes, beyond just building accurate black-box outcome predictors. Common examples include selecting biomarkers in biological procedures, finding relevant brain activity locations that are predictive of brain states and processes based on fMRI data, and identifying network bottlenecks that best explain end-to-end performance. Moreover, research on and applications of the efficient recovery of high-dimensional sparse signals from a relatively small number of observations, which is the main focus of compressed sensing (or compressive sensing), have rapidly grown and have become an intensely studied area beyond classical signal processing. Interestingly, sparse modeling is directly related to various artificial vision tasks, such as image denoising, segmentation, restoration and super-resolution, object or face detection and recognition in visual scenes, and action recognition. In this manuscript, we provide a brief introduction to the basic theory underlying sparse representation and compressive sensing, then discuss methods for recovering sparse solutions of optimization problems in an effective way, together with some applications of sparse recovery in a machine learning problem known as sparse dictionary learning.
https://arxiv.org/abs/2308.13960
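As a small illustration of the sparse dictionary learning problem mentioned at the end, here is a MOD-style alternation: ISTA sparse coding with the dictionary fixed, then a least-squares dictionary update with the codes fixed. All sizes, sparsity levels, and penalties are arbitrary toy choices, and the survey itself may cover other update rules (e.g., K-SVD):

```python
import numpy as np

def sparse_code(D, Y, lam=0.01, n_iter=200):
    # ISTA sparse coding of every column of Y against dictionary D.
    step = 1.0 / np.linalg.norm(D, 2) ** 2
    X = np.zeros((D.shape[1], Y.shape[1]))
    for _ in range(n_iter):
        V = X - step * D.T @ (D @ X - Y)
        X = np.sign(V) * np.maximum(np.abs(V) - step * lam, 0.0)
    return X

def dict_learn(Y, n_atoms, n_outer=10, seed=0):
    # MOD-style alternation: sparse-code with D fixed, then update D by
    # least squares with X fixed, re-normalizing atoms to unit norm.
    rng = np.random.default_rng(seed)
    D = rng.standard_normal((Y.shape[0], n_atoms))
    D /= np.linalg.norm(D, axis=0)
    for _ in range(n_outer):
        X = sparse_code(D, Y)
        D = Y @ np.linalg.pinv(X)
        D /= np.linalg.norm(D, axis=0) + 1e-12   # guard against dead atoms
    return D, sparse_code(D, Y)

rng = np.random.default_rng(4)
# Synthetic training data drawn from a ground-truth dictionary.
D_true = rng.standard_normal((16, 24))
D_true /= np.linalg.norm(D_true, axis=0)
X_true = rng.standard_normal((24, 200)) * (rng.random((24, 200)) < 0.1)
Y = D_true @ X_true
D, X = dict_learn(Y, n_atoms=24)
rel_err = np.linalg.norm(Y - D @ X) / np.linalg.norm(Y)
```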
We present a novel approach to implement compressive sensing in laser scanning microscopes (LSM), specifically in image scanning microscopy (ISM), using a single-photon avalanche diode (SPAD) array detector. Our method addresses two significant limitations in applying compressive sensing to LSM: the time to compute the sampling matrix and the quality of reconstructed images. We employ a fixed sampling strategy, skipping alternate rows and columns during data acquisition, which reduces the number of points scanned by a factor of four and eliminates the need to compute different sampling matrices. By exploiting the parallel images generated by the SPAD array, we improve the quality of the reconstructed compressive-ISM images compared to standard compressive confocal LSM images. Our results demonstrate the effectiveness of our approach in producing higher-quality images with reduced data acquisition time and potential benefits in reducing photobleaching.
https://arxiv.org/abs/2307.09841
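The fixed sampling strategy described above, skipping alternate rows and columns, amounts to a static boolean mask over the scan grid. A minimal sketch (grid size chosen arbitrarily):

```python
import numpy as np

def skip_mask(n_rows, n_cols):
    # Fixed sampling pattern: visit every other row and every other column,
    # so only one scan point in four is acquired. Because the pattern never
    # changes, no per-image sampling matrix has to be computed.
    mask = np.zeros((n_rows, n_cols), dtype=bool)
    mask[::2, ::2] = True
    return mask

mask = skip_mask(8, 8)
frac = mask.mean()                   # fraction of points actually scanned
```

The quarter of the grid that is visited still yields the full set of parallel SPAD-array images at each point, which is what the reconstruction then exploits.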
Video Compressed Sensing (VCS) aims to reconstruct multiple frames from one single captured measurement, thus achieving high-speed scene recording with a low-frame-rate sensor. Although there have been impressive advances in VCS recently, these state-of-the-art (SOTA) methods also significantly increase model complexity and suffer from poor generality and robustness, meaning the networks need to be retrained to accommodate a new system. Such limitations hinder real-time imaging and the practical deployment of models. In this work, we propose a Sampling-Priors-Augmented Deep Unfolding Network (SPA-DUN) for efficient and robust VCS reconstruction. Under the optimization-inspired deep unfolding framework, a lightweight and efficient U-net is exploited to downsize the model while improving overall performance. Moreover, the prior knowledge from the sampling model is utilized to dynamically modulate the network features, enabling a single SPA-DUN to handle arbitrary sampling settings and augmenting interpretability and generality. Extensive experiments on both simulated and real datasets demonstrate that SPA-DUN is not only applicable to various sampling settings with one single model but also achieves SOTA performance with remarkable efficiency.
https://arxiv.org/abs/2307.07291
In this work, we propose a novel approach called Operational Support Estimator Networks (OSENs) for the support estimation task. Support Estimation (SE) is defined as finding the locations of the non-zero elements in a sparse signal. By its very nature, the mapping between the measurement and the sparse signal is a non-linear operation. Traditional support estimators rely on computationally expensive iterative signal recovery techniques to achieve such non-linearity. In contrast to convolutional layers, the proposed OSEN approach consists of operational layers that can learn such complex non-linearities without the need for deep networks. In this way, the performance of non-iterative support estimation is greatly improved. Moreover, the operational layers comprise so-called generative \textit{super neurons} with non-local kernels. The kernel location for each neuron/feature map is optimized jointly with the SE task during training. We evaluate the OSENs in three different applications: i. support estimation from Compressive Sensing (CS) measurements, ii. representation-based classification, and iii. learning-aided CS reconstruction, where the output of the OSENs is used as prior knowledge for the CS algorithm to enhance the reconstruction. Experimental results show that the proposed approach achieves computational efficiency and outperforms competing methods, especially at low measurement rates, by a significant margin. The software implementation is publicly shared at this https URL.
https://arxiv.org/abs/2307.06065
Deep unfolding networks (DUNs), which unfold an optimization algorithm into a deep neural network, have achieved great success in compressive sensing (CS) thanks to their good interpretability and high performance. Each stage in a DUN corresponds to one iteration of the optimization. At test time, all sampled images generally need to be processed by all stages, which comes at the price of a heavy computational burden and is unnecessary for images whose contents are easier to restore. In this paper, we focus on CS reconstruction and propose a novel Dynamic Path-Controllable Deep Unfolding Network (DPC-DUN). With our designed path-controllable selector, DPC-DUN can dynamically select a rapid and appropriate route for each image, and it is slimmable by regulating different performance-complexity tradeoffs. Extensive experiments show that DPC-DUN is highly flexible and provides excellent performance and dynamic adjustment to reach a suitable tradeoff, thus addressing the main requirements for practical appeal. Codes are available at this https URL.
https://arxiv.org/abs/2306.16060
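The dynamic-depth idea can be illustrated with unrolled ISTA stages and a hand-set early-exit rule standing in for DPC-DUN's learned path-controllable selector. The function names, threshold, and exit rule below are illustrative assumptions, not the paper's trained selector:

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of the l1 norm (each stage's nonlinearity)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def dynamic_unfolded_ista(A, y, max_stages=20, lam=0.05, tol=1e-3):
    """Run unrolled ISTA stages, exiting early once the relative
    measurement residual drops below `tol` -- a fixed rule standing in
    for a learned path selector. Returns the estimate and stages used."""
    L = np.linalg.norm(A, 2) ** 2  # Lipschitz constant of the data-fit gradient
    x = np.zeros(A.shape[1])
    for used in range(1, max_stages + 1):
        x = soft_threshold(x - A.T @ (A @ x - y) / L, lam / L)
        if np.linalg.norm(A @ x - y) < tol * np.linalg.norm(y):
            break
    return x, used

# A quick instance: sparse signal, moderately under-determined measurements.
rng = np.random.default_rng(1)
A = rng.standard_normal((40, 80)) / np.sqrt(40)
x_true = np.zeros(80)
x_true[[3, 30, 61]] = [1.5, -1.0, 2.0]
y = A @ x_true
x_hat, stages_used = dynamic_unfolded_ista(A, y)
```

"Easy" inputs trip the exit test after few stages while harder ones run deeper, which is the performance-complexity tradeoff a learned selector would tune per image.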
Compressive sensing (CS) reconstructs images from sub-Nyquist measurements by solving a sparsity-regularized inverse problem. Traditional CS solvers use iterative optimizers with hand-crafted sparsifiers, while early data-driven methods directly learn an inverse mapping from the low-dimensional measurement space to the original image space. The latter outperforms the former, but is restricted to a pre-defined measurement domain. More recently, deep unrolling methods have combined traditional proximal gradient methods with data-driven approaches to iteratively refine an image approximation. To achieve higher accuracy, it has also been suggested to learn both the sampling matrix and the choice of measurement vectors adaptively. Contrary to this trend, in this work we hypothesize that, for a given measurement basis, a general inverse mapping from a random set of compressed measurements to the image domain exists and can be learned. Such a model is single-shot, non-restrictive, and does not parametrize the sampling process. To this end, we propose MOSAIC, a novel compressive sensing framework that reconstructs images given any random selection of measurements sampled using a fixed basis. Motivated by the uneven distribution of information across measurements, MOSAIC incorporates an embedding technique to efficiently apply attention mechanisms on an encoded sequence of measurements, while dispensing with the need for unrolled deep networks. A range of experiments validates the proposed architecture as a promising alternative to existing CS reconstruction methods, achieving state-of-the-art reconstruction-accuracy metrics on standard datasets.
https://arxiv.org/abs/2306.00906
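The core ingredients the abstract names, embedding each retained measurement as a token and attending over the resulting sequence, can be sketched schematically in numpy. The random-projection embedding and the single attention head below are illustrative assumptions; MOSAIC's actual encoding and attention stack are learned:

```python
import numpy as np

def embed_measurements(Phi, idx, y, d=16, seed=0):
    """Turn each retained measurement into a token: its scalar value
    concatenated with a random projection of the corresponding row of
    the fixed sampling basis Phi (an illustrative, untrained embedding)."""
    rng = np.random.default_rng(seed)
    P = rng.standard_normal((Phi.shape[1], d - 1)) / np.sqrt(Phi.shape[1])
    return np.concatenate([y[:, None], Phi[idx] @ P], axis=1)  # shape (m, d)

def self_attention(T):
    """Single-head scaled dot-product self-attention over the tokens."""
    scores = T @ T.T / np.sqrt(T.shape[1])
    w = np.exp(scores - scores.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)  # softmax over each row
    return w @ T

# Fixed basis, random subset of rows kept -- the "any random selection" setting.
rng = np.random.default_rng(0)
n = 64
Phi = rng.standard_normal((n, n)) / np.sqrt(n)
idx = rng.choice(n, size=20, replace=False)
x = rng.standard_normal(n)
y = Phi[idx] @ x
tokens = embed_measurements(Phi, idx, y)
out = self_attention(tokens)
```

Because the token sequence carries the identity of each sampled basis row, the same model can in principle serve any random measurement subset, which is what makes the approach single-shot and non-restrictive.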
This paper develops two new approaches for solving linear inverse problems, particularly of the type that appears in tomographic imaging and compressive sensing. The first is an iterative algorithm that minimizes a regularized least-squares objective function in which the regularization is based on a compound Gaussian prior distribution. The compound Gaussian prior subsumes many of the priors commonly used in image reconstruction, including those of sparsity-based approaches. The iterative algorithm gives rise to the paper's second new approach: a deep neural network that corresponds to an "unrolling" or "unfolding" of the iterative algorithm. Unrolled deep neural networks have interpretable layers and outperform standard deep learning methods. The paper includes a detailed computational theory that provides insight into the construction and performance of both algorithms. The conclusion is that both algorithms outperform other state-of-the-art approaches to tomographic image formation and compressive sensing, especially in the difficult regime of limited training data.
https://arxiv.org/abs/2305.11120
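The regularized least-squares structure, and how iterations map to network layers when unrolled, can be made concrete with a plain Gaussian (Tikhonov) prior as a simple stand-in for the compound Gaussian prior (which additionally couples the Gaussian with a positive scale variable, not modeled here):

```python
import numpy as np

def unrolled_gradient_descent(A, y, lam=0.5, n_iter=200):
    """Gradient descent on 0.5*||y - A x||^2 + 0.5*lam*||x||^2, a plain
    Gaussian-prior objective standing in for the compound Gaussian case.
    Each loop iteration corresponds to one layer of an unrolled network,
    where `step` and `lam` would become learned per-layer parameters."""
    step = 1.0 / (np.linalg.norm(A, 2) ** 2 + lam)  # 1 / Lipschitz constant
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        x = x - step * (A.T @ (A @ x - y) + lam * x)
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((30, 20)) / np.sqrt(30)
y = rng.standard_normal(30)
x_hat = unrolled_gradient_descent(A, y)
# For this quadratic prior, the fixed point is the closed-form ridge
# solution (A^T A + lam I)^{-1} A^T y.
x_closed = np.linalg.solve(A.T @ A + 0.5 * np.eye(20), A.T @ y)
```

Swapping the quadratic penalty for a richer prior changes only the per-layer update, which is why the same unrolling recipe extends to the compound Gaussian setting.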