Cardiac contraction is a rapid, coordinated process that unfolds across three-dimensional tissue on millisecond timescales. Traditional optical imaging is often inadequate for capturing dynamic cellular structure in the beating heart because of a fundamental trade-off between spatial and temporal resolution. To overcome these limitations, we propose a high-performance computational imaging framework that integrates Compressive Sensing (CS) with Light-Sheet Microscopy (LSM) for efficient, low-phototoxicity cardiac imaging. The system performs compressed acquisition of fluorescence signals via random binary mask coding using a Digital Micromirror Device (DMD). Reconstruction is performed within a Plug-and-Play (PnP) framework, solved using the alternating direction method of multipliers (ADMM), which flexibly incorporates advanced denoisers, including Tikhonov, Total Variation (TV), and BM3D. To preserve structural continuity in dynamic imaging, we further introduce temporal regularization enforcing smoothness between adjacent z-slices. Experimental results on zebrafish heart imaging under high compression ratios demonstrate that the proposed method successfully reconstructs cellular structures with excellent denoising performance and image clarity, validating the effectiveness and robustness of our algorithm in real-world high-speed, low-light biological imaging scenarios.
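The PnP-ADMM recipe described above can be sketched in a few lines of NumPy. This is a minimal illustration under assumed simplifications, not the authors' pipeline: the DMD coding is reduced to a single elementwise binary mask (so the data-fidelity update has a closed form), and a 3x3 box filter stands in for the TV/BM3D denoisers.

```python
import numpy as np

def box_denoise(img):
    """3x3 box filter: a crude stand-in for the TV/BM3D denoisers in the paper."""
    p = np.pad(img, 1, mode="edge")
    H, W = img.shape
    return sum(p[i:i + H, j:j + W] for i in range(3) for j in range(3)) / 9.0

def pnp_admm(y, mask, rho=1.0, n_iter=50):
    """PnP-ADMM for the coded-mask model y = mask * x.

    Because mask is a {0,1} diagonal operator, the x-update has an
    elementwise closed form; the prior enters only through the plugged-in
    denoiser in the v-update.
    """
    x = y.copy()
    v = y.copy()
    u = np.zeros_like(y)
    for _ in range(n_iter):
        x = (mask * y + rho * (v - u)) / (mask + rho)  # data-fidelity prox
        v = box_denoise(x + u)                          # plug-and-play denoiser
        u = u + x - v                                   # scaled dual update
    return v

rng = np.random.default_rng(0)
truth = np.outer(np.linspace(0.0, 1.0, 32), np.ones(32))  # smooth synthetic slice
mask = (rng.random(truth.shape) < 0.5).astype(float)      # random binary DMD-style code
y = mask * truth
rec = pnp_admm(y, mask)
err = np.abs(rec - truth).mean()
```

Swapping `box_denoise` for a stronger denoiser (TV, BM3D, or a learned network) is the whole point of the PnP formulation: the ADMM loop itself does not change.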
https://arxiv.org/abs/2511.03093
Adversarial attacks present a significant threat to modern machine learning systems. Yet, existing detection methods often lack the ability to detect unseen attacks or detect different attack types with a high level of accuracy. In this work, we propose a statistical approach that establishes a detection baseline before a neural network's deployment, enabling effective real-time adversarial detection. We generate a metric of adversarial presence by comparing the behavior of a compressed/uncompressed neural network pair. Our method has been tested against state-of-the-art techniques, and it achieves near-perfect detection across a wide range of attack types. Moreover, it significantly reduces false positives, making it both reliable and practical for real-world applications.
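The compressed/uncompressed pair idea can be illustrated with a toy divergence score. The KL-between-softmax metric and the calibrate-on-clean-data threshold below are assumptions for illustration, not the paper's exact statistic:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def adversarial_score(logits_full, logits_compressed, eps=1e-12):
    """Behavioral divergence between an uncompressed/compressed model pair.

    Clean inputs should yield near-identical predictions from the two twins;
    adversarial perturbations tend to transfer imperfectly to the compressed
    twin, inflating the divergence (here: KL between softmax outputs).
    """
    p = softmax(logits_full)
    q = softmax(logits_compressed)
    return np.sum(p * np.log((p + eps) / (q + eps)), axis=-1)

rng = np.random.default_rng(1)
# Simulated logits: on clean data the pair almost agrees ...
clean_full = rng.normal(size=(8, 10))
clean_comp = clean_full + 0.01 * rng.normal(size=(8, 10))
# ... under attack the pair disagrees (crudely modeled by unrelated logits).
adv_full = rng.normal(size=(8, 10))
adv_comp = adv_full[:, ::-1].copy()

tau = adversarial_score(clean_full, clean_comp).max()  # pre-deployment baseline
flags = adversarial_score(adv_full, adv_comp) > tau    # real-time detection
```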
https://arxiv.org/abs/2510.02707
Hyperspectral imaging (HSI) provides rich spatial-spectral information but remains costly to acquire due to hardware limitations and the difficulty of reconstructing three-dimensional data from compressed measurements. Although compressive sensing systems such as CASSI improve efficiency, accurate reconstruction is still challenged by severe degradation and loss of fine spectral details. We propose the Flow-Matching-guided Unfolding network (FMU), which, to our knowledge, is the first to integrate flow matching into HSI reconstruction by embedding its generative prior within a deep unfolding framework. To further strengthen the learned dynamics, we introduce a mean velocity loss that enforces global consistency of the flow, leading to a more robust and accurate reconstruction. This hybrid design leverages the interpretability of optimization-based methods and the generative capacity of flow matching. Extensive experiments on both simulated and real datasets show that FMU significantly outperforms existing approaches in reconstruction quality. Code and models will be available at this https URL.
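For context, the standard conditional flow-matching objective that such generative priors build on can be written in a few lines; the paper's mean velocity loss is a global-consistency variant that is not reproduced here. The linear interpolation path and the oracle velocity below are toy assumptions:

```python
import numpy as np

def cfm_loss(v_fn, x0, x1, t):
    """Conditional flow matching: along the linear path x_t = (1-t)*x0 + t*x1,
    the ground-truth velocity is the constant x1 - x0; regress v_fn onto it."""
    xt = (1.0 - t)[:, None] * x0 + t[:, None] * x1
    return np.mean((v_fn(xt, t) - (x1 - x0)) ** 2)

rng = np.random.default_rng(0)
x0 = np.zeros((16, 4))                     # source samples (toy choice: all zeros)
x1 = rng.normal(size=(16, 4))              # "data" samples
t = rng.uniform(0.1, 0.9, size=16)         # interpolation times

# With x0 == 0 we have x_t = t*x1, so the exact velocity field is x_t / t.
oracle = lambda xt, tt: xt / tt[:, None]
loss = cfm_loss(oracle, x0, x1, t)
```

In FMU this kind of learned velocity field acts as the generative prior inside each unfolding stage, while the data-fidelity steps enforce consistency with the CASSI measurements.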
https://arxiv.org/abs/2510.01912
Imaging inverse problems aim to recover high-dimensional signals from undersampled, noisy measurements, a fundamentally ill-posed task with infinitely many solutions that differ only within the null-space of the sensing operator. To resolve this ambiguity, prior information is typically incorporated through handcrafted regularizers or learned models that constrain the solution space. However, these priors typically ignore the task-specific structure of that null-space. In this work, we propose \textit{Non-Linear Projections of the Null-Space} (NPN), a novel class of regularization that, instead of enforcing structural constraints in the image domain, uses a neural network to promote solutions that lie in a low-dimensional projection of the sensing matrix's null-space. Our approach has two key advantages: (1) Interpretability: by focusing on the structure of the null-space, we design sensing-matrix-specific priors that capture information about the signal components to which the sensing process is fundamentally blind. (2) Flexibility: NPN is adaptable to various inverse problems, compatible with existing reconstruction frameworks, and complementary to conventional image-domain priors. We provide theoretical guarantees on convergence and reconstruction accuracy when used within plug-and-play methods. Empirical results across diverse sensing matrices demonstrate that NPN priors consistently enhance reconstruction fidelity in various imaging inverse problems, such as compressive sensing, deblurring, super-resolution, computed tomography, and magnetic resonance imaging, with plug-and-play methods, unrolling networks, deep image prior, and diffusion models.
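The null-space decomposition at the heart of NPN is easy to verify numerically. A hedged sketch with a random sensing matrix, where the "learned" low-dimensional projection is just a random linear map standing in for the paper's neural network:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 6, 12
A = rng.normal(size=(m, n))                 # fat sensing matrix: 2x undersampling
P_null = np.eye(n) - np.linalg.pinv(A) @ A  # orthogonal projector onto null(A)

x = rng.normal(size=n)
null_part = P_null @ x      # component the measurements y = A @ x are blind to
range_part = x - null_part  # component determined by the measurements

# Hypothetical stand-in for the paper's learned non-linear projection:
# penalize a low-dimensional linear image of the null-space component.
W = rng.normal(size=(3, n))
npn_penalty = np.sum((W @ null_part) ** 2)
```

Any reconstruction that changes only `null_part` leaves the measurements untouched, which is exactly why a prior acting on that component is complementary to image-domain priors.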
https://arxiv.org/abs/2510.01608
Self-supervised learning for inverse problems makes it possible to train a reconstruction network from noisy and/or incomplete data alone. These methods have the potential to enable learning-based solutions when obtaining ground-truth references for training is expensive or even impossible. In this paper, we propose a new self-supervised learning strategy devised for the challenging setting where measurements are observed via a single incomplete observation model. We introduce a new definition of equivariance in the context of reconstruction networks, and show that the combination of self-supervised splitting losses and equivariant reconstruction networks results in unbiased estimates of the supervised loss. Through a series of experiments on image inpainting, accelerated magnetic resonance imaging, and compressive sensing, we demonstrate that the proposed loss achieves state-of-the-art performance in settings with highly rank-deficient forward models.
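A minimal sketch of the two ingredients, measurement splitting and an equivariance check, with a trivial mean-fill reconstructor standing in for a trained network (the split ratio and the circular-shift test are illustrative assumptions, not the paper's definitions):

```python
import numpy as np

def splitting_loss(recon_fn, y, mask, rng):
    """Measurement-splitting loss: hide a random subset of the observed
    pixels from the reconstruction and score it only on the held-out part."""
    keep = (rng.random(mask.shape) < 0.7) & (mask > 0)   # input split
    held = (mask > 0) & ~keep                             # target split
    x_hat = recon_fn(y * keep, keep.astype(float))
    return np.mean((x_hat[held] - y[held]) ** 2)

def mean_fill(y, mask):
    """Trivial reconstruction: fill unobserved pixels with the observed mean."""
    return np.where(mask > 0, y, y[mask > 0].mean())

def shift_equivariant(f, x, s=3):
    """Equivariance check: does a circular shift commute with f?"""
    return np.allclose(f(np.roll(x, s, axis=1)), np.roll(f(x), s, axis=1))

rng = np.random.default_rng(0)
truth = np.full((16, 16), 0.5)                        # constant test image
mask = (rng.random(truth.shape) < 0.6).astype(float)  # single incomplete observation
loss = splitting_loss(mean_fill, truth * mask, mask, rng)

# A circular box filter is shift-equivariant, as a trained CNN would be.
blur = lambda x: (np.roll(x, 1, axis=1) + x + np.roll(x, -1, axis=1)) / 3.0
equiv = shift_equivariant(blur, rng.normal(size=(8, 8)))
```

The paper's result is that, for equivariant networks, the splitting loss above is an unbiased estimate of the supervised loss even though only one incomplete observation model is ever seen.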
https://arxiv.org/abs/2510.00929
Spectral imaging technology has long faced fundamental challenges in balancing spectral, spatial, and temporal resolutions. While compressive sensing-based Coded Aperture Snapshot Spectral Imaging (CASSI) mitigates this trade-off through optical encoding, high compression ratios result in ill-posed reconstruction problems. Traditional model-based methods exhibit limited performance due to their reliance on handcrafted image priors, while deep learning approaches are constrained by their black-box nature, which compromises physical interpretability. To address these limitations, we propose a dual-camera CASSI reconstruction framework that integrates total variation (TV) subgradient theory. By establishing an end-to-end SD-CASSI mathematical model, we reduce the computational complexity of solving the inverse problem and provide a mathematically well-founded framework for analyzing multi-camera systems. A dynamic regularization strategy is introduced, incorporating normalized gradient constraints from RGB/panchromatic-derived reference images, which constructs a TV subgradient similarity function with strict convex optimization guarantees. Leveraging spatial priors from auxiliary cameras, an adaptive reference generation and updating mechanism is designed to provide subgradient guidance. Experimental results demonstrate that the proposed method effectively preserves spatial-spectral structural consistency. The theoretical framework establishes an interpretable mathematical foundation for computational spectral imaging, demonstrating robust performance across diverse reconstruction scenarios. The source code is available at this https URL.
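A subgradient of the anisotropic TV functional, the object this framework builds its similarity function around, can be computed directly from sign fields of the forward differences; the sign convention below is one standard choice, not necessarily the paper's:

```python
import numpy as np

def tv_subgradient(x):
    """A subgradient of anisotropic TV(x) = sum |dx| + sum |dy|.

    With forward differences dx[i,j] = x[i+1,j] - x[i,j], each term |dx[i,j]|
    contributes -sign(dx[i,j]) to pixel (i,j) and +sign(dx[i,j]) to (i+1,j)
    (sign(0) = 0 picks a valid element of the subdifferential at kinks).
    """
    sx = np.sign(np.diff(x, axis=0))   # vertical difference signs
    sy = np.sign(np.diff(x, axis=1))   # horizontal difference signs
    g = np.zeros_like(x)
    g[:-1, :] -= sx
    g[1:, :] += sx
    g[:, :-1] -= sy
    g[:, 1:] += sy
    return g
```

On a flat image the subgradient vanishes, and on a monotone ramp it is nonzero only at the two boundary columns, which is what makes TV descent steps flatten noise while leaving clean ramps and constant regions alone.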
https://arxiv.org/abs/2509.10897
We explore the connection between Plug-and-Play (PnP) methods and Denoising Diffusion Implicit Models (DDIM) for solving ill-posed inverse problems, with a focus on single-pixel imaging. We begin by identifying key distinctions between PnP and diffusion models, particularly in their denoising mechanisms and sampling procedures. By decoupling the diffusion process into three interpretable stages: denoising, data consistency enforcement, and sampling, we provide a unified framework that integrates learned priors with physical forward models in a principled manner. Building upon this insight, we propose a hybrid data-consistency module that linearly combines multiple PnP-style fidelity terms. This hybrid correction is applied directly to the denoised estimate, improving measurement consistency without disrupting the diffusion sampling trajectory. Experimental results on single-pixel imaging tasks demonstrate that our method achieves better reconstruction quality.
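The hybrid data-consistency idea, a linear combination of PnP-style fidelity corrections applied to the denoised estimate, can be sketched as follows. The specific pair of terms (a gradient step and a pseudo-inverse projection) and the weights are hypothetical choices for illustration, not the paper's exact module:

```python
import numpy as np

def hybrid_consistency(x0_hat, y, A, w=(0.5, 0.5), step=0.01):
    """Linearly combine two PnP-style fidelity corrections and apply them
    directly to the denoised estimate x0_hat (illustrative choice of terms)."""
    grad_corr = x0_hat - step * A.T @ (A @ x0_hat - y)         # gradient step
    proj_corr = x0_hat + np.linalg.pinv(A) @ (y - A @ x0_hat)  # exact projection
    return w[0] * grad_corr + w[1] * proj_corr

rng = np.random.default_rng(0)
A = rng.normal(size=(4, 16))                  # fat single-pixel-style sensing matrix
x_true = rng.normal(size=16)
y = A @ x_true                                 # noiseless measurements
x0_hat = x_true + 0.1 * rng.normal(size=16)    # imperfect denoised estimate

x_proj = hybrid_consistency(x0_hat, y, A, w=(0.0, 1.0))  # pure projection limit
```

In a DDIM loop this correction would replace the raw `x0_hat` before re-noising to the next timestep, so the sampling trajectory itself is untouched.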
https://arxiv.org/abs/2509.09365
Digital cameras consume ~0.1 microjoule per pixel to capture and encode video, resulting in a power usage of ~20W for a 4K sensor operating at 30 fps. For envisioned gigapixel cameras operating at 100-1000 fps, the current processing model is unsustainable. To address this, physical layer compressive measurement has been proposed to reduce power consumption per pixel by 10-100X. Video Snapshot Compressive Imaging (SCI) introduces high frequency modulation in the optical sensor layer to increase the effective frame rate. A commonly used sampling strategy in video SCI is Random Sampling (RS), where each mask element value is randomly set to 0 or 1. Similarly, image inpainting (I2P) has demonstrated that images can be recovered from a fraction of the image pixels. Inspired by I2P, we propose the Ultra-Sparse Sampling (USS) regime, where at each spatial location, only one sub-frame is set to 1 and all others are set to 0. We then build a Digital Micromirror Device (DMD) encoding system to verify the effectiveness of our USS strategy. Ideally, we can decompose the USS measurement into sub-measurements to which we can apply I2P algorithms to recover high-speed frames. However, due to the mismatch between the DMD and CCD, the USS measurement cannot be perfectly decomposed. To this end, we propose BSTFormer, a sparse Transformer that utilizes local Block attention, global Sparse attention, and global Temporal attention to exploit the sparsity of the USS measurement. Extensive results on both simulated and real-world data show that our method significantly outperforms all previous state-of-the-art algorithms. Additionally, an essential advantage of the USS strategy is its higher dynamic range than that of the RS strategy. Finally, from the application perspective, the USS strategy is a good choice for implementing a complete video SCI system on chip due to its fixed exposure time.
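The USS mask construction is simple to state in code: at each pixel, exactly one of the T sub-frames is exposed, so the temporal sum of the mask stack is all ones. A sketch (the snapshot model `sum_t mask_t * frame_t` is the usual SCI convention, assumed here):

```python
import numpy as np

def uss_masks(T, H, W, rng):
    """Ultra-Sparse Sampling masks: each pixel is assigned exactly one of the
    T sub-frames, giving a fixed single-sub-frame exposure per location."""
    choice = rng.integers(0, T, size=(H, W))          # which sub-frame fires
    return (np.arange(T)[:, None, None] == choice[None, :, :]).astype(np.uint8)

def rs_masks(T, H, W, rng):
    """Baseline Random Sampling masks: i.i.d. Bernoulli(0.5) elements."""
    return (rng.random((T, H, W)) < 0.5).astype(np.uint8)

rng = np.random.default_rng(0)
masks = uss_masks(8, 16, 16, rng)
video = rng.random((8, 16, 16))                # 8 high-speed sub-frames
snapshot = (masks * video).sum(axis=0)         # single coded measurement on the CCD
```

Under RS, roughly T/2 sub-frames accumulate at each pixel, whereas USS accumulates exactly one, which is why each USS snapshot pixel stays within a single frame's intensity range (the dynamic-range advantage noted above).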
https://arxiv.org/abs/2509.08228
Deep networks have achieved remarkable success in the image compressed sensing (CS) task, namely reconstructing a high-fidelity image from its compressed measurement. However, existing works are deficient in incoherent compressed measurements at the sensing phase and in implicit measurement representations at the reconstruction phase, limiting overall performance. In this work, we answer two questions: 1) how to improve measurement incoherence to decrease ill-posedness; 2) how to learn informative representations from measurements. To this end, we propose a novel asymmetric Kronecker CS (AKCS) model and theoretically show that it achieves better incoherence than previous Kronecker CS with a minimal increase in complexity. Moreover, we reveal that the unfolding networks' superiority over non-unfolding ones results from sufficient gradient descent steps, which we call explicit measurement representations. We propose a measurement-aware cross attention (MACA) mechanism to learn implicit measurement representations. We integrate AKCS and MACA into a widely-used unfolding architecture to obtain a measurement-enhanced unfolding network (MEUNet). Extensive experiments demonstrate that our MEUNet achieves state-of-the-art performance in reconstruction accuracy and inference speed.
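The separable Kronecker measurement structure that AKCS generalizes rests on the standard identity vec(A X Bᵀ) = (B ⊗ A) vec(X), which is easy to check numerically (the matrix sizes below are arbitrary; using A ≠ B mirrors the asymmetric setup, though this is only the generic identity, not the paper's specific construction):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(4, 8))   # left (row-wise) sensing matrix
B = rng.normal(size=(3, 8))   # right (column-wise) sensing matrix, A != B
X = rng.normal(size=(8, 8))   # image block

Y = A @ X @ B.T                                # separable 2-D measurement
y_kron = np.kron(B, A) @ X.flatten(order="F")  # equivalent vectorized form
```

The separable form needs only the small factors A and B at sensing time, while the equivalent dense matrix `np.kron(B, A)` already has 12 x 64 entries here; this storage gap is what makes Kronecker-structured sensing attractive at scale.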
https://arxiv.org/abs/2508.09528
Sound speed profiles (SSPs) are essential underwater parameters that affect the propagation mode of underwater signals and have a critical impact on the energy efficiency of underwater acoustic communication and the accuracy of underwater acoustic positioning. Traditionally, SSPs can be obtained by matched field processing (MFP), compressive sensing (CS), and deep learning (DL) methods. However, existing methods mainly rely on on-site underwater sonar observation data, which imposes strict requirements on the deployment of sonar observation systems. To achieve high-precision estimation of the sound velocity distribution in a given sea area without on-site underwater data measurement, we propose a multi-modal data-fusion generative adversarial network model with residual attention blocks (MDF-RAGAN) for SSP construction. To improve the model's ability to capture global spatial feature correlations, we embed attention mechanisms, and we use residual modules to deeply capture the small disturbances in the deep-ocean sound velocity distribution caused by changes in sea surface temperature (SST). Experimental results on a real open dataset show that the proposed model outperforms other state-of-the-art methods, achieving an error of less than 0.3 m/s. Specifically, MDF-RAGAN not only outperforms the convolutional neural network (CNN) and spatial interpolation (SITP) baselines by nearly a factor of two, but also achieves about a 65.8\% root mean square error (RMSE) reduction compared to the mean profile, which fully reflects the enhancement of overall profile matching by multi-source fusion and cross-modal attention.
https://arxiv.org/abs/2507.11812
Compressive imaging (CI) reconstruction, such as snapshot compressive imaging (SCI) and compressive sensing magnetic resonance imaging (MRI), aims to recover high-dimensional images from low-dimensional compressed measurements. This process critically relies on learning an accurate representation of the underlying high-dimensional image. However, existing unsupervised representations may struggle to achieve a desired balance between representation ability and efficiency. To overcome this limitation, we propose Tensor Decomposed multi-resolution Grid encoding (GridTD), an unsupervised continuous representation framework for CI reconstruction. GridTD optimizes a lightweight neural network and the input tensor decomposition model whose parameters are learned via multi-resolution hash grid encoding. It inherently enjoys the hierarchical modeling ability of multi-resolution grid encoding and the compactness of tensor decomposition, enabling effective and efficient reconstruction of high-dimensional images. Theoretical analyses for the algorithm's Lipschitz property, generalization error bound, and fixed-point convergence reveal the intrinsic superiority of GridTD as compared with existing continuous representation models. Extensive experiments across diverse CI tasks, including video SCI, spectral SCI, and compressive dynamic MRI reconstruction, consistently demonstrate the superiority of GridTD over existing methods, positioning GridTD as a versatile and state-of-the-art CI reconstruction method.
https://arxiv.org/abs/2507.07707
Many applications have been identified which require the deployment of large-scale low-power wireless sensor networks. Some of the deployment environments, however, impose harsh operation conditions due to intense cross-technology interference, extreme weather conditions (heavy rainfall, excessive heat, etc.), or rough motion, thereby affecting the quality and predictability of the wireless links the nodes establish. In localization tasks, these conditions often lead to significant errors in estimating the position of target nodes. Motivated by the practical deployments of sensors on the surface of different water bodies, we address the problem of identifying susceptible nodes and robustly estimating their positions. We formulate these tasks as a compressive sensing problem and propose algorithms for both node identification and robust estimation. Additionally, we design an optimal anchor configuration to maximize the robustness of the position estimation task. Our numerical results and comparisons with competitive methods demonstrate that the proposed algorithms achieve both objectives with a modest number of anchors. Since our method relies only on target-to-anchor distances, it is broadly applicable and yields resilient, robust localization.
https://arxiv.org/abs/2507.03856
Diffusion models have achieved remarkable success in imaging inverse problems owing to their powerful generative capabilities. However, existing approaches typically rely on models trained for specific degradation types, limiting their generalizability to various degradation scenarios. To address this limitation, we propose a zero-shot framework capable of handling various imaging inverse problems without model retraining. We introduce a likelihood-guided noise refinement mechanism that derives a closed-form approximation of the likelihood score, simplifying score estimation and avoiding expensive gradient computations. This estimated score is subsequently utilized to refine the model-predicted noise, thereby better aligning the restoration process with the generative framework of diffusion models. In addition, we integrate the Denoising Diffusion Implicit Models (DDIM) sampling strategy to further improve inference efficiency. The proposed mechanism can be applied to both optimization-based and sampling-based schemes, providing an effective and flexible zero-shot solution for imaging inverse problems. Extensive experiments demonstrate that our method achieves superior performance across multiple inverse problems, particularly in compressive sensing, delivering high-quality reconstructions even at an extremely low sampling rate (5%).
https://arxiv.org/abs/2506.13391
The radio map represents the spatial distribution of spectrum resources within a region, supporting efficient resource allocation and interference mitigation. However, it is difficult to construct a dense radio map as a limited number of samples can be measured in practical scenarios. While existing works have used deep learning to estimate dense radio maps from sparse samples, they are hard to integrate with the physical characteristics of the radio map. To address this challenge, we cast radio map estimation as the sparse signal recovery problem. A physical propagation model is further incorporated to decompose the problem into multiple factor optimization sub-problems, thereby reducing recovery complexity. Inspired by the existing compressive sensing methods, we propose the Radio Deep Unfolding Network (RadioDUN) to unfold the optimization process, achieving adaptive parameter adjusting and prior fitting in a learnable manner. To account for the radio propagation characteristics, we develop a dynamic reweighting module (DRM) to adaptively model the importance of each factor for the radio map. Inspired by the shadowing factor in the physical propagation model, we integrate obstacle-related factors to express the obstacle-induced signal stochastic decay. The shadowing loss is further designed to constrain the factor prediction and act as a supplementary supervised objective, which enhances the performance of RadioDUN. Extensive experiments have been conducted to demonstrate that the proposed method outperforms the state-of-the-art methods. Our code will be made publicly available upon publication.
https://arxiv.org/abs/2506.08418
Hyperspectral cameras face harsh trade-offs between spatial, spectral, and temporal resolution in an inherently low-photon regime. Computational imaging systems break through these trade-offs with compressive sensing, but require complex optics and/or extensive compute. We present Spectrum from Defocus (SfD), a chromatic focal sweep method that recovers state-of-the-art hyperspectral images with a small system of off-the-shelf optics and < 1 second of compute. Our camera uses two lenses and a grayscale sensor to preserve nearly all incident light in a chromatically-aberrated focal stack. Our physics-based iterative algorithm efficiently demixes, deconvolves, and denoises the blurry grayscale focal stack into a sharp spectral image. The combination of photon efficiency, optical simplicity, and physical modeling makes SfD a promising solution for fast, compact, interpretable hyperspectral imaging.
https://arxiv.org/abs/2503.20184
Scene-aware Adaptive Compressive Sensing (ACS) has attracted significant interest due to its promising capability for efficient and high-fidelity acquisition of scene images. ACS typically prescribes adaptive sampling allocation (ASA) based on previous samples in the absence of ground truth. However, when confronting unknown scenes, existing ACS methods often lack accurate judgment and robust feedback mechanisms for ASA, thus limiting the high-fidelity sensing of the scene. In this paper, we introduce a Sampling Innovation-Based ACS (SIB-ACS) method that can effectively identify and allocate sampling to challenging image reconstruction areas, culminating in high-fidelity image reconstruction. An innovation criterion is proposed to judge ASA by predicting the decrease in image reconstruction error attributable to sampling increments, thereby directing more samples towards regions where the reconstruction error diminishes significantly. A sampling innovation-guided multi-stage adaptive sampling (AS) framework is proposed, which iteratively refines the ASA through a multi-stage feedback process. For image reconstruction, we propose a Principal Component Compressed Domain Network (PCCD-Net), which efficiently and faithfully reconstructs images under AS scenarios. Extensive experiments demonstrate that the proposed SIB-ACS method significantly outperforms the state-of-the-art methods in terms of image reconstruction fidelity and visual effects. Codes are available at this https URL.
https://arxiv.org/abs/2503.13241
Recently, Deep Unfolding Networks (DUNs) have achieved impressive reconstruction quality in the field of image Compressive Sensing (CS) by unfolding iterative optimization algorithms into neural networks. The reconstruction quality of DUNs depends on the learned prior knowledge, so introducing stronger prior knowledge can further improve reconstruction quality. On the other hand, pre-trained diffusion models contain powerful prior knowledge and have a solid theoretical foundation and strong scalability, but they require a large number of iterative steps to achieve reconstruction. In this paper, we propose to use the powerful prior knowledge of a pre-trained diffusion model in DUNs to achieve high-quality reconstruction with fewer steps for image CS. Specifically, we first design an iterative optimization algorithm named Diffusion Message Passing (DMP), which embeds a pre-trained diffusion model into each iteration of DMP. Then, we deeply unfold the DMP algorithm into a neural network named DMP-DUN. The proposed DMP-DUN can use lightweight neural networks to map measurement data to the intermediate steps of the reverse diffusion process and directly approximate the divergence of the diffusion model, thereby further improving reconstruction efficiency. Extensive experiments show that our proposed DMP-DUN achieves state-of-the-art performance and requires as few as 2 steps to reconstruct the image. Codes are available at this https URL.
https://arxiv.org/abs/2503.08429
In this work we study the behavior of the forward-backward (FB) algorithm when the proximity operator is replaced by a sub-iterative procedure to approximate a Gaussian denoiser, in a Plug-and-Play (PnP) fashion. In particular, we consider both analysis and synthesis Gaussian denoisers within a dictionary framework, obtained by unrolling dual-FB iterations or FB iterations, respectively. We analyze the associated minimization problems as well as the asymptotic behavior of the resulting FB-PnP iterations. In particular, we show that the synthesis Gaussian denoising problem can be viewed as a proximity operator. For each case, analysis and synthesis, we show that the FB-PnP algorithms solve the same problem whether we use only one or an infinite number of sub-iterations to solve the denoising problem at each iteration. To this aim, we show that each "one sub-iteration" strategy within the FB-PnP can be interpreted as a primal-dual algorithm when a warm-restart strategy is used. We further present similar results when using a Moreau-Yosida smoothing of the global problem, for an arbitrary number of sub-iterations. Finally, we provide numerical simulations to illustrate our theoretical results. In particular, we consider a toy compressive sensing example as well as an image restoration problem in a deep dictionary framework.
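A minimal FB-PnP loop on a toy compressive sensing problem: a gradient step on the data term followed by a plugged-in denoiser. Here the denoiser is soft-thresholding, i.e., the exact prox of the l1 norm (a synthesis Gaussian denoiser with an identity dictionary, so "one sub-iteration" already solves it); the sparse test problem and the regularization weight are illustrative assumptions:

```python
import numpy as np

def fb_pnp(y, A, denoise, step, n_iter=300):
    """Forward-backward PnP: explicit gradient step on 0.5*||Ax - y||^2,
    then a plugged-in denoiser in place of the proximity operator."""
    x = A.T @ y
    for _ in range(n_iter):
        x = denoise(x - step * (A.T @ (A @ x - y)))
    return x

rng = np.random.default_rng(0)
m, n, k = 40, 80, 5
A = rng.normal(size=(m, n)) / np.sqrt(m)       # normalized Gaussian sensing matrix
x_true = np.zeros(n)
x_true[rng.choice(n, size=k, replace=False)] = 1.0
y = A @ x_true                                  # noiseless compressive measurements

step = 1.0 / np.linalg.norm(A, 2) ** 2          # 1/L for the data-fidelity gradient
lam = 0.05                                      # assumed regularization weight
soft = lambda z: np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)

x_hat = fb_pnp(y, A, soft, step)
rel_err = np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true)
```

With a non-trivial dictionary the denoiser itself becomes an inner FB (or dual-FB) loop; the paper's point is that truncating that loop to a single warm-restarted sub-iteration still solves the same outer problem.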
https://arxiv.org/abs/2411.13276
Compressive sensing (CS), acquiring and reconstructing signals below the Nyquist rate, has great potential in image and video acquisition to exploit data redundancy and greatly reduce the amount of sampled data. To further reduce the sampled data while keeping the video quality, this paper explores the temporal redundancy in video CS and proposes a block-based adaptive compressive sensing framework with a sampling rate (SR) control strategy. To avoid redundant compression of non-moving regions, we first incorporate moving block detection between consecutive frames, and only transmit the measurements of moving blocks. The non-moving regions are reconstructed from the previous frame. In addition, we propose a block storage system and a dynamic threshold to achieve adaptive SR allocation to each frame based on the area of moving regions and the target SR, keeping the average SR within the target SR. Finally, to reduce blocking artifacts and improve reconstruction quality, we adopt a cooperative reconstruction of the moving and non-moving blocks by referring to the measurements of the non-moving blocks from the previous frame. Extensive experiments have demonstrated that this framework is able to control the SR and obtain better performance than existing works.
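The moving-block detection step can be sketched as a per-block frame-difference test; the block size and threshold below are assumed values, and the flagged fraction is the quantity the SR controller would act on:

```python
import numpy as np

def moving_blocks(prev, curr, block=8, thresh=0.05):
    """Flag blocks whose mean |frame difference| exceeds thresh; only these
    blocks are CS-sampled, the rest are copied from the previous frame."""
    H, W = curr.shape
    nb_h, nb_w = H // block, W // block
    diff = np.abs(curr - prev)[:nb_h * block, :nb_w * block]
    per_block = diff.reshape(nb_h, block, nb_w, block).mean(axis=(1, 3))
    return per_block > thresh

prev = np.zeros((32, 32))
curr = np.zeros((32, 32))
curr[8:16, 16:24] = 1.0          # a single 8x8 region moves between frames
flags = moving_blocks(prev, curr)
rate = flags.mean()              # fraction of blocks sampled this frame
```

Scaling each frame's per-block SR by this fraction (against the target average SR) is the essence of the adaptive allocation described above.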
https://arxiv.org/abs/2411.10200
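The moving-block detection and selective sampling described above can be sketched roughly as follows. This is a simplified illustration, assuming a mean-absolute-difference motion test and a Gaussian measurement matrix; the block size, threshold, and function names are arbitrary choices, not the paper's exact design.

```python
import numpy as np

def moving_blocks(frame, prev_frame, block=8, thresh=5.0):
    # Flag blocks whose mean absolute difference from the previous
    # frame exceeds a threshold; only these get CS-sampled.
    h, w = frame.shape
    mask = np.zeros((h // block, w // block), dtype=bool)
    for i in range(h // block):
        for j in range(w // block):
            cur = frame[i * block:(i + 1) * block, j * block:(j + 1) * block]
            prv = prev_frame[i * block:(i + 1) * block, j * block:(j + 1) * block]
            mask[i, j] = np.mean(np.abs(cur.astype(float) - prv.astype(float))) > thresh
    return mask

def sample_moving(frame, mask, Phi, block=8):
    # Compress only the moving blocks with measurement matrix Phi;
    # non-moving blocks are later copied from the previous reconstruction.
    measurements = {}
    for i, j in zip(*np.nonzero(mask)):
        patch = frame[i * block:(i + 1) * block, j * block:(j + 1) * block]
        measurements[(i, j)] = Phi @ patch.reshape(-1)
    return measurements
```

An adaptive-SR scheme would additionally vary the number of rows of `Phi` per frame according to how many blocks are flagged, which is the role of the paper's dynamic threshold and block storage system.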
Deep Neural Networks (DNNs) are well known to act as over-parameterized deep image priors (DIP) that regularize various image inverse problems. Meanwhile, researchers have also proposed extremely compact, under-parameterized image priors (e.g., the deep decoder) that are strikingly competent for image restoration too, despite a loss of accuracy. These two extremes push us to ask whether there exists a better solution in the middle: between over- and under-parameterized image priors, can one identify "intermediate" parameterized image priors that achieve better trade-offs between performance, efficiency, and even preserving strong transferability? Drawing inspiration from the lottery ticket hypothesis (LTH), we conjecture and study a novel "lottery image prior" (LIP) that exploits the inherent sparsity of DNNs, stated as follows: given an over-parameterized DNN-based image prior, it contains a sparse subnetwork that can be trained in isolation to match the original DNN's performance when applied as a prior to various image inverse problems. Our results validate the superiority of LIPs: we can successfully locate LIP subnetworks within over-parameterized DIPs across substantial sparsity ranges. Those LIP subnetworks significantly outperform deep decoders at comparably compact model sizes (often fully preserving the effectiveness of their over-parameterized counterparts), and they also transfer well across different images and restoration task types. Besides, we extend LIP to compressive sensing image reconstruction, where a pre-trained GAN generator is used as the prior (in contrast to an untrained DIP or deep decoder), and confirm its validity in this setting too. To the best of our knowledge, this is the first time LTH has been shown to be relevant in the context of inverse problems or image priors.
https://arxiv.org/abs/2410.24187
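A minimal sketch of the kind of one-shot magnitude pruning and weight rewinding used in LTH-style experiments like the one above: prune the smallest-magnitude weights globally, then reset the survivors to their initial values before retraining the sparse subnetwork in isolation. The global-magnitude criterion and helper names here are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def magnitude_prune_mask(weights, sparsity):
    # One-shot global magnitude pruning: zero out the smallest-magnitude
    # fraction `sparsity` of all weights, keep the rest.
    flat = np.concatenate([w.ravel() for w in weights])
    k = int(len(flat) * sparsity)
    if k == 0:
        thresh = -np.inf                      # nothing pruned
    else:
        thresh = np.partition(np.abs(flat), k - 1)[k - 1]
    return [np.abs(w) > thresh for w in weights]

def rewind(weights_init, masks):
    # LTH "rewinding": reset surviving weights to their initial values;
    # the resulting sparse subnetwork is then trained in isolation
    # (e.g., as a DIP on an inverse problem).
    return [w0 * m for w0, m in zip(weights_init, masks)]
```

In the LIP setting, the pruned-and-rewound subnetwork would replace the full DIP as the regularizer for each inverse problem, at a fraction of the parameter count.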