Balancing convergence speed, generalization capability, and computational efficiency remains a core challenge in deep learning optimization. First-order gradient descent methods, epitomized by stochastic gradient descent (SGD) and Adam, serve as the cornerstone of modern training pipelines. However, large-scale model training, stringent differential privacy requirements, and distributed learning paradigms expose critical limitations in these conventional approaches regarding privacy protection and memory efficiency. To mitigate these bottlenecks, researchers explore second-order optimization techniques to surpass first-order performance ceilings, while zeroth-order methods reemerge to alleviate memory constraints inherent to large-scale training. Despite this proliferation of methodologies, the field lacks a cohesive framework that unifies underlying principles and delineates application scenarios for these disparate approaches. In this work, we retrospectively analyze the evolutionary trajectory of deep learning optimization algorithms and present a comprehensive empirical evaluation of mainstream optimizers across diverse model architectures and training scenarios. We distill key emerging trends and fundamental design trade-offs, pinpointing promising directions for future research. By synthesizing theoretical insights with extensive empirical evidence, we provide actionable guidance for designing next-generation highly efficient, robust, and trustworthy optimization methods. The code is available at this https URL.
https://arxiv.org/abs/2604.12968
In this article, we propose the optimization of the resolution of time-frequency atoms and the regularization of fitting models to obtain better representations of heart sound signals. This is done by evaluating the classification performance of deep learning (DL) networks in discriminating five heart valvular conditions based on a new class of time-frequency feature matrices derived from the fitting models. We inspect several combinations of resolution and regularization; the optimal combination is the one that provides the highest classification performance. To this end, a fitting model is obtained based on a heart sound signal and an overcomplete dictionary of Gabor atoms using elastic net regularization of linear models. We consider two different DL architectures, the first mainly consisting of a 1D convolutional neural network (CNN) layer and a long short-term memory (LSTM) layer, while the second is composed of 1D and 2D CNN layers followed by an LSTM layer. The networks are trained with two algorithms, namely stochastic gradient descent with momentum (SGDM) and adaptive moment estimation (ADAM). Extensive experimentation has been conducted using a database containing heart sound signals of five heart valvular conditions. The best classification accuracy of $98.95\%$ is achieved with the second architecture when trained with ADAM and feature matrices derived from optimal models obtained with a Gabor dictionary consisting of atoms with high-time low-frequency resolution and imposing sparsity on the models.
https://arxiv.org/abs/2604.12483
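To make the fitting step concrete, here is a minimal sketch of elastic-net regression over an overcomplete Gabor dictionary. The sampling rate `FS`, the resolution grid, and `build_dictionary` are illustrative assumptions, not the paper's exact configuration:

```python
import numpy as np
from sklearn.linear_model import ElasticNet

FS = 2000.0  # assumed sampling rate (Hz)

def gabor_atom(n, center, freq, sigma):
    """Real Gabor atom: Gaussian window of scale sigma modulated by a cosine."""
    t = np.arange(n) / FS
    g = np.exp(-((t - center) ** 2) / (2.0 * sigma ** 2)) * np.cos(2.0 * np.pi * freq * t)
    norm = np.linalg.norm(g)
    return g / norm if norm > 0 else g

def build_dictionary(n, sigmas, freqs, centers):
    """Overcomplete dictionary: one unit-norm column per (sigma, freq, center) triple."""
    atoms = [gabor_atom(n, c, f, s) for s in sigmas for f in freqs for c in centers]
    return np.stack(atoms, axis=1)  # shape (n, n_atoms)

# Short windows give the high-time / low-frequency resolution regime (hypothetical grid).
n = 1000
D = build_dictionary(
    n,
    sigmas=[0.005, 0.01],             # seconds; short windows = time-localized atoms
    freqs=np.linspace(20, 400, 24),   # Hz, roughly the heart-sound band
    centers=np.linspace(0, n / FS, 40),
)

x = np.random.randn(n)  # stand-in for a heart sound segment
fit = ElasticNet(alpha=0.01, l1_ratio=0.9, fit_intercept=False, max_iter=5000).fit(D, x)
coefs = fit.coef_       # sparse fitting model; its nonzero atoms define the
                        # time-frequency feature matrix fed to the CNN/LSTM networks
```

A high `l1_ratio` imposes the sparsity the abstract identifies as part of the optimal configuration.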
Accurate distance estimation from monocular cameras is essential for intelligent monitoring systems. In many deployments, image coordinates are mapped to ground positions using planar homographies initialized by manual selection of corresponding regions. Small inaccuracies in this initialization propagate into systematic distance distortions. This paper derives an explicit relationship between homography perturbations and the resulting distance error, showing that the error grows approximately quadratically with the true distance from the camera. Based on this model, two simple correction strategies are evaluated: regression-based estimation of the quadratic error function and direct optimization of the homography via coordinate-based gradient descent. A large-scale simulation study with more than 19 million test samples demonstrates that regression achieves higher peak accuracy when the model is reliably fitted, whereas gradient descent provides greater robustness against poor initial calibration. This suggests that improving geometric calibration may yield greater performance gains than increasing model complexity in many practical systems.
https://arxiv.org/abs/2604.10805
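A minimal sketch of the regression-based correction strategy, under the paper's approximately quadratic error model; the calibration values below are made up for illustration:

```python
import numpy as np

# Hypothetical calibration pairs: distance estimated from the perturbed
# homography vs. true distance from reference measurements.
d_true = np.array([2.0, 4.0, 6.0, 8.0, 10.0, 12.0])
d_est  = np.array([2.1, 4.3, 6.8, 9.4, 12.2, 15.3])

# The error model says (d_est - d_true) grows roughly quadratically in distance,
# so fit e(d) = a*d^2 + b*d + c and subtract it at inference time.
a, b, c = np.polyfit(d_true, d_est - d_true, deg=2)

def corrected(d):
    # Evaluating e at the estimate rather than the (unknown) true distance is an
    # approximation, adequate when the calibration error is small.
    return d - (a * d ** 2 + b * d + c)
```

The gradient-descent alternative instead treats the homography itself as the free parameter and optimizes it directly against known ground correspondences.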
Science is widely regarded as humanity's most reliable method for uncovering truths about the natural world. Yet the \emph{trajectory} of scientific discovery is rarely examined as an optimization problem in its own right. This paper argues that the body of scientific knowledge, at any given historical moment, represents a \emph{local optimum} rather than a global one--that the frameworks, formalisms, and paradigms through which we understand nature are substantially shaped by historical contingency, cognitive path dependence, and institutional lock-in. Drawing an analogy to gradient descent in machine learning, we propose that science follows the steepest local gradient of tractability, empirical accessibility, and institutional reward, and in doing so may bypass fundamentally superior descriptions of nature. We develop this thesis through detailed case studies spanning mathematics, physics, chemistry, biology, neuroscience, and statistical methodology. We identify three interlocking mechanisms of lock-in--cognitive, formal, and institutional--and argue that recognizing these mechanisms is a prerequisite for designing meta-scientific strategies capable of escaping local optima. We conclude by proposing concrete interventions and discussing the epistemological implications of our thesis for the philosophy of science.
https://arxiv.org/abs/2604.11828
As text-to-image diffusion models grow increasingly prevalent, the ability to remove specific concepts, most notably explicit content and copyrighted characters or styles, has become essential for safety and compliance. Existing unlearning approaches often require costly re-training, modify parameters at the cost of degrading unrelated concept fidelity, or depend on indirect inference-time adjustments that compromise the effectiveness of concept erasure. Inspired by the success of energy-guided sampling in preserving the conditioning of diffusion models, we introduce Energy-Guided Latent Optimization for Concept Erasure (EGLOCE), a training-free approach that removes unwanted concepts by redirecting noisy latents during inference. Our method employs a dual-objective framework: a repulsion energy that steers generation away from target concepts via gradient descent in latent space, and a retention energy that preserves semantic alignment to the original prompt. Unlike previous approaches, which either require error-prone modification of model weights or provide only weak inference-time guidance, EGLOCE operates entirely at inference and enhances erasure performance, enabling plug-and-play integration. Extensive experiments demonstrate that EGLOCE improves concept removal while maintaining image quality and prompt alignment across baselines, even under adversarial attacks. To the best of our knowledge, our work is the first to establish a new paradigm for safe and controllable image generation through dual energy-based guidance during sampling.
https://arxiv.org/abs/2604.09405
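A hedged sketch of the dual-objective latent update described above. `repulsion_energy` and `retention_energy` are assumed callables (the actual energies come from the guidance models); the names, the weighting `lam`, and the single-step form are ours:

```python
import torch

def dual_energy_step(z, repulsion_energy, retention_energy, lam=1.0, step=0.05):
    """One EGLOCE-style update: descend a combined energy over the noisy latent.

    repulsion_energy(z): high when z would generate the erased concept.
    retention_energy(z): high when z drifts from the original prompt's semantics.
    """
    z = z.detach().requires_grad_(True)
    energy = repulsion_energy(z) + lam * retention_energy(z)
    (grad,) = torch.autograd.grad(energy, z)
    return (z - step * grad).detach()
```

In practice such a step would be interleaved with the denoising iterations of the sampler, which is what makes the method training-free and plug-and-play.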
We study how far structured architectural bias can compensate for the absence of end-to-end gradient-based representation learning in visual recognition. Building on the VisNet tradition, we introduce a modular hierarchical framework combining: (i) fixed multi-frequency Gabor decomposition into F=7 parallel streams; (ii) within-stream competitive learning with Hebbian and Oja updates and anti-Hebbian decorrelation; (iii) an associative memory module inspired by modern Hopfield retrieval; and (iv) iterative top-down modulation using local prediction and reconstruction signals. Representational layers are trained without end-to-end backpropagation through the full hierarchy; only the final linear readout and top-down projection matrices are optimized by gradient descent. We therefore interpret the model as a hybrid system that is predominantly locally trained but includes a small number of gradient-trained parameters. On CIFAR-10, the full model reaches 80.1% +/- 0.3% top-1 accuracy (linear probe), compared with 71.0% for a Hebbian-only baseline and 83.4% for a gradient-trained model on the same fixed Gabor basis. On CIFAR-100, performance is 54.8%. Factorial analysis indicates that multi-frequency streams, associative memory, and top-down feedback contribute largely additively, with a significant Streams x TopDown interaction (p=0.02). These results suggest that carefully chosen architectural priors can recover a substantial fraction of the performance typically associated with global gradient training, while leaving a measurable residual gap. Experiments are limited to CIFAR-10/100.
https://arxiv.org/abs/2604.09734
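For reference, the Oja update used in the within-stream competitive learning is a local rule of the following standard form, here for a bank of linear units; the anti-Hebbian decorrelation across units that the abstract also mentions is omitted:

```python
import numpy as np

def oja_step(W, x, lr=1e-3):
    """One Oja update for a bank of linear units (rows of W), a purely local rule.

    The Hebbian term y*x grows weights along the input; the -y^2 * w decay keeps
    each row near unit norm, so rows converge toward leading principal directions.
    """
    y = W @ x                                          # (k,) unit activations
    W += lr * (np.outer(y, x) - (y ** 2)[:, None] * W)
    return W
```

No global error signal is propagated: each row updates from its own activation and the input, which is what lets the representational layers train without backpropagation.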
Quantum operations on pure states can be fully represented by unitary matrices. Variational quantum circuits, also known as quantum neural networks, embed data and trainable parameters into gate-based operations and optimize the parameters via gradient descent. The high cost of training and the low fidelity of current quantum devices, however, restrict much of quantum machine learning to classical simulation. For few-qubit problems with large datasets, training the matrix elements directly, as is done with weight matrices in classical neural networks, can be faster than decomposing data and parameters into gates. We propose a method that trains matrices directly while maintaining unitarity through a single regularization term added to the loss function. A second training step, circuit alignment, then recovers a gate-based architecture from the resulting soft-unitary matrix. On a five-qubit supervised classification task with 1000 datapoints, this two-step process produces a trained variational circuit in under four minutes, compared to over two hours for direct circuit training, while achieving lower binary cross-entropy loss. In a second experiment, soft-unitaries are embedded in a hybrid quantum-classical network for a reinforcement learning cartpole task, where the hybrid agent outperforms a purely classical baseline of comparable size.
https://arxiv.org/abs/2604.06523
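A minimal sketch of such a unitarity-maintaining regularizer, applied to a directly trained 5-qubit (32x32) matrix; the penalty weight 0.1 and the stand-in task loss are illustrative, not the paper's values:

```python
import torch

def soft_unitarity_penalty(U):
    """||U^dagger U - I||_F^2: zero iff U is unitary; added to the task loss."""
    eye = torch.eye(U.shape[-1], dtype=U.dtype, device=U.device)
    gram = U.conj().transpose(-2, -1) @ U
    return ((gram - eye).abs() ** 2).sum()

# Hypothetical training step on a directly parameterized 5-qubit matrix:
U = torch.randn(32, 32, dtype=torch.complex64, requires_grad=True)
opt = torch.optim.Adam([U], lr=1e-2)
task_loss = torch.tensor(0.0)  # stand-in for the binary cross-entropy term
loss = task_loss + 0.1 * soft_unitarity_penalty(U)
loss.backward()
opt.step()
```

The resulting "soft-unitary" is only approximately unitary, which is why the second circuit-alignment step is needed to recover an exact gate-based architecture.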
State space models (SSMs) have been shown to possess the theoretical capacity to model both star-free sequential tasks and bounded hierarchical structures (Sarrof et al., 2024). However, formal expressivity results do not guarantee that gradient-based optimisation will reliably discover the corresponding solutions. Existing benchmarks probe either monotonic state tracking, as in the standard Flip-Flop task, or structural nesting, as in the Dyck languages, but neither isolates reversible semantic state retrieval. We introduce the UNDO Flip-Flop task to fill this gap. By extending the standard Flip-Flop with an UNDO, the task requires a model to maintain an implicit bounded stack and recover historical states under non-monotonic update sequences. We evaluate one-layer and two-layer Mamba-2 under this framework. Both variants fail to acquire the provably expressible stack-based rollback mechanism, converging instead on a local toggle heuristic that inverts the current state rather than retrieving stored history. Under an adversarial retraction pressure test held within the training length distribution, the two-layer model collapses to 41.10% accuracy, which is below random chance. The results confirm systematic rather than incidental failure. Causal ablation shows that the bottleneck lies in retrieval, not storage. These results draw a clear line between what an architecture can in principle represent and what gradient descent reliably learns, a distinction that theoretical expressivity analyses alone cannot capture.
https://arxiv.org/abs/2604.05923
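A hedged reconstruction of how such task data could be generated, assuming a write/read/UNDO token vocabulary; the exact token set and sequence statistics are not specified in the abstract:

```python
import random

def undo_flipflop(length, seed=0):
    """Generate one UNDO Flip-Flop sequence (illustrative reconstruction).

    'w b' writes bit b, 'r' reads the current bit, 'u' (UNDO) pops the last
    write, restoring the previous state -- so correct reads after UNDOs
    require an implicit bounded stack, not just the current toggle state.
    """
    rng = random.Random(seed)
    stack, tokens, targets = [0], [], []
    for _ in range(length):
        op = rng.choice(["w", "r", "u"])
        if op == "w":
            b = rng.randint(0, 1)
            stack.append(b)
            tokens += ["w", str(b)]
        elif op == "u" and len(stack) > 1:
            stack.pop()
            tokens.append("u")
        else:
            tokens.append("r")
            targets.append((len(tokens) - 1, stack[-1]))  # (position, expected bit)
    return tokens, targets
```

The local toggle heuristic the models converge to would pass many such sequences but fail whenever an UNDO must restore a bit written several steps earlier.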
The Composed Image Retrieval (CIR) task aims to retrieve target images based on reference images and modification texts. Current CIR methods primarily rely on fine-tuning vision-language pre-trained (VLP) models. However, we find that these approaches commonly suffer from severe overfitting, posing challenges for CIR with limited triplet data. To better understand this issue, we present a systematic study of overfitting in VLP-based CIR, revealing a significant and previously overlooked generalization gap across different models and datasets. Motivated by these findings, we introduce WRF4CIR, a Weight-Regularized Fine-tuning network for CIR. Specifically, during the fine-tuning process, we apply adversarial perturbations to the model weights for regularization, where these perturbations are generated in the opposite direction of gradient descent. Intuitively, WRF4CIR increases the difficulty of fitting the training data, which helps mitigate overfitting in CIR under limited triplet supervision. Extensive experiments on benchmark datasets demonstrate that WRF4CIR significantly narrows the generalization gap and achieves substantial improvements over existing methods.
https://arxiv.org/abs/2604.05583
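A sketch of the weight-perturbation step as we read it, in the spirit of sharpness-aware minimization but with the perturbation taken along the ascent direction (opposite to the gradient-descent step). `loss_fn`, `rho`, and the perturb/restore structure are our assumptions:

```python
import torch

def perturbed_backward(model, loss_fn, batch, rho=0.05):
    """Compute gradients at adversarially perturbed weights (hedged sketch)."""
    params = [p for p in model.parameters() if p.requires_grad]
    loss = loss_fn(model, batch)
    grads = torch.autograd.grad(loss, params)
    scale = rho / (torch.sqrt(sum((g ** 2).sum() for g in grads)) + 1e-12)
    with torch.no_grad():
        for p, g in zip(params, grads):
            p.add_(g, alpha=scale.item())      # move against the descent direction
    loss_fn(model, batch).backward()           # fit under the adversarial weights
    with torch.no_grad():
        for p, g in zip(params, grads):
            p.sub_(g, alpha=scale.item())      # restore the original weights
    # optimizer.step() then applies the gradient computed at the perturbed point
```

Fitting the training data from a deliberately worsened starting point is what makes the objective harder to overfit under limited triplet supervision.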
Gradient normalization is central in deep-learning optimization because it stabilizes training and reduces sensitivity to scale. For deep architectures, parameters are naturally grouped into matrices or blocks, so spectral normalizations are often more faithful than coordinatewise Euclidean ones; Muon is the main motivating example of this paper. More broadly, we study a family of spectral normalization rules, ranging from ordinary gradient descent to Muon and intermediate Schatten-type schemes, in a mean-field regime where parameters are modeled by probability measures. We introduce a family of Spectral Wasserstein distances indexed by a norm $\gamma$ on positive semidefinite matrices. The trace norm recovers the classical quadratic Wasserstein distance, the operator norm recovers the Muon geometry, and intermediate Schatten norms interpolate between them. We develop the static Kantorovich formulation, prove comparison bounds with $W_2$, derive a max-min representation, and obtain a conditional Brenier theorem. For Gaussian marginals, the problem reduces to a constrained optimization on covariance matrices, extending the Bures formula and yielding a closed form for commuting covariances in the Schatten family. For monotone norms, including all Schatten cases, we prove the equivalence between the static and dynamic Benamou-Brenier formulations, deduce that the resulting transport cost is a genuine metric equivalent to $W_2$ in fixed dimension, and show that the induced Gaussian covariance cost is also a metric. We then interpret the associated normalized continuity equation as a Spectral Wasserstein gradient flow, identify its exact finite-particle counterpart as a normalized matrix flow, obtain first geodesic-convexity results, and show how positively homogeneous mean-field models induce a spectral unbalanced transport on the sphere.
https://arxiv.org/abs/2604.04891
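One way to picture the family of update rules, as a hedged sketch: act on the singular values of the gradient matrix, with exponent 1 recovering ordinary gradient descent and exponent 0 recovering the Muon-style orthogonalized update. This parameterization is ours for illustration, not necessarily the paper's exact Schatten family:

```python
import torch

def spectral_update(G, alpha=0.0):
    """Spectrally normalized update direction from a gradient matrix G.

    Write G = U diag(s) V^T and return U diag(s**alpha) V^T:
      alpha = 1 -> ordinary gradient descent (G unchanged),
      alpha = 0 -> Muon-style orthogonalization (all singular values -> 1),
      0 < alpha < 1 -> intermediate Schatten-type interpolations.
    """
    U, s, Vh = torch.linalg.svd(G, full_matrices=False)
    return U @ torch.diag(s ** alpha) @ Vh
```

The mean-field analysis then replaces such per-matrix updates with flows of probability measures, which is where the Spectral Wasserstein geometry enters.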
Autonomous vehicles increasingly rely on deep learning-based perception and control, which impose substantial computational demands. Cloud-assisted architectures offload these functions to remote servers, enabling enhanced perception and coordinated decision-making through the Internet of Vehicles (IoV). However, this paradigm introduces cross-layer vulnerabilities, where adversarial manipulation of perception models and network impairments in the vehicle-cloud link can jointly undermine safety-critical autonomy. This paper presents a hardware-in-the-loop IoV testbed that integrates real-time perception, control, and communication to evaluate such vulnerabilities in cloud-assisted autonomous driving. A YOLOv8-based object detector deployed on the cloud is subjected to white-box adversarial attacks using the Fast Gradient Sign Method (FGSM) and Projected Gradient Descent (PGD), while network adversaries induce delay and packet loss in the vehicle-cloud loop. Results show that adversarial perturbations significantly degrade perception performance, with PGD reducing detection precision and recall from 0.73 and 0.68 in the clean baseline to 0.22 and 0.15 at epsilon = 0.04. Network delays of 150-250 ms, corresponding to transient losses of approximately 3-4 frames, and packet loss rates of 0.5-5% further destabilize closed-loop control, leading to delayed actuation and rule violations. These findings highlight the need for cross-layer resilience in cloud-assisted autonomous driving systems.
https://arxiv.org/abs/2604.04349
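Both attacks are standard; a minimal PyTorch sketch, with `loss_fn` standing in for the detector's training loss (which for YOLOv8 is more involved than plain classification):

```python
import torch

def fgsm(model, x, y, loss_fn, eps=0.04):
    """Fast Gradient Sign Method: one signed-gradient step on the input."""
    x = x.clone().detach().requires_grad_(True)
    loss_fn(model(x), y).backward()
    return (x + eps * x.grad.sign()).clamp(0, 1).detach()

def pgd(model, x, y, loss_fn, eps=0.04, step=0.01, iters=10):
    """Projected Gradient Descent: iterated signed steps, projected into the eps-ball."""
    x_adv = x.clone().detach()
    for _ in range(iters):
        x_adv.requires_grad_(True)
        loss_fn(model(x_adv), y).backward()
        x_adv = x_adv + step * x_adv.grad.sign()
        x_adv = x + (x_adv - x).clamp(-eps, eps)  # project to L-inf ball around x
        x_adv = x_adv.clamp(0, 1).detach()
    return x_adv
```

The iterated projection is why PGD degrades the detector far more than single-step FGSM at the same epsilon budget.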
Stochastic bilevel optimization (SBO) has been integrated into many machine learning paradigms recently, including hyperparameter optimization, meta-learning, and reinforcement learning. Along with the wide range of applications, there have been numerous studies on the computational behavior of SBO. However, the generalization guarantees of SBO methods are far less understood from the lens of statistical learning theory. In this paper, we provide a systematic generalization analysis of the first-order gradient-based bilevel optimization methods. First, we establish the quantitative connections between the on-average argument stability and the generalization gap of SBO methods. Then, we derive the upper bounds of on-average argument stability for single-timescale stochastic gradient descent (SGD) and two-timescale SGD, where three settings (nonconvex-nonconvex (NC-NC), convex-convex (C-C), and strongly-convex-strongly-convex (SC-SC)) are considered respectively. Experimental analysis validates our theoretical findings. Compared with the previous algorithmic stability analysis, our results do not require reinitializing the inner-level parameters at each iteration and are applicable to more general objective functions.
https://arxiv.org/abs/2604.04090
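For concreteness, a hedged sketch of one two-timescale step for the kind of first-order method analyzed, ignoring the hypergradient through the inner solution as first-order schemes do; the step sizes and the single inner update per outer update are illustrative:

```python
import torch

def two_timescale_sgd_step(x, y, outer_loss, inner_loss, eta_x=1e-3, eta_y=1e-2):
    """One step for min_x F(x, y*(x)) with y*(x) = argmin_y G(x, y).

    x, y: tensors with requires_grad=True. The inner variable y moves on a
    faster timescale (larger step) than the outer variable x.
    """
    (gy,) = torch.autograd.grad(inner_loss(x, y), y)
    y = (y - eta_y * gy).detach().requires_grad_(True)   # fast inner update
    (gx,) = torch.autograd.grad(outer_loss(x, y), x)
    x = (x - eta_x * gx).detach().requires_grad_(True)   # slow outer update
    return x, y
```

Because y is carried across iterations rather than re-solved, the stability analysis must track both variables jointly, which is why not reinitializing the inner parameters matters.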
We analyze the last-iterate convergence of the Anchored Gradient Descent Ascent algorithm for smooth convex-concave min-max problems. While previous work established a last-iterate rate of $\mathcal{O}(1/t^{2-2p})$ for the squared gradient norm, where $p \in (1/2, 1)$, it remained an open problem whether the improved exact $\mathcal{O}(1/t)$ rate was achievable. In this work, we resolve this question in the affirmative. This result was discovered autonomously by an AI system capable of writing formal proofs in Lean. The Lean proof can be accessed at this https URL
https://arxiv.org/abs/2604.03782
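For readers unfamiliar with the method, one common anchoring scheme has the following form; this is a hedged sketch, and the paper's exact iteration and anchor schedule may differ:

```latex
% For the saddle operator F(z) = (\nabla_x f(x,y), -\nabla_y f(x,y)) with z = (x, y):
\[
  z_{k+1} \;=\; z_k \;+\; \beta_k\,(z_0 - z_k) \;-\; \eta\, F(z_k),
  \qquad \beta_k \;\propto\; (k+1)^{-p}, \quad p \in (1/2, 1).
\]
% The anchor term pulls iterates back toward the initial point z_0 with a
% decaying weight; the prior O(1/t^{2-2p}) rate is stated for this p-range.
```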
All prior membership inference attacks for fine-tuned language models use hand-crafted heuristics (e.g., loss thresholding, Min-K\%, reference calibration), each bounded by the designer's intuition. We introduce the first transferable learned attack, enabled by the observation that fine-tuning any model on any corpus yields unlimited labeled data, since membership is known by construction. This removes the shadow model bottleneck and brings membership inference into the deep learning era: learning what matters rather than designing it, with generalization through training diversity and scale. We discover that fine-tuning language models produces an invariant signature of memorization detectable across architectural families and data domains. We train a membership inference classifier exclusively on transformer-based models. It transfers zero-shot to Mamba (state-space), RWKV-4 (linear attention), and RecurrentGemma (gated recurrence), achieving 0.963, 0.972, and 0.936 AUC respectively. Each evaluation combines an architecture and dataset never seen during training, yet all three exceed performance on held-out transformers (0.908 AUC). These four families share no computational mechanisms; their only commonality is gradient descent on cross-entropy loss. Even simple likelihood-based methods exhibit strong transfer, confirming the signature exists independently of the detection method. Our method, Learned Transfer MIA (LT-MIA), captures this signal most effectively by reframing membership inference as sequence classification over per-token distributional statistics. On transformers, LT-MIA achieves 2.8$\times$ higher TPR at 0.1\% FPR than the strongest baseline. The method also transfers to code (0.865 AUC) despite being trained only on natural-language text. Code and trained classifier available at this https URL.
https://arxiv.org/abs/2604.03199
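A sketch of the kind of per-token distributional statistics such a classifier might consume, assuming a Hugging Face-style causal LM whose forward returns `.logits`; the specific feature set (log-likelihood, entropy, rank) is our guess, not necessarily LT-MIA's:

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def per_token_stats(model, input_ids):
    """Per-token features for a candidate text; input_ids has shape (1, T).

    Returns a (T-1, 3) sequence of (true-token log-prob, predictive entropy,
    rank of the true token) that a downstream sequence classifier consumes.
    """
    logits = model(input_ids).logits[:, :-1]   # predict token t+1 from prefix
    logp = F.log_softmax(logits, dim=-1)
    targets = input_ids[:, 1:]
    tok_logp = logp.gather(-1, targets.unsqueeze(-1)).squeeze(-1)
    entropy = -(logp.exp() * logp).sum(-1)
    rank = (logp > tok_logp.unsqueeze(-1)).sum(-1).float()
    return torch.stack([tok_logp, entropy, rank], dim=-1).squeeze(0)
```

Because these statistics are defined by the output distribution alone, they apply unchanged to any architecture trained with cross-entropy, which is consistent with the observed zero-shot transfer.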
Energy-based models (EBMs) implement inference as gradient descent on a learned Lyapunov function, yielding interpretable, structure-preserving alternatives to black-box neural ODEs and aligning naturally with physical AI. Yet their use in system identification remains limited, and existing architectures lack formal stability guarantees that globally preclude unstable modes. We address this gap by introducing an EBM framework for system identification with stable, dissipative, absorbing invariant dynamics. Unlike classical global Lyapunov stability, absorbing invariance expands the class of stability-preserving architectures, enabling more flexible and expressive EBMs. We extend EBM theory to nonsmooth activations by establishing negative energy dissipation via Clarke derivatives and deriving new conditions for radial unboundedness, exposing a stability-expressivity tradeoff in standard EBMs. To overcome this, we introduce a hybrid architecture with a dynamical visible layer and static hidden layers, prove absorbing invariance under mild assumptions, and show that these guarantees extend to port-Hamiltonian EBMs. Experiments on metric-deformed multi-well and ring systems validate the approach, showcasing how our hybrid EBM architecture combines expressivity with sound and provable safety guarantees by design.
https://arxiv.org/abs/2604.00277
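The inference scheme in the first sentence is a discretized gradient flow on the learned energy; a minimal sketch:

```python
import torch

def ebm_inference(E, x0, step=0.01, steps=100):
    """Explicit-Euler gradient descent on a learned energy E, i.e. the
    discretized flow dx/dt = -grad E(x). If E is a valid Lyapunov function,
    the energy decreases monotonically along the trajectory."""
    x = x0.detach().clone()
    for _ in range(steps):
        x.requires_grad_(True)
        (g,) = torch.autograd.grad(E(x), x)
        x = (x - step * g).detach()
    return x
```

The paper's contribution concerns which architectures for E guarantee stable, dissipative, absorbing invariant dynamics under this flow, including with nonsmooth activations.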
Single-shot neural decoders commit to answers without iterative refinement, while chain-of-thought methods introduce discrete intermediate steps but lack a scalar measure of reasoning progress. We propose Energy-Based Reasoning via Structured Latent Planning (EBRM), which models reasoning as gradient-based optimization of a multi-step latent trajectory $z_{1:T}$ under a learned energy function $E(h_x, z)$. The energy decomposes into per-step compatibility, transition consistency, and trajectory smoothness terms. Training combines supervised encoder-decoder learning with contrastive energy shaping using hard negatives, while inference performs gradient descent or Langevin dynamics over $z$ and decodes from $z_T$. We identify a critical failure mode: on CNF logic satisfaction, latent planning reduces accuracy from $\approx 95\%$ to $\approx 56\%$. This degradation arises from a distribution mismatch, where the decoder is trained on encoder outputs $h_x$ but evaluated on planner outputs $z_T$ that drift into unseen latent regions. We analyze this behavior through per-step decoding, latent drift tracking, and gradient decomposition. To address it, we propose dual-path decoder training and latent anchoring. We further introduce a six-part ablation protocol covering component contributions, trajectory length, planner dynamics, initialization, decoder training distribution, and anchor weight. Experiments on three synthetic tasks show that energy decreases monotonically and induces structured latent trajectories on graph and logic tasks, while remaining flat on arithmetic ($r = 0.073$), indicating a negative result. Code is available at this https URL.
https://arxiv.org/abs/2603.28248
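A hedged sketch of the decomposed trajectory energy and the planner's inner loop; the concrete forms of `step_E` and `trans_E` and all hyperparameters are stand-ins:

```python
import torch

def trajectory_energy(h_x, z, step_E, trans_E, smooth_w=0.1):
    """EBRM-style energy over a latent trajectory z of shape (T, d):
    per-step compatibility with the encoded input h_x, transition consistency
    between consecutive steps, and a quadratic smoothness term."""
    compat = sum(step_E(h_x, z[t]) for t in range(z.shape[0]))
    trans = sum(trans_E(z[t], z[t + 1]) for t in range(z.shape[0] - 1))
    smooth = ((z[1:] - z[:-1]) ** 2).sum()
    return compat + trans + smooth_w * smooth

def plan(h_x, z_init, step_E, trans_E, lr=0.05, iters=50):
    """Gradient descent over the whole trajectory; decode from z[-1] afterwards."""
    z = z_init.detach().clone()
    for _ in range(iters):
        z.requires_grad_(True)
        (g,) = torch.autograd.grad(trajectory_energy(h_x, z, step_E, trans_E), z)
        z = (z - lr * g).detach()
    return z
```

The failure mode the paper identifies arises exactly here: the optimized z[-1] can drift out of the latent region the decoder saw during training, which the proposed anchoring term counteracts.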
Accurate and temporally consistent segmentation of the left ventricle from echocardiography videos is essential for estimating the ejection fraction and assessing cardiac function. However, modeling spatiotemporal dynamics remains difficult due to severe speckle noise and rapid non-rigid deformations. Existing linear recurrent models offer efficient in-context associative recall for temporal tracking, but rely on unconstrained state updates, which cause progressive singular value decay in the state matrix, a phenomenon known as rank collapse, resulting in anatomical details being overwhelmed by noise. To address this, we propose OSA, a framework that constrains the state evolution on the Stiefel manifold. We introduce the Orthogonalized State Update (OSU) mechanism, which formulates the memory evolution as Euclidean projected gradient descent on the Stiefel manifold to prevent rank collapse and maintain stable temporal transitions. Furthermore, an Anatomical Prior-aware Feature Enhancement module explicitly separates anatomical structures from speckle noise through a physics-driven process, providing the temporal tracker with noise-resilient structural cues. Comprehensive experiments on the CAMUS and EchoNet-Dynamic datasets show that OSA achieves state-of-the-art segmentation accuracy and temporal stability, while maintaining real-time inference efficiency for clinical deployment. Codes are available at this https URL.
https://arxiv.org/abs/2603.26188
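A minimal sketch of a Euclidean projected gradient step on the Stiefel manifold, the mechanism OSU builds on; the SVD-based polar projection is one standard choice, and OSU's exact formulation may differ:

```python
import torch

def stiefel_projected_step(S, G, lr=0.1):
    """Euclidean gradient step on a state matrix S, then projection back onto
    the Stiefel manifold (S^T S = I) via the polar factor of an SVD. Keeping
    all singular values at 1 prevents the progressive singular-value decay
    (rank collapse) of unconstrained state updates."""
    Y = S - lr * G                                   # Euclidean step
    U, _, Vh = torch.linalg.svd(Y, full_matrices=False)
    return U @ Vh                                    # nearest orthonormal matrix to Y
```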
Random cropping is one of the most common data augmentation techniques in computer vision, yet the role of its inherent randomness in training differentially private machine learning models has thus far gone unexplored. We observe that when sensitive content in an image is spatially localized, such as a face or license plate, random cropping can probabilistically exclude that content from the model's input. This introduces a third source of stochasticity in differentially private training with stochastic gradient descent, in addition to gradient noise and minibatch sampling. This additional randomness amplifies differential privacy without requiring changes to model architecture or training procedure. We formalize this effect by introducing a patch-level neighboring relation for vision data and deriving tight privacy bounds for differentially private stochastic gradient descent (DP-SGD) when combined with random cropping. Our analysis quantifies the patch inclusion probability and shows how it composes with minibatch sampling to yield a lower effective sampling rate. Empirically, we validate that patch-level amplification improves the privacy-utility trade-off across multiple segmentation architectures and datasets. Our results demonstrate that aligning privacy accounting with domain structure and additional existing sources of randomness can yield stronger guarantees at no additional cost.
https://arxiv.org/abs/2603.24695
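In its simplest independent form, the composition argument reduces to multiplying the two inclusion probabilities; this is a simplification of the paper's tight bounds, for intuition only:

```python
def effective_sampling_rate(q_batch, p_patch):
    """An example influences a DP-SGD step only if it is sampled into the
    minibatch (rate q_batch) AND the random crop includes its sensitive patch
    (probability p_patch), giving a lower effective rate for privacy accounting
    under an independence assumption."""
    return q_batch * p_patch

# e.g. 1% Poisson sampling, crop covers the patch 30% of the time -> 0.3% effective rate
assert abs(effective_sampling_rate(0.01, 0.3) - 0.003) < 1e-12
```

A lower effective sampling rate yields stronger privacy amplification at the same noise level, which is the source of the improved privacy-utility trade-off.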
Mechanistic interpretability reveals that safety-critical behaviors (e.g., alignment, jailbreak, backdoor) in Large Language Models (LLMs) are grounded in specialized functional components. However, existing safety attribution methods struggle with generalization and reliability due to their reliance on heuristic, domain-specific metrics and search algorithms. To address this, we propose \ourmethod, a unified safety interpretability framework that identifies functionally complete safety circuits in LLMs via optimization. Unlike methods focusing on isolated heads or neurons, \ourmethod introduces differentiable binary masks to extract multi-granular circuits through gradient descent on safety datasets, while integrates Safety Circuit Tuning to utilize these sparse circuits for efficient safety fine-tuning. We validate \ourmethod in two key scenarios in LLM safety: \textbf{(1) backdoor attacks}, identifying a backdoor circuit with 0.42\% sparsity, whose ablation eradicates the Attack Success Rate (ASR) from 100\% $\to$ 0.4\% while retaining over 99\% general utility; \textbf{(2) safety alignment}, localizing an alignment circuit with 3.03\% heads and 0.79\% neurons, whose removal spikes ASR from 0.8\% $\to$ 96.9\%, whereas excluding this circuit during helpfulness fine-tuning maintains 96.5\% safety retention.
https://arxiv.org/abs/2603.23268
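A hedged sketch of the differentiable binary masks: sigmoid-relaxed gates trained by gradient descent with a sparsity penalty and then thresholded into a hard circuit. The relaxation and threshold are standard choices, not necessarily \ourmethod's exact ones:

```python
import torch

class CircuitMask(torch.nn.Module):
    """One learnable gate per component (attention head or neuron)."""

    def __init__(self, n_components):
        super().__init__()
        # Initialize gates near 1 so the unmasked model is recovered at the start.
        self.logits = torch.nn.Parameter(torch.full((n_components,), 3.0))

    def forward(self, component_outputs):
        # component_outputs: (..., n_components); soft gating keeps the mask differentiable.
        return torch.sigmoid(self.logits) * component_outputs

    def sparsity(self):
        # L1 penalty on mask values, added to the task loss to drive gates toward 0.
        return torch.sigmoid(self.logits).sum()

    def hard(self, tau=0.5):
        # Thresholded mask = the extracted circuit.
        return (torch.sigmoid(self.logits) > tau).float()
```

Optimizing the task objective on safety data plus the sparsity term concentrates the behavior into a small set of open gates, which is what yields circuits at 0.42% or 0.79% sparsity.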
Three-dimensional (3D) handheld photoacoustic tomography typically relies on bulky and expensive external positioning sensors to correct motion artifacts, which severely limits its clinical flexibility and accessibility. To address this challenge, we present PA-SFM, a tracker-free framework that leverages exclusively single-modality photoacoustic data for both sensor pose recovery and high-fidelity 3D reconstruction via differentiable acoustic radiation modeling. Unlike traditional structure-from-motion (SFM) methods based on visual features, PA-SFM integrates the acoustic wave equation into a differentiable programming pipeline. By leveraging a high-performance, GPU-accelerated acoustic radiation kernel, the framework simultaneously optimizes the 3D photoacoustic source distribution and the sensor array pose via gradient descent. To ensure robust convergence in freehand scenarios, we introduce a coarse-to-fine optimization strategy that incorporates geometric consistency checks and rigid-body constraints to eliminate motion outliers. We validated the proposed method through both numerical simulations and in-vivo rat experiments. The results demonstrate that PA-SFM achieves sub-millimeter positioning accuracy and restores high-resolution 3D vascular structures comparable to ground-truth benchmarks, offering a low-cost, software-defined solution for clinical freehand photoacoustic imaging. The source code is publicly available at \href{this https URL}{this https URL}.
https://arxiv.org/abs/2604.09643
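A hedged sketch of the joint refinement loop, with `forward_model` standing in for the GPU-accelerated differentiable acoustic radiation kernel; the optimizer, loss, and the coarse-to-fine schedule with consistency checks are simplified away:

```python
import torch

def joint_refine(forward_model, measurements, source, pose, lr=1e-2, iters=200):
    """Jointly refine the 3D photoacoustic source distribution and the sensor
    array pose by gradient descent on the data misfit, using the differentiable
    acoustic forward model (assumed given) to predict sensor signals."""
    source = source.detach().clone().requires_grad_(True)
    pose = pose.detach().clone().requires_grad_(True)
    opt = torch.optim.Adam([source, pose], lr=lr)
    for _ in range(iters):
        opt.zero_grad()
        loss = ((forward_model(source, pose) - measurements) ** 2).mean()
        loss.backward()
        opt.step()
    return source.detach(), pose.detach()
```

Recovering the pose from the acoustic data itself is what removes the need for an external positioning sensor.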