Paper Reading AI Learner

A Unified Theory of Random Projection for Influence Functions

2026-02-11 02:42:04
Pingbang Hu, Yuzheng Hu, Jiaqi W. Ma, Han Zhao

Abstract

Influence functions and related data attribution scores take the form of $g^{\top}F^{-1}g^{\prime}$, where $F\succeq 0$ is a curvature operator. In modern overparameterized models, forming or inverting $F\in\mathbb{R}^{d\times d}$ is prohibitive, motivating scalable influence computation via random projection with a sketch $P \in \mathbb{R}^{m\times d}$. This practice is commonly justified via the Johnson--Lindenstrauss (JL) lemma, which ensures approximate preservation of Euclidean geometry for a fixed dataset. However, JL does not address how sketching behaves under inversion. Furthermore, there is no existing theory that explains how sketching interacts with other widely-used techniques, such as ridge regularization and structured curvature approximations. We develop a unified theory characterizing when projection provably preserves influence functions. When $g,g^{\prime}\in\text{range}(F)$, we show that: 1) Unregularized projection: exact preservation holds iff $P$ is injective on $\text{range}(F)$, which necessitates $m\geq \text{rank}(F)$; 2) Regularized projection: ridge regularization fundamentally alters the sketching barrier, with approximation guarantees governed by the effective dimension of $F$ at the regularization scale; 3) Factorized influence: for Kronecker-factored curvatures $F=A\otimes E$, the guarantees continue to hold for decoupled sketches $P=P_A\otimes P_E$, even though such sketches exhibit row correlations that violate i.i.d. assumptions. Beyond this range-restricted setting, we analyze out-of-range test gradients and quantify a \emph{leakage} term that arises when test gradients have components in $\ker(F)$. This yields guarantees for influence queries on general test points. Overall, this work develops a novel theory that characterizes when projection provably preserves influence and provides principled guidance for choosing the sketch size in practice.

URL

https://arxiv.org/abs/2602.10449

PDF

https://arxiv.org/pdf/2602.10449.pdf

