Beyond Pooling: Matching for Robust Generalization under Data Heterogeneity

2026-02-06 19:56:02
Ayush Roy, Rudrasis Chakraborty, Lav Varshney, Vishnu Suresh Lokhande

Abstract

Pooling heterogeneous datasets across domains is a common strategy in representation learning, but naive pooling can amplify distributional asymmetries and yield biased estimators, especially in settings where zero-shot generalization is required. We propose a matching framework that selects samples relative to an adaptive centroid and iteratively refines the representation distribution. The double robustness and the propensity score matching for the inclusion of data domains make matching more robust than naive pooling and uniform subsampling by filtering out the confounding domains (the main cause of heterogeneity). Theoretical and empirical analyses show that, unlike naive pooling or uniform subsampling, matching achieves better results under asymmetric meta-distributions, which are also extended to non-Gaussian and multimodal real-world settings. Most importantly, we show that these improvements translate to zero-shot medical anomaly detection, one of the extreme forms of data heterogeneity and asymmetry. The code is available on this https URL.
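To make the idea of selecting samples relative to an adaptive centroid and iteratively refining the selection concrete, here is a minimal sketch. It is an illustration under stated assumptions, not the authors' algorithm: the nearest-k Euclidean selection rule, the `keep_fraction` parameter, the stopping rule, and the function names `centroid` and `match_to_centroid` are all hypothetical, and the propensity-score and double-robustness components mentioned in the abstract are not modeled. The toy pool mixes a large cluster with a smaller shifted one to mimic an asymmetric pool.

```python
import math
import random


def centroid(points):
    """Coordinate-wise mean of a non-empty list of equal-length vectors."""
    dim = len(points[0])
    return [sum(p[i] for p in points) / len(points) for i in range(dim)]


def match_to_centroid(points, keep_fraction=0.5, max_iters=20):
    """Iteratively keep the samples closest to an adaptive centroid.

    Hypothetical illustration: the nearest-k Euclidean rule and the
    stopping criterion are assumptions, not the paper's method.
    """
    k = max(1, int(len(points) * keep_fraction))
    center = centroid(points)  # start from the naive pooled mean
    selected = None
    for _ in range(max_iters):
        # Rank every sample by its distance to the current centroid.
        order = sorted(range(len(points)),
                       key=lambda i: math.dist(points[i], center))
        new_selected = sorted(order[:k])
        if new_selected == selected:  # the matched set has stabilized
            break
        selected = new_selected
        # Refine the centroid using only the matched samples.
        center = centroid([points[i] for i in selected])
    return selected, center


# Toy asymmetric pool: 80 samples near the origin, 20 from a shifted domain.
rng = random.Random(0)
main = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(80)]
shifted = [[rng.gauss(8, 1), rng.gauss(8, 1)] for _ in range(20)]
pool = main + shifted

pooled_mean = centroid(pool)
selected, matched_mean = match_to_centroid(pool)
print("pooled mean: ", pooled_mean)
print("matched mean:", matched_mean)
```

In this toy setting the pooled mean is pulled toward the shifted domain, while the matched centroid is computed only from samples of the dominant cluster. This is meant only to convey the intuition behind filtering out a confounding domain, and says nothing about the paper's reported results.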

URL

https://arxiv.org/abs/2602.07154

PDF

https://arxiv.org/pdf/2602.07154.pdf

