Paper Reading AI Learner

A Method of Moments Embedding Constraint and its Application to Semi-Supervised Learning

2024-04-27 18:41:32
Michael Majurski, Sumeet Menon, Parniyan Farvardin, David Chapman

Abstract

Discriminative deep learning models with a linear+softmax final layer have a problem: the latent space only predicts the conditional probabilities $p(Y|X)$ but not the full joint distribution $p(Y,X)$, which would require a generative approach. The conditional probability cannot detect outliers, making softmax networks sensitive to outliers. This exacerbates model over-confidence, contributing to problems such as hallucinations, confounding biases, and dependence on large datasets. To address this, we introduce a novel embedding constraint based on the Method of Moments (MoM). We investigate the use of polynomial moments ranging from 1st- through 4th-order hyper-covariance matrices. Furthermore, we use this embedding constraint to train an Axis-Aligned Gaussian Mixture Model (AAGMM) final layer, which learns not only the conditional but also the joint distribution of the latent space. We apply this method to semi-supervised image classification by extending FlexMatch with our technique. We find that our MoM constraint with the AAGMM layer matches the reported FlexMatch accuracy while also modeling the joint distribution, thereby reducing outlier sensitivity. We also present a preliminary outlier detection strategy based on Mahalanobis distance and discuss future improvements to this strategy. Code is available at: \url{this https URL}
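For intuition, here is a minimal PyTorch sketch (not the authors' released implementation) of a moment-matching embedding penalty in the spirit of the MoM constraint described above: the empirical 1st- through 4th-order moments of the latent embeddings are pulled toward those of a standard normal. The per-dimension (diagonal) moments, the standard-normal targets, and the name `mom_embedding_loss` are simplifying assumptions; the paper itself works with hyper-covariance matrices up to 4th order.

```python
import torch


def mom_embedding_loss(z: torch.Tensor) -> torch.Tensor:
    """z: (batch, latent_dim) embeddings from the penultimate layer."""
    m1 = z.mean(dim=0)         # empirical 1st moment; standard-normal target: 0
    m2 = (z ** 2).mean(dim=0)  # empirical 2nd moment; target: 1
    m3 = (z ** 3).mean(dim=0)  # empirical 3rd moment; target: 0
    m4 = (z ** 4).mean(dim=0)  # empirical 4th moment; target: 3 (Gaussian kurtosis)
    return ((m1 - 0.0) ** 2 + (m2 - 1.0) ** 2
            + (m3 - 0.0) ** 2 + (m4 - 3.0) ** 2).mean()


# Assumed usage: added to the supervised/FlexMatch objective with a weighting
# hyper-parameter, e.g.  loss = task_loss + lam * mom_embedding_loss(z)
```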

Abstract (translated)

A problem with discriminative deep learning models that use a linear+softmax final layer is that the latent space only predicts the conditional probabilities $p(Y|X)$ and not the full joint distribution $p(Y,X)$, which would require a generative approach. The conditional probability cannot detect outliers, making softmax networks sensitive to outliers. This exacerbates model over-confidence, impacting problems such as hallucinations, confounding biases, and dependence on large datasets. To address this, we introduce a novel embedding constraint based on the Method of Moments (MoM). We investigate the use of polynomial moments ranging from 1st- through 4th-order hyper-covariance matrices. Furthermore, we use this embedding constraint to train an Axis-Aligned Gaussian Mixture Model (AAGMM) final layer, which learns not only the conditional probability but also the joint distribution of the latent space. We apply this method to semi-supervised image classification by extending FlexMatch with our technique. We find that the MoM constraint with the AAGMM layer matches the reported FlexMatch accuracy while also modeling the joint distribution, thereby reducing outlier sensitivity. We also present a preliminary outlier detection strategy based on Mahalanobis distance and discuss possible future improvements to this strategy. Code is available at: \url{this https URL}
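The abstract also mentions a preliminary outlier detection strategy based on Mahalanobis distance under the AAGMM layer. Below is an illustrative sketch (not taken from the released code) of one way such a score could be computed: each embedding is scored by its squared Mahalanobis distance to the nearest axis-aligned (diagonal-covariance) Gaussian component. The names `aagmm_outlier_score`, `mu`, `var`, and the threshold are placeholders, not values from the paper.

```python
import torch


def aagmm_outlier_score(z: torch.Tensor, mu: torch.Tensor,
                        var: torch.Tensor) -> torch.Tensor:
    """z: (batch, d) embeddings; mu, var: (k, d) per-component means/variances.

    Returns the squared Mahalanobis distance to the nearest mixture component.
    """
    diff = z.unsqueeze(1) - mu.unsqueeze(0)          # (batch, k, d)
    d2 = (diff ** 2 / var.unsqueeze(0)).sum(dim=-1)  # (batch, k)
    return d2.min(dim=1).values                      # (batch,)


# Hypothetical usage: flag samples that are far from every component.
# is_outlier = aagmm_outlier_score(z, mu, var) > threshold  # threshold is an assumed hyper-parameter
```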

URL

https://arxiv.org/abs/2404.17978

PDF

https://arxiv.org/pdf/2404.17978.pdf

