
Deep Regression Representation Learning with Topology

2024-04-22 06:28:41
Shihao Zhang, Kenji Kawaguchi, Angela Yao

Abstract

Most works studying representation learning focus only on classification and neglect regression. Yet the learning objectives, and therefore the representation topologies, of the two tasks are fundamentally different: classification targets class separation, leading to disconnected representations, whereas regression requires ordinality with respect to the target, leading to continuous representations. We thus wonder how the effectiveness of a regression representation is influenced by its topology, with evaluation based on the Information Bottleneck (IB) principle, an important framework for learning effective representations. We establish two connections between the IB principle and the topology of regression representations. The first connection reveals that a lower intrinsic dimension of the feature space implies a reduced complexity of the representation Z. This complexity can be quantified as the conditional entropy of Z given the target Y, which serves as an upper bound on the generalization error. The second connection suggests that learning a feature space topologically similar to the target space will better align with the IB principle. Based on these two connections, we introduce PH-Reg, a regularizer specific to regression that matches the intrinsic dimension and topology of the feature space with those of the target space. Experiments on synthetic and real-world regression tasks demonstrate the benefits of PH-Reg.
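As a rough illustration of the two ideas above, the sketch below assumes a PyTorch training loop; the helper names (ph_reg_sketch, mst_edges) and the weighting are illustrative assumptions, not the authors' implementation. It uses the fact that the 0-dimensional persistent homology of a finite point cloud is carried by the minimum spanning tree (MST) of its distance matrix: matching paired distances between the feature and target clouds encourages topological similarity, while penalizing the total MST edge length in feature space crudely discourages a high intrinsic dimension.

import torch
from scipy.sparse.csgraph import minimum_spanning_tree

def mst_edges(dist):
    # MST edge indices (i, j) of a dense symmetric distance matrix (numpy).
    # The 0-dim persistence pairs of a point cloud are exactly these edges.
    mst = minimum_spanning_tree(dist).tocoo()
    return list(zip(mst.row, mst.col))

def ph_reg_sketch(z, y, w_topo=1.0, w_dim=0.1):
    # z: (batch, feat_dim) features; y: (batch, target_dim) targets.
    dz = torch.cdist(z, z)                      # feature-space distances
    dy = torch.cdist(y, y)                      # target-space distances
    ez = mst_edges(dz.detach().cpu().numpy())   # persistence pairs of Z
    ey = mst_edges(dy.detach().cpu().numpy())   # persistence pairs of Y

    # Topology term: pull feature distances toward target distances on the
    # union of both pairings (in the spirit of topological autoencoders).
    topo = sum((dz[i, j] - dy[i, j]) ** 2 for i, j in ez + ey)

    # Dimension term: the total MST length in Z grows with the intrinsic
    # dimension of the cloud, so penalizing it is a crude stand-in for a
    # persistent-homology-based dimension estimate.
    dim = sum(dz[i, j] for i, j in ez)
    return w_topo * topo + w_dim * dim

In use, one would add this to the task loss on each batch, e.g. loss = mse(pred, y) + ph_reg_sketch(features, y.view(-1, 1).float()). The actual PH-Reg estimator and weighting differ; see the paper for the precise formulation.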

URL

https://arxiv.org/abs/2404.13904

PDF

https://arxiv.org/pdf/2404.13904.pdf

