GCC: Generative Calibration Clustering

2024-04-14 01:51:11
Haifeng Xia, Hai Huang, Zhengming Ding

Abstract

Deep clustering, an important branch of unsupervised representation learning, focuses on embedding semantically similar samples into the same feature space. This core demand has inspired the exploration of contrastive learning and subspace clustering. However, these solutions rely on the basic assumption that sufficient, category-balanced samples are available for learning valid high-level representations. This assumption is too strict to be satisfied in real-world applications. A natural strategy for overcoming this challenge is to use generative models to augment the data with a considerable number of instances. How to use these novel samples to effectively improve clustering performance remains difficult and under-explored. In this paper, we propose a novel Generative Calibration Clustering (GCC) method that incorporates feature learning and augmentation into the clustering procedure. First, we develop a discriminative feature alignment mechanism to discover the intrinsic relationships between real and generated samples. Second, we design a self-supervised metric learning scheme that generates more reliable cluster assignments to boost conditional diffusion generation. Extensive experimental results on three benchmarks validate the effectiveness and advantages of our proposed method over state-of-the-art methods.
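
The abstract does not give the form of the alignment objective, so the following is only an illustrative sketch of the general idea of aligning generated samples with real ones through cluster assignments. It is not the GCC method itself. All names here (`centroids`, `assign`, `alignment_loss`) and the choice of a centroid-based squared-distance loss are assumptions made for illustration, and the toy 2-D features stand in for learned embeddings.

```python
from math import dist  # Euclidean distance (Python 3.8+)


def centroids(features, labels, k):
    """Mean feature vector per cluster; labels are pseudo-labels in range(k)."""
    dim = len(features[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for x, c in zip(features, labels):
        counts[c] += 1
        for j, v in enumerate(x):
            sums[c][j] += v
    # Simplification for this sketch: an empty cluster keeps a zero vector.
    return [[s / max(n, 1) for s in row] for row, n in zip(sums, counts)]


def assign(features, cents):
    """Nearest-centroid pseudo-labels for a list of feature vectors."""
    return [min(range(len(cents)), key=lambda c: dist(x, cents[c]))
            for x in features]


def alignment_loss(gen_features, gen_conditions, cents):
    """Mean squared distance between each generated feature and the
    real-sample centroid of the cluster it was conditioned on
    (a hypothetical stand-in for a feature alignment objective)."""
    total = sum(dist(g, cents[c]) ** 2
                for g, c in zip(gen_features, gen_conditions))
    return total / len(gen_features)


# Toy data: four real features with pseudo-labels, and two generated
# features, each produced under a cluster condition.
real = [[0.0, 0.0], [0.2, 0.0], [4.0, 4.0], [4.0, 4.2]]
pseudo = [0, 0, 1, 1]
cents = centroids(real, pseudo, k=2)

generated = [[0.1, 0.1], [4.1, 4.0]]
conditions = [0, 1]
loss = alignment_loss(generated, conditions, cents)
```

In this sketch, a low `loss` means the generated samples land near the real samples of the cluster they were conditioned on, and `assign(generated, cents)` checks whether nearest-centroid assignment recovers those conditions. A real system would compute features with a trained encoder and generate samples with a conditional diffusion model, neither of which is modelled here.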

URL

https://arxiv.org/abs/2404.09115

PDF

https://arxiv.org/pdf/2404.09115.pdf

