Paper Reading AI Learner

Prompt-Driven Feature Diffusion for Open-World Semi-Supervised Learning

2024-04-17 23:10:11
Marzi Heidari, Hanping Zhang, Yuhong Guo

Abstract

In this paper, we present a novel approach termed Prompt-Driven Feature Diffusion (PDFD) within a semi-supervised learning framework for Open World Semi-Supervised Learning (OW-SSL). At its core, PDFD deploys an efficient feature-level diffusion model with the guidance of class-specific prompts to support discriminative feature representation learning and feature generation, tackling the challenge of the non-availability of labeled data for unseen classes in OW-SSL. In particular, PDFD utilizes class prototypes as prompts in the diffusion model, leveraging their class-discriminative and semantic generalization ability to condition and guide the diffusion process across all the seen and unseen classes. Furthermore, PDFD incorporates a class-conditional adversarial loss for diffusion model training, ensuring that the features generated via the diffusion process can be discriminatively aligned with the class-conditional features of the real data. Additionally, the class prototypes of the unseen classes are computed using only unlabeled instances with confident predictions within a semi-supervised learning framework. We conduct extensive experiments to evaluate the proposed PDFD. The empirical results show PDFD exhibits remarkable performance enhancements over many state-of-the-art existing methods.

Abstract (translated)

在本文中,我们在半监督学习框架下提出了一种名为提示驱动特征扩散(PDFD)的新方法,用于开放世界半监督学习(OW-SSL)。其核心思想是,PDFD通过类特定提示来指导类级别特征级扩散模型,支持分类特征表示学习和特征生成,解决了OW-SSL中未见类别的标签数据不足的挑战。 具体来说,PDFD利用类原型作为扩散模型的提示,利用它们的分类歧视性和语义泛化能力来对所有可见和不可见类别的扩散过程进行条件和引导。此外,PDFD引入了分类条件 adversarial loss for diffusion model training,确保通过扩散过程生成的特征与真实数据的类条件特征对齐。 另外,类原型的计算仅在半监督学习框架中使用具有自信预测的未标注实例。我们通过广泛的实验评估了所提出的PDFD。实验结果表明,与现有方法相比,PDFD具有显著的性能增强。

URL

https://arxiv.org/abs/2404.11795

PDF

https://arxiv.org/pdf/2404.11795.pdf


Tags
3D Action Action_Localization Action_Recognition Activity Adversarial Agent Attention Autonomous Bert Boundary_Detection Caption Chat Classification CNN Compressive_Sensing Contour Contrastive_Learning Deep_Learning Denoising Detection Dialog Diffusion Drone Dynamic_Memory_Network Edge_Detection Embedding Embodied Emotion Enhancement Face Face_Detection Face_Recognition Facial_Landmark Few-Shot Gait_Recognition GAN Gaze_Estimation Gesture Gradient_Descent Handwriting Human_Parsing Image_Caption Image_Classification Image_Compression Image_Enhancement Image_Generation Image_Matting Image_Retrieval Inference Inpainting Intelligent_Chip Knowledge Knowledge_Graph Language_Model LLM Matching Medical Memory_Networks Multi_Modal Multi_Task NAS NMT Object_Detection Object_Tracking OCR Ontology Optical_Character Optical_Flow Optimization Person_Re-identification Point_Cloud Portrait_Generation Pose Pose_Estimation Prediction QA Quantitative Quantitative_Finance Quantization Re-identification Recognition Recommendation Reconstruction Regularization Reinforcement_Learning Relation Relation_Extraction Represenation Represenation_Learning Restoration Review RNN Robot Salient Scene_Classification Scene_Generation Scene_Parsing Scene_Text Segmentation Self-Supervised Semantic_Instance_Segmentation Semantic_Segmentation Semi_Global Semi_Supervised Sence_graph Sentiment Sentiment_Classification Sketch SLAM Sparse Speech Speech_Recognition Style_Transfer Summarization Super_Resolution Surveillance Survey Text_Classification Text_Generation Tracking Transfer_Learning Transformer Unsupervised Video_Caption Video_Classification Video_Indexing Video_Prediction Video_Retrieval Visual_Relation VQA Weakly_Supervised Zero-Shot