Paper Reading AI Learner

Distilling Semantic Concept Embeddings from Contrastively Fine-Tuned Language Models

2023-05-16 20:17:02
Na Li, Hanane Kteich, Zied Bouraoui, Steven Schockaert

Abstract

Learning vectors that capture the meaning of concepts remains a fundamental challenge. Somewhat surprisingly, perhaps, pre-trained language models have thus far only enabled modest improvements to the quality of such concept embeddings. Current strategies for using language models typically represent a concept by averaging the contextualised representations of its mentions in some corpus. This is potentially sub-optimal for at least two reasons. First, contextualised word vectors have an unusual geometry, which hampers downstream tasks. Second, concept embeddings should capture the semantic properties of concepts, whereas contextualised word vectors are also affected by other factors. To address these issues, we propose two contrastive learning strategies, based on the view that whenever two sentences reveal similar properties, the corresponding contextualised vectors should also be similar. One strategy is fully unsupervised, estimating the properties which are expressed in a sentence from the neighbourhood structure of the contextualised word embeddings. The second strategy instead relies on a distant supervision signal from ConceptNet. Our experimental results show that the resulting vectors substantially outperform existing concept embeddings in predicting the semantic properties of concepts, with the ConceptNet-based strategy achieving the best results. These findings are furthermore confirmed in a clustering task and in the downstream task of ontology completion.
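The two ingredients the abstract describes — averaging contextualised mention vectors into a static concept embedding, and a contrastive objective that pulls together sentences expressing the same property — can be sketched in a few lines. This is a minimal, illustrative pure-Python sketch, not the authors' implementation: the InfoNCE-style loss, the temperature value, and all function names are assumptions for illustration.

```python
import math

def mean_pool(vectors):
    # Average contextualised mention vectors into one static concept embedding
    # (the baseline strategy the paper argues is sub-optimal).
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def cosine(u, v):
    # Cosine similarity between two vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def contrastive_loss(anchor, positive, negatives, temperature=0.1):
    # InfoNCE-style loss: pull the anchor's contextualised vector towards a
    # sentence expressing the same property (positive), push it away from
    # unrelated sentences (negatives). Positives could come from embedding
    # neighbourhood structure (unsupervised) or ConceptNet (distant supervision).
    sims = [cosine(anchor, positive)] + [cosine(anchor, n) for n in negatives]
    exps = [math.exp(s / temperature) for s in sims]
    return -math.log(exps[0] / sum(exps))
```

The loss is near zero when the anchor is far more similar to its positive than to the negatives, and grows as a negative overtakes the positive — which is exactly the pressure that reshapes the contextualised space around shared semantic properties.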

URL

https://arxiv.org/abs/2305.09785

PDF

https://arxiv.org/pdf/2305.09785.pdf
