Aspect-based sentiment analysis seeks to determine sentiment at a fine-grained, aspect-specific level. While graph convolutional networks (GCNs) are commonly used for extracting sentiment features, their straightforward use in syntactic feature extraction can lead to a loss of crucial information. This paper presents a novel edge-enhanced GCN, called EEGCN, which improves performance by preserving feature integrity as it processes syntactic graphs. We incorporate a bidirectional long short-term memory (Bi-LSTM) network alongside a self-attention-based transformer for effective text encoding, ensuring the retention of long-range dependencies. A bidirectional GCN (Bi-GCN) with message passing then captures the relationships between entities, while an aspect-specific masking technique removes extraneous information. Extensive evaluations and ablation studies on four benchmark datasets show that EEGCN significantly enhances aspect-based sentiment analysis, overcoming issues with syntactic feature extraction and advancing the field's methodologies.
https://arxiv.org/abs/2503.12803
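Below is a minimal PyTorch sketch of the two mechanisms the abstract highlights: edge-aware message passing over a dependency graph and aspect-specific masking. The dense-adjacency layout, shapes, and names are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class EdgeEnhancedGCNLayer(nn.Module):
    """One edge-aware graph convolution step over a dependency graph."""
    def __init__(self, hidden_dim, num_edge_labels):
        super().__init__()
        self.edge_emb = nn.Embedding(num_edge_labels, hidden_dim)
        self.w_node = nn.Linear(hidden_dim, hidden_dim)
        self.w_edge = nn.Linear(hidden_dim, hidden_dim)

    def forward(self, h, adj, edge_labels):
        # h: (batch, n, d); adj: (batch, n, n) in {0, 1}; edge_labels: (batch, n, n)
        e = self.edge_emb(edge_labels)                       # (batch, n, n, d)
        # a message from node j to node i combines j's state with the edge feature,
        # so syntactic relation information is preserved rather than discarded
        msg = self.w_node(h).unsqueeze(1) + self.w_edge(e)   # (batch, n, n, d)
        msg = msg * adj.unsqueeze(-1)                        # keep only real edges
        deg = adj.sum(-1, keepdim=True).clamp(min=1)         # degree normalization
        return torch.relu(msg.sum(dim=2) / deg)

def aspect_mask(h, aspect_positions):
    """Zero out every token representation outside the aspect term."""
    mask = torch.zeros(h.shape[:2], device=h.device)
    mask.scatter_(1, aspect_positions, 1.0)                  # (batch, n)
    return h * mask.unsqueeze(-1)
```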
Recent advancements in in-context learning (ICL) show language models can significantly improve their performance when demonstrations are provided. However, little attention has been paid to model calibration and prediction confidence of ICL in cross-lingual scenarios. To bridge this gap, we conduct a thorough analysis of ICL for cross-lingual sentiment classification. Our findings suggest that ICL performs poorly in cross-lingual scenarios, exhibiting low accuracy and presenting high calibration errors. In response, we propose a novel approach, N2C2, which employs a k-nearest-neighbors-augmented classifier for prediction confidence calibration. N2C2 narrows the prediction gap by leveraging a datastore of cached few-shot instances. Specifically, N2C2 integrates the predictions from the datastore and incorporates confidence-aware distribution, semantically consistent retrieval representation, and adaptive neighbor combination modules to effectively utilize the limited number of supporting instances. Evaluation on two multilingual sentiment classification datasets demonstrates that N2C2 outperforms traditional ICL. It surpasses fine-tuning, prompt tuning, and recent state-of-the-art methods in terms of accuracy and calibration errors.
https://arxiv.org/abs/2503.09218
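A sketch of the core calibration idea, assuming the cached few-shot instances are stored as (embedding, label) pairs: blend the in-context prediction with a distance-weighted k-NN vote over the datastore. `lam`, `tau`, and `k` are illustrative hyperparameters, and the confidence-aware and adaptive-combination modules are not modeled here.

```python
import numpy as np

def knn_calibrated_probs(query_vec, model_probs, store_keys, store_labels,
                         num_classes, k=8, lam=0.5, tau=10.0):
    # cosine distance from the query to every cached few-shot instance
    keys = store_keys / np.linalg.norm(store_keys, axis=1, keepdims=True)
    q = query_vec / np.linalg.norm(query_vec)
    dist = 1.0 - keys @ q
    nn_idx = np.argsort(dist)[:k]
    # soft votes: closer neighbors contribute more probability mass
    w = np.exp(-dist[nn_idx] * tau)
    knn_probs = np.zeros(num_classes)
    for weight, idx in zip(w, nn_idx):
        knn_probs[store_labels[idx]] += weight
    knn_probs /= knn_probs.sum()
    # interpolate with the in-context prediction to narrow the confidence gap
    return lam * knn_probs + (1.0 - lam) * model_probs
```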
This work proposes an LSTM-based sentiment classification model with a multi-head attention mechanism and TF-IDF optimization. Through the integration of TF-IDF feature extraction and multi-head attention, the model significantly improves text sentiment analysis performance. Experimental results on public datasets demonstrate that the new method achieves substantial improvements in key metrics such as accuracy, recall, and F1-score compared to baseline models. Specifically, the model achieves an accuracy of 80.28% on the test set, an improvement of about 12% over standard LSTM models. Ablation experiments also confirm the necessity of each module, with multi-head attention contributing most to the performance improvement. This research provides an effective approach to sentiment analysis that can be utilized in public opinion monitoring, product recommendation, and similar applications.
https://arxiv.org/abs/2503.08079
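A compact PyTorch sketch of the architecture described above, assuming a BiLSTM encoder whose states are re-weighted by `nn.MultiheadAttention` and concatenated with a TF-IDF vector before classification; all dimensions are illustrative.

```python
import torch
import torch.nn as nn

class AttnLSTMWithTfidf(nn.Module):
    def __init__(self, vocab_size, emb_dim=128, hidden=128, tfidf_dim=5000,
                 heads=4, num_classes=2):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.attn = nn.MultiheadAttention(2 * hidden, heads, batch_first=True)
        self.fc = nn.Linear(2 * hidden + tfidf_dim, num_classes)

    def forward(self, token_ids, tfidf_vec):
        x, _ = self.lstm(self.emb(token_ids))      # (batch, seq, 2*hidden)
        a, _ = self.attn(x, x, x)                  # self-attention over LSTM states
        pooled = a.mean(dim=1)                     # average-pool attended states
        # fuse sequence features with the document-level TF-IDF signal
        return self.fc(torch.cat([pooled, tfidf_vec], dim=-1))
```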
Implicit sentiment analysis aims to uncover emotions that are subtly expressed, often obscured by ambiguity and figurative language. To accomplish this task, large language models and multi-step reasoning are needed to identify those sentiments that are not explicitly stated. In this study, we propose a novel Dual Reverse Chain Reasoning (DRCR) framework to enhance the performance of implicit sentiment analysis. Inspired by deductive reasoning, the framework consists of three key steps: 1) hypothesize an emotional polarity and derive a reasoning process, 2) negate the initial hypothesis and derive a new reasoning process, and 3) contrast the two reasoning paths to deduce the final sentiment polarity. Building on this, we also introduce a Triple Reverse Chain Reasoning (TRCR) framework to address the limitations of random hypotheses. Both methods combine contrastive mechanisms and multi-step reasoning, significantly improving the accuracy of implicit sentiment classification. Experimental results demonstrate that both approaches outperform existing methods across various model scales, achieving state-of-the-art performance. This validates the effectiveness of combining contrastive reasoning and multi-step reasoning for implicit sentiment analysis.
https://arxiv.org/abs/2503.07140
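The three DRCR steps map naturally onto a prompting loop. The sketch below assumes a generic `ask_llm` chat-completion callable and paraphrases the prompts; these are not the authors' exact templates.

```python
def drcr_classify(ask_llm, sentence):
    # step 1: hypothesize a polarity and derive a supporting chain of reasoning
    hyp = ask_llm(f"Assume the sentiment of '{sentence}' is positive. "
                  "Explain step by step why that could be true.")
    # step 2: negate the hypothesis and derive a second chain of reasoning
    neg = ask_llm(f"Now assume the sentiment of '{sentence}' is NOT positive. "
                  "Explain step by step why that could be true.")
    # step 3: contrast the two reasoning paths and commit to a final label
    verdict = ask_llm("Here are two opposing analyses of the same sentence:\n"
                      f"A) {hyp}\nB) {neg}\n"
                      "Which analysis is more convincing? Answer with a single "
                      "word: positive, negative, or neutral.")
    return verdict.strip().lower()
```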
The rapid proliferation of the Internet and the widespread adoption of social networks have significantly accelerated information dissemination. However, this transformation has introduced complexities in information capture and processing, posing substantial challenges for researchers and practitioners. Predicting the dissemination of topic-related information within social networks has thus become a critical research focus. This paper proposes a predictive model for topic dissemination in social networks by integrating multidimensional features derived from key dissemination characteristics. Specifically, we introduce two novel indicators, user relationship breadth and user authority, into the PageRank algorithm to quantify user influence more effectively. Additionally, we employ a Text-CNN model for sentiment classification, extracting sentiment features from textual content. Temporal embeddings of nodes are encoded using a Bi-LSTM model to capture temporal dynamics. Furthermore, we refine the measurement of user interaction traces with topics, replacing traditional topic view metrics with a more precise measure of communication characteristics. Finally, we integrate the extracted multidimensional features using a Transformer model, significantly enhancing predictive performance. Experimental results demonstrate that our proposed model outperforms traditional machine learning and unimodal deep learning models in terms of F1-score, AUC, and Recall, validating its effectiveness in predicting topic propagation within social networks.
https://arxiv.org/abs/2503.03112
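A sketch of how the two proposed indicators could enter PageRank: the teleport distribution is biased by a per-user prior built from relationship breadth and authority. The 50/50 mixing and damping factor are illustrative assumptions.

```python
import numpy as np

def influence_pagerank(adj, breadth, authority, d=0.85, iters=100):
    # adj[i, j] = 1 if user i forwards/interacts toward user j; breadth and
    # authority are per-user scores (e.g., distinct partners, follower count)
    adj = adj.astype(float)
    n = adj.shape[0]
    out = adj.sum(axis=1, keepdims=True)
    # row-stochastic transition matrix; dangling users jump uniformly
    trans = np.divide(adj, out, out=np.full_like(adj, 1.0 / n), where=out > 0)
    prior = 0.5 * breadth / breadth.sum() + 0.5 * authority / authority.sum()
    rank = np.full(n, 1.0 / n)
    for _ in range(iters):
        rank = (1 - d) * prior + d * trans.T @ rank   # biased teleport step
    return rank
```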
Model pruning is vital for accelerating large language models by reducing their size and computational requirements. However, the generalizability of existing pruning methods across diverse datasets and tasks remains unclear. Thus, we conduct extensive evaluations on 24 datasets and 4 tasks using popular pruning methods. Based on these evaluations, we find, and then investigate, that the calibration set greatly affects the performance of pruning methods. In addition, we surprisingly find a significant performance drop of existing pruning methods in sentiment classification tasks. To understand the link between performance drop and pruned neurons, we propose Neuron Semantic Attribution, which learns to associate each neuron with specific semantics. This method makes the unpruned neurons of LLMs explainable for the first time.
https://arxiv.org/abs/2503.01542
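One way to realize a neuron-to-semantics association is to collect the tokens that most strongly activate a given neuron over a corpus. The sketch below hooks one MLP activation in a HuggingFace decoder model; the hook path `model.model.layers[layer].mlp.act_fn` is an assumption that varies by architecture, and the whole routine is illustrative rather than the paper's method.

```python
import torch
from collections import Counter

def top_activating_tokens(model, tokenizer, texts, layer, neuron, top_k=20):
    acts = {}
    def hook(_module, _inputs, output):
        acts["a"] = output.detach()        # (batch, seq, mlp_width)
    handle = model.model.layers[layer].mlp.act_fn.register_forward_hook(hook)
    counter = Counter()
    for text in texts:
        enc = tokenizer(text, return_tensors="pt")
        with torch.no_grad():
            model(**enc)
        scores = acts["a"][0, :, neuron]   # activation of one neuron per position
        for pos in scores.topk(min(3, scores.numel())).indices:
            counter[tokenizer.decode([enc.input_ids[0, pos].item()])] += 1
    handle.remove()
    # the most frequent top-activating tokens hint at the neuron's semantics
    return counter.most_common(top_k)
```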
Sentiment Analysis (SA) is instrumental in understanding people's viewpoints, facilitating social media monitoring, recognizing products and brands, and gauging customer satisfaction. Consequently, SA has evolved into an active research domain within Natural Language Processing (NLP). Many approaches outlined in the literature devise intricate frameworks aimed at achieving high accuracy, focusing exclusively on either binary sentiment classification or fine-grained sentiment classification. In this paper, our objective is to fine-tune the pre-trained BERT model with a Bidirectional LSTM (BiLSTM) to enhance both binary and fine-grained SA, specifically for movie reviews. Our approach involves conducting sentiment classification for each review, followed by computing the overall sentiment polarity across all reviews. We present our findings on binary classification as well as fine-grained classification utilizing benchmark datasets. Additionally, we implement and assess two accuracy-improvement techniques, the Synthetic Minority Oversampling Technique (SMOTE) and NLP Augmenter (NLPAUG), to bolster the model's generalization in fine-grained sentiment classification. Finally, a heuristic algorithm is employed to calculate the overall polarity of predicted reviews from the BERT+BiLSTM output vector. Our approach performs comparably with state-of-the-art (SOTA) techniques in both classifications. For instance, in binary classification we achieve 97.67% accuracy, surpassing the leading SOTA model NB-weighted-BON+dv-cosine by 0.27% on the renowned IMDb dataset. Conversely, for five-class classification on SST-5, while the top SOTA model RoBERTa+large+Self-explaining attains 55.5% accuracy, our model achieves 59.48% accuracy, surpassing the BERT-large baseline by 3.6%.
https://arxiv.org/abs/2502.20682
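A sketch of the final aggregation step, assuming the heuristic maps per-review class probabilities from the BERT+BiLSTM head to an expected polarity score and thresholds it; the score mapping and thresholds are illustrative, not the paper's exact rule.

```python
import numpy as np

def overall_polarity(review_probs):
    # review_probs: (num_reviews, num_classes) softmax outputs, classes
    # ordered from most negative to most positive
    num_classes = review_probs.shape[1]
    class_scores = np.linspace(-1.0, 1.0, num_classes)  # e.g. 5 classes -> -1..1
    expected = (review_probs * class_scores).sum(axis=1).mean()
    if expected > 0.1:
        return "positive"
    if expected < -0.1:
        return "negative"
    return "neutral"
```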
The lack of quality data for low-resource Bantu languages presents significant challenges for text classification and other practical applications. In this paper, we introduce an advanced model combining Language-Independent Data Augmentation (LiDA) with Multi-Head Attention-based weighted embeddings to selectively enhance critical data points and improve text classification performance. This integration allows us to create robust data augmentation strategies that are effective across various linguistic contexts, ensuring that our model can handle the unique syntactic and semantic features of Bantu languages. This approach not only addresses the data scarcity issue but also sets a foundation for future research in low-resource language processing and classification tasks.
https://arxiv.org/abs/2502.17987
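A rough sketch of the attention-weighted-embeddings idea, assuming original and LiDA-style synthetic sentence vectors are stacked and re-weighted with multi-head attention before classification; the module layout is speculative, not the paper's architecture.

```python
import torch
import torch.nn as nn

class WeightedAugmentationEncoder(nn.Module):
    def __init__(self, dim=384, heads=4, num_classes=3):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.fc = nn.Linear(dim, num_classes)

    def forward(self, emb_set):
        # emb_set: (batch, 1 + n_augment, dim) -- original vector first,
        # followed by its LiDA-style synthetic neighbors
        attended, weights = self.attn(emb_set, emb_set, emb_set)
        # classify from the original slot, now enriched by weighted augmentations
        return self.fc(attended[:, 0]), weights
```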
Large Language Models have revolutionized NLP and shown dramatic performance improvements across several tasks. In this paper, we investigated the role of such language models in text classification and how they compare with approaches relying on smaller pre-trained language models. Considering 32 datasets spanning 8 languages, we compared zero-shot classification, few-shot fine-tuning, and synthetic-data-based classifiers with classifiers built using the complete human-labeled dataset. Our results show that zero-shot approaches do well for sentiment classification but are outperformed by other approaches for the rest of the tasks, and that synthetic data sourced from multiple LLMs can build better classifiers than zero-shot open LLMs. We also see wide performance disparities across languages in all the classification scenarios. We expect these findings to guide practitioners working on developing text classification systems across languages.
https://arxiv.org/abs/2502.11830
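As a concrete picture of the synthetic-data route compared here, the sketch below prompts a pool of LLMs for labeled examples and trains a lightweight classifier on them; the `ask_llm` callables, the prompt, and the sample counts are placeholders, not the paper's setup.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def build_synthetic_classifier(ask_llms, labels, per_label=50):
    texts, ys = [], []
    for ask_llm in ask_llms:                  # pool several LLMs for diversity
        for label in labels:
            for _ in range(per_label):
                texts.append(ask_llm(
                    f"Write one short product review with {label} sentiment."))
                ys.append(label)
    # any small supervised classifier works on the synthetic corpus
    clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
    clf.fit(texts, ys)
    return clf
```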
Large language models (LLMs) struggle with compositional generalisation, limiting their ability to systematically combine learned components to interpret novel inputs. While architectural modifications, fine-tuning, and data augmentation improve compositionality, they often have limited adaptability, face scalability constraints, or yield diminishing returns on real data. To address this, we propose CARMA, an intervention that enhances the stability and robustness of compositional reasoning in LLMs while preserving fine-tuned performance. CARMA employs mutual information regularisation and layer-wise stability constraints to mitigate feature fragmentation, ensuring structured representations persist across and within layers. We evaluate CARMA on inverse dictionary modelling and sentiment classification, measuring its impact on semantic consistency, performance stability, and robustness to lexical perturbations. Results show that CARMA reduces the variability introduced by fine-tuning, stabilises token representations, and improves compositional reasoning. While its effectiveness varies across architectures, CARMA's key strength lies in reinforcing learned structures rather than introducing new capabilities, making it a scalable auxiliary method. These findings suggest that integrating CARMA with fine-tuning can improve compositional generalisation while maintaining task-specific performance in LLMs.
https://arxiv.org/abs/2502.11066
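The abstract names two ingredients: mutual-information regularisation and layer-wise stability constraints. The sketch below renders both as simple proxies (an InfoNCE-style term between pooled successive layer states, and an MSE drift penalty); these are illustrative stand-ins, not the authors' exact objectives.

```python
import torch
import torch.nn.functional as F

def carma_style_aux_loss(hidden_states, alpha=0.1, beta=0.1):
    # hidden_states: list of (batch, seq, dim) tensors, one per layer
    stab, mi = 0.0, 0.0
    for h_prev, h_next in zip(hidden_states[:-1], hidden_states[1:]):
        # layer-wise stability: discourage abrupt representational drift
        stab = stab + F.mse_loss(h_next, h_prev)
        # MI proxy: successive layer summaries of the same example should
        # identify each other within the batch (InfoNCE)
        a = F.normalize(h_prev.mean(dim=1), dim=-1)
        b = F.normalize(h_next.mean(dim=1), dim=-1)
        logits = a @ b.t() / 0.07
        target = torch.arange(a.size(0), device=a.device)
        mi = mi + F.cross_entropy(logits, target)
    return alpha * mi + beta * stab   # added to the task fine-tuning loss
```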
Distinguishing in- and out-of-distribution (OOD) inputs is crucial for reliable deployment of classification systems. However, OOD data is typically unavailable or difficult to collect, posing a significant challenge for accurate OOD detection. In this work, we present a method that harnesses the generative capabilities of Large Language Models (LLMs) to create high-quality synthetic OOD proxies, eliminating the dependency on any external OOD data source. We study the efficacy of our method on classical text classification tasks such as toxicity detection and sentiment classification as well as classification tasks arising in LLM development and deployment, such as training a reward model for RLHF and detecting misaligned generations. Extensive experiments on nine InD-OOD dataset pairs and various model sizes show that our approach dramatically lowers false positive rates (achieving a perfect zero in some cases) while maintaining high accuracy on in-distribution tasks, outperforming baseline methods by a significant margin.
https://arxiv.org/abs/2502.03323
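A minimal sketch of how LLM-generated OOD proxies can be used once they exist: score texts with the in-distribution classifier's maximum softmax probability and calibrate a rejection threshold against the synthetic proxies. The target true-positive rate is an illustrative choice, not the paper's procedure.

```python
import numpy as np

def fit_ood_threshold(score_fn, ind_texts, synthetic_ood_texts, target_tpr=0.95):
    # score_fn(text) -> max class probability from the in-distribution classifier
    ind = np.array([score_fn(t) for t in ind_texts])
    ood = np.array([score_fn(t) for t in synthetic_ood_texts])
    thr = np.quantile(ind, 1.0 - target_tpr)  # keep ~95% of ID scores above thr
    fpr = float((ood >= thr).mean())          # fraction of proxies slipping through
    return thr, fpr

def is_out_of_distribution(score_fn, text, thr):
    return score_fn(text) < thr
```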
We present LLaVAC, a method for constructing a classifier for multimodal sentiment analysis. This method leverages fine-tuning of the Large Language and Vision Assistant (LLaVA) to predict sentiment labels across both image and text modalities. Our approach involves designing a structured prompt that incorporates both unimodal and multimodal labels to fine-tune LLaVA, enabling it to perform sentiment classification effectively. Experiments on the MVSA-Single dataset demonstrate that LLaVAC outperforms existing methods in multimodal sentiment analysis across three data processing procedures. The implementation of LLaVAC is publicly available at this https URL.
https://arxiv.org/abs/2502.02938
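A plausible shape for the structured prompt, covering the unimodal and multimodal labels the abstract mentions; the template and field names are assumptions, not the released prompt.

```python
def build_llavac_style_prompt(text):
    # the paired image is supplied through the vision channel of the model
    return (
        "You are given an image and the following text.\n"
        f"Text: {text}\n"
        "Answer in exactly this format:\n"
        "image_sentiment: <positive|neutral|negative>\n"
        "text_sentiment: <positive|neutral|negative>\n"
        "multimodal_sentiment: <positive|neutral|negative>"
    )

def parse_multimodal_label(output):
    labels = dict(line.split(": ", 1) for line in output.strip().splitlines())
    return labels.get("multimodal_sentiment")
```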
With the internet's evolution, consumers increasingly rely on online reviews for service or product choices, necessitating that businesses analyze extensive customer feedback to enhance their offerings. While machine learning-based sentiment classification shows promise in this realm, its technical complexity often bars small businesses and individuals from leveraging such advancements, which can widen the competitive gap between small and large businesses in improving customer satisfaction. This paper introduces an approach that integrates large language models (LLMs), specifically Generative Pre-trained Transformer (GPT) and Bidirectional Encoder Representations from Transformers (BERT)-based models, making sentiment classification accessible to a wider audience. Our experiments across various datasets confirm that our approach retains high classification accuracy without the need for manual labeling, expert knowledge in tuning and data annotation, or substantial computational power. By significantly lowering the barriers to applying sentiment classification techniques, our methodology enhances competitiveness and paves the way for making machine learning technology accessible to a broader audience.
https://arxiv.org/abs/2502.02893
In 2024, the outbreak of Human Metapneumovirus (HMPV) in China, which later spread to the UK and other countries, raised significant public concern. While HMPV typically causes mild symptoms, its effects on vulnerable individuals prompted health authorities to emphasize preventive measures. This paper explores how sentiment analysis can enhance our understanding of public reactions to HMPV by analyzing social media data. We apply transformer models, particularly XLNet, achieving 93.50% accuracy in sentiment classification. Additionally, we use explainable AI (XAI) through SHAP to improve model transparency.
https://arxiv.org/abs/2502.01663
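The XLNet-plus-SHAP pairing can be reproduced with standard tooling, as sketched below; the checkpoint is a placeholder for the authors' fine-tuned model, and exact pipeline/SHAP options may differ by library version.

```python
import shap
from transformers import pipeline

# placeholder checkpoint: substitute the sentiment-fine-tuned XLNet model
clf = pipeline("text-classification",
               model="xlnet-base-cased",
               top_k=None)              # return scores for every class

# SHAP's text masker explains which tokens drive each sentiment score
explainer = shap.Explainer(clf)
shap_values = explainer(["People are worried about the new HMPV outbreak."])
print(shap_values)                      # per-token contributions per class
```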
Sentiment analysis of patient feedback from the public health domain can aid decision makers in evaluating the provided services. The current paper focuses on free-text comments in patient surveys about general practitioners and psychiatric healthcare, annotated with four sentence-level polarity classes -- positive, negative, mixed and neutral -- while also attempting to alleviate data scarcity by leveraging general-domain sources in the form of reviews. For several different architectures, we compare in-domain and out-of-domain effects, as well as the effects of training joint multi-domain models.
https://arxiv.org/abs/2501.19134
This study explores transformer-based models such as BERT, mBERT, and XLM-R for multilingual sentiment analysis across diverse linguistic structures. Key contributions include the identification of XLM-R's superior adaptability to morphologically complex languages, where it achieves accuracy above 88%. The work highlights fine-tuning strategies and emphasizes their significance for improving sentiment classification in underrepresented languages.
https://arxiv.org/abs/2501.12540
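A minimal fine-tuning sketch for XLM-R on a multilingual sentiment corpus with the HuggingFace Trainer; the dataset name, label count, and hyperparameters are illustrative assumptions, not the study's configuration.

```python
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          DataCollatorWithPadding, Trainer, TrainingArguments)

tok = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "xlm-roberta-base", num_labels=3)

# placeholder corpus with "text"/"label" columns spanning several languages
ds = load_dataset("tyqiangz/multilingual-sentiments", "all")
ds = ds.map(lambda b: tok(b["text"], truncation=True, max_length=128),
            batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="xlmr-sentiment", learning_rate=2e-5,
                           num_train_epochs=3, per_device_train_batch_size=16),
    train_dataset=ds["train"],
    eval_dataset=ds["validation"],
    data_collator=DataCollatorWithPadding(tok),
)
trainer.train()
```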
Aspect-based sentiment analysis (ABSA) is a refined approach to sentiment analysis that aims to extract and classify sentiments based on specific aspects or features of a product, service, or entity. Unlike traditional sentiment analysis, which assigns a general sentiment score to entire reviews or texts, ABSA focuses on breaking down the text into individual components or aspects (e.g., quality, price, service) and evaluating the sentiment towards each. This allows for a more granular understanding of customer opinions, enabling businesses to pinpoint specific areas of strength and improvement. The process involves several key steps, including aspect extraction, sentiment classification, and aspect-level sentiment aggregation for a review paragraph or any other text the user provides. ABSA has significant applications in areas such as product reviews, social media monitoring, customer feedback analysis, and market research. By leveraging techniques from natural language processing (NLP) and machine learning, ABSA facilitates the extraction of valuable insights, enabling companies to make data-driven decisions that enhance customer satisfaction and optimize offerings. As ABSA evolves, it holds the potential to greatly improve personalized customer experiences by providing a deeper understanding of sentiment across various product aspects. In this work, we analyze the strength of LLMs for complete cross-domain aspect-based sentiment analysis, with the aim of defining a framework for certain products and reusing it in similar situations. We argue that it is possible to achieve an accuracy of 92% on the aspect-based sentiment analysis dataset of SemEval-2015 Task 12.
https://arxiv.org/abs/2501.08974
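One way to operationalize cross-domain ABSA with an LLM is a fixed extraction prompt that returns (aspect, polarity) pairs as JSON, so the same template transfers across product domains. `ask_llm` and the template are placeholders, not the paper's framework.

```python
import json

def absa_extract(ask_llm, review):
    prompt = (
        "Extract every aspect mentioned in the review below and the sentiment "
        "toward it. Reply with a JSON list of "
        '{"aspect": ..., "sentiment": "positive|negative|neutral"} objects.\n'
        f"Review: {review}"
    )
    return json.loads(ask_llm(prompt))

# e.g. "The battery lasts forever but the screen scratches easily." ->
# [{"aspect": "battery", "sentiment": "positive"},
#  {"aspect": "screen", "sentiment": "negative"}]
```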
Sentiment analysis is one of the most crucial tasks in Natural Language Processing (NLP), involving the training of machine learning models to classify text based on the polarity of opinions. Pre-trained Language Models (PLMs) can be applied to downstream tasks through fine-tuning, eliminating the need to train the model from scratch. Specifically, PLMs have been employed for sentiment analysis, a process that involves detecting, analyzing, and extracting the polarity of text sentiments. Numerous models have been proposed to address this task, with the pre-trained PhoBERT-V2 models standing out as the state-of-the-art language models for Vietnamese. The PhoBERT-V2 pre-training approach is based on RoBERTa, optimizing the BERT pre-training method for more robust performance. In this paper, we introduce a novel approach that combines PhoBERT-V2 and SentiWordNet for sentiment analysis of Vietnamese reviews. Our proposed model utilizes PhoBERT-V2, a robust optimization of the prominent BERT model for the Vietnamese language, and leverages SentiWordNet, a lexical resource explicitly designed to support sentiment classification applications. Experimental results on the VLSP 2016 and AIVIVN 2019 datasets demonstrate that our sentiment analysis system has achieved excellent performance in comparison to other models.
https://arxiv.org/abs/2501.08758
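A sketch of one way to combine the two signals: blend PhoBERT-V2 class probabilities with a SentiWordNet lexicon prior. It assumes review tokens have already been mapped to SentiWordNet entries (e.g., through a bilingual lexicon) and that NLTK's `sentiwordnet` corpus is downloaded; the blending weight is illustrative.

```python
# requires: nltk.download("wordnet"); nltk.download("sentiwordnet")
from nltk.corpus import sentiwordnet as swn

def lexicon_score(mapped_words):
    pos = neg = 0.0
    for word in mapped_words:
        synsets = list(swn.senti_synsets(word))
        if synsets:                        # take the first sense as a crude prior
            pos += synsets[0].pos_score()
            neg += synsets[0].neg_score()
    total = pos + neg
    return (pos - neg) / total if total else 0.0   # score in [-1, 1]

def blended_prediction(model_probs, mapped_words, lam=0.2):
    # model_probs: [p_negative, p_positive] from the PhoBERT-V2 classifier
    s = lexicon_score(mapped_words)
    lex_probs = [(1 - s) / 2, (1 + s) / 2]         # map score to two classes
    return [lam * l + (1 - lam) * m for l, m in zip(lex_probs, model_probs)]
```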
This paper explores the development of a multimodal sentiment analysis model that integrates text, audio, and visual data to enhance sentiment classification. The goal is to improve emotion detection by capturing the complex interactions between these modalities, thereby enabling more accurate and nuanced sentiment interpretation. The study evaluates three feature fusion strategies -- late stage fusion, early stage fusion, and multi-headed attention -- within a transformer-based architecture. Experiments were conducted using the CMU-MOSEI dataset, which includes synchronized text, audio, and visual inputs labeled with sentiment scores. Results show that early stage fusion significantly outperforms late stage fusion, achieving an accuracy of 71.87%, while the multi-headed attention approach offers marginal improvement, reaching 72.39%. The findings suggest that integrating modalities early in the process enhances sentiment classification, while attention mechanisms may have limited impact within the current framework. Future work will focus on refining feature fusion techniques, incorporating temporal data, and exploring dynamic feature weighting to further improve model performance.
https://arxiv.org/abs/2501.08085
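The two main strategies compared above can be contrasted in a few lines of PyTorch, assuming time-aligned feature sequences per modality; dimensions and depths are illustrative, not the paper's configuration.

```python
import torch
import torch.nn as nn

class EarlyFusion(nn.Module):
    """Concatenate modality features first, then encode jointly."""
    def __init__(self, dims=(768, 74, 35), d_model=256, num_classes=3):
        super().__init__()
        self.proj = nn.Linear(sum(dims), d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.fc = nn.Linear(d_model, num_classes)

    def forward(self, text, audio, video):
        # each input: (batch, seq, dim_m), aligned along the time axis
        x = self.proj(torch.cat([text, audio, video], dim=-1))
        return self.fc(self.encoder(x).mean(dim=1))

class LateFusion(nn.Module):
    """Predict per modality, then average the class votes."""
    def __init__(self, dims=(768, 74, 35), num_classes=3):
        super().__init__()
        self.heads = nn.ModuleList(nn.Linear(d, num_classes) for d in dims)

    def forward(self, text, audio, video):
        feats = [text.mean(1), audio.mean(1), video.mean(1)]
        return sum(h(f) for h, f in zip(self.heads, feats)) / 3
```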
Understanding emotions in videos is a challenging task. However, videos contain several modalities, which makes them a rich source of data for machine learning and deep learning tasks. In this work, we aim to improve video sentiment classification by focusing on three key inputs: the video itself, the accompanying text, and the acoustic features. To address the limitations of relying on large labeled datasets, we are developing a method that utilizes clustering-based semi-supervised pre-training to extract meaningful representations from the data. This pre-training step identifies patterns in the video and text data, allowing the model to learn underlying structures and relationships without requiring extensive labeled information at the outset. Once these patterns are established, we fine-tune the system in a supervised manner to classify the sentiment expressed in videos. We believe that this multi-modal approach, combining clustering with supervised fine-tuning, will lead to more accurate and insightful sentiment classification, especially in cases where labeled data is limited.
https://arxiv.org/abs/2501.06475
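A sketch of the clustering-based pre-training step under simple assumptions: k-means pseudo-labels on encoded unlabeled features, a pre-training pass predicting those cluster IDs, then supervised fine-tuning of the same encoder. Cluster count, epochs, and the full-batch loop are illustrative.

```python
import torch
import torch.nn as nn
from sklearn.cluster import KMeans

def pretrain_with_clusters(encoder, unlabeled_feats, n_clusters=16, epochs=5):
    # 1) assign pseudo-labels by clustering the current representations
    with torch.no_grad():
        z = encoder(unlabeled_feats).numpy()
    pseudo = torch.tensor(KMeans(n_clusters, n_init=10).fit_predict(z))
    # 2) pre-train the encoder to predict its own cluster assignments
    head = nn.Linear(z.shape[1], n_clusters)
    opt = torch.optim.Adam(list(encoder.parameters()) + list(head.parameters()),
                           lr=1e-4)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):               # learns structure without labels
        opt.zero_grad()
        loss = loss_fn(head(encoder(unlabeled_feats)), pseudo)
        loss.backward()
        opt.step()
    return encoder                        # 3) fine-tune on labeled sentiment data
```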