Human Values in a Single Sentence: Moral Presence, Hierarchies, and Transformer Ensembles on the Schwartz Continuum

2026-01-20 17:25:33
Víctor Yeste, Paolo Rosso

Abstract

We study sentence-level identification of the 19 values in the Schwartz motivational continuum as a concrete formulation of human value detection in text. The setting - out-of-context sentences from news and political manifestos - features sparse moral cues and severe class imbalance. This combination makes fine-grained sentence-level value detection intrinsically difficult, even for strong modern neural models. We first operationalize a binary moral presence task ("does any value appear?") and show that it is learnable from single sentences (positive-class F1 ≈ 0.74 with calibrated thresholds). We then compare a presence-gated hierarchy to a direct multi-label classifier under matched compute, both based on DeBERTa-base and augmented with lightweight signals (prior-sentence context, LIWC-22/eMFD/MJD lexica, and topic features). The hierarchy does not outperform direct prediction, indicating that gate recall limits downstream gains. We also benchmark instruction-tuned LLMs - Gemma 2 9B, Llama 3.1 8B, Mistral 8B, and Qwen 2.5 7B - in zero-/few-shot and QLoRA setups and build simple ensembles; a soft-vote supervised ensemble reaches macro-F1 0.332, significantly surpassing the best single supervised model and exceeding prior English-only baselines. Overall, in this scenario, lightweight signals and small ensembles yield the most reliable improvements, while hierarchical gating offers limited benefit. We argue that, under an 8 GB single-GPU constraint and at the 7-9B scale, carefully tuned supervised encoders remain a strong and compute-efficient baseline for structured human value detection, and we outline how richer value structure and sentence-in-document context could further improve performance.
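
The abstract names three mechanisms: soft voting over model probabilities, a presence gate in front of the multi-label classifier, and macro-F1 as the metric. The following is a minimal sketch of how they could fit together. It is not the authors' implementation. The function names, the toy probabilities, and the fixed 0.5 thresholds are illustrative assumptions (the paper reports calibrated thresholds, which are not reproduced here).

```python
import numpy as np

def soft_vote(prob_list):
    """Average per-label probabilities from several models (soft voting)."""
    return np.mean(np.stack(prob_list, axis=0), axis=0)

def presence_gate(label_probs, presence_probs, gate_threshold=0.5):
    """Zero out label probabilities for sentences the gate judges value-free."""
    keep = (presence_probs >= gate_threshold)[:, None]
    return label_probs * keep

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-label F1; a label with no TP, FP or FN scores 0."""
    tp = np.sum((y_true == 1) & (y_pred == 1), axis=0)
    fp = np.sum((y_true == 0) & (y_pred == 1), axis=0)
    fn = np.sum((y_true == 1) & (y_pred == 0), axis=0)
    denom = 2 * tp + fp + fn
    f1 = np.where(denom > 0, 2 * tp / np.maximum(denom, 1), 0.0)
    return float(f1.mean())

# Toy data (hypothetical): 3 sentences, 2 value labels, 2 models.
model_a = np.array([[0.9, 0.2], [0.4, 0.7], [0.1, 0.1]])
model_b = np.array([[0.7, 0.4], [0.2, 0.9], [0.3, 0.1]])
y_true = np.array([[1, 0], [0, 1], [0, 1]])

# Direct prediction: average the models, then threshold each label.
ensemble = soft_vote([model_a, model_b])
direct_pred = (ensemble >= 0.5).astype(int)
direct_score = macro_f1(y_true, direct_pred)

# Gated prediction: a separate presence model decides which sentences pass.
presence = np.array([0.9, 0.3, 0.2])  # assumed gate outputs
gated_pred = (presence_gate(ensemble, presence) >= 0.5).astype(int)
gated_score = macro_f1(y_true, gated_pred)
```

In this toy case the gate rejects the second sentence even though it carries a value, so the gated pipeline cannot recover that label. This is the sense in which gate recall can cap downstream performance.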

URL

https://arxiv.org/abs/2601.14172

PDF

https://arxiv.org/pdf/2601.14172.pdf

