Lost in Variation? Evaluating NLI Performance in Basque and Spanish Geographical Variants

2025-06-18 08:20:19
Jaione Bengoetxea, Itziar Gonzalez-Dios, Rodrigo Agerri

Abstract

In this paper, we evaluate the capacity of current language technologies to understand Basque and Spanish language varieties. We use Natural Language Inference (NLI) as a pivot task and introduce a novel, manually curated parallel dataset in Basque and Spanish, along with their respective variants. Our empirical analysis of crosslingual and in-context learning experiments using encoder-only and decoder-based Large Language Models (LLMs) shows a performance drop when handling linguistic variation, especially in Basque. Error analysis suggests that this decline is not due to lexical overlap, but rather to the linguistic variation itself. Further ablation experiments indicate that encoder-only models particularly struggle with Western Basque, which aligns with linguistic theory that identifies peripheral dialects (e.g., Western) as more distant from the standard. All data and code are publicly available.

URL

https://arxiv.org/abs/2506.15239

PDF

https://arxiv.org/pdf/2506.15239.pdf
