Foundations and Evaluations in NLP

2025-04-02 04:14:03
Jungyeul Park

Abstract

This memoir explores two fundamental aspects of Natural Language Processing (NLP): the creation of linguistic resources and the evaluation of NLP system performance. Over the past decade, my work has focused on developing a morpheme-based annotation scheme for the Korean language that captures linguistic properties from morphology to semantics. This approach has achieved state-of-the-art results in various NLP tasks, including part-of-speech tagging, dependency parsing, and named entity recognition. Additionally, this work provides a comprehensive analysis of segmentation granularity and its critical impact on NLP system performance. In parallel with linguistic resource development, I have proposed a novel evaluation framework, the jp-algorithm, which introduces an alignment-based method to address challenges in preprocessing tasks like tokenization and sentence boundary detection (SBD). Traditional evaluation methods assume identical tokenization and sentence lengths between gold standards and system outputs, limiting their applicability to real-world data. The jp-algorithm overcomes these limitations, enabling robust end-to-end evaluations across a variety of NLP tasks. It enhances accuracy and flexibility by incorporating linear-time alignment while preserving the complexity of traditional evaluation metrics. This memoir provides key insights into the processing of morphologically rich languages, such as Korean, while offering a generalizable framework for evaluating diverse end-to-end NLP systems. My contributions lay the foundation for future developments, with broader implications for multilingual resource development and system evaluation.
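
To illustrate why evaluation becomes difficult when the gold standard and the system output are tokenized differently, here is a minimal sketch in Python. It is not the jp-algorithm described in the abstract, whose details are not given here. It is a generic span-matching comparison, and the function names `token_spans` and `tokenization_prf` are hypothetical. The sketch assumes both tokenizations cover the same underlying text once whitespace is ignored, and it runs in time linear in the number of tokens.

```python
def token_spans(tokens):
    """Return the set of (start, end) character offsets of each token,
    measured in the concatenated, whitespace-free text."""
    spans = set()
    pos = 0
    for tok in tokens:
        spans.add((pos, pos + len(tok)))
        pos += len(tok)
    return spans


def tokenization_prf(gold_tokens, system_tokens):
    """Precision, recall and F1 over token spans (illustrative only).

    A system token counts as correct only if a gold token occupies
    exactly the same character span, so the two token sequences may
    have different lengths.
    """
    if "".join(gold_tokens) != "".join(system_tokens):
        raise ValueError("both tokenizations must cover the same text")
    gold = token_spans(gold_tokens)
    system = token_spans(system_tokens)
    matched = len(gold & system)
    precision = matched / len(system) if system else 0.0
    recall = matched / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Gold has five tokens, the system output has four.
gold = ["I", "can", "'t", "go", "."]
system = ["I", "can't", "go", "."]
precision, recall, f1 = tokenization_prf(gold, system)
print(precision, recall, f1)
```

A position-by-position comparison would mislabel every token after the first mismatch in this example, whereas matching on character spans still credits the three tokens that both sides agree on.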

URL

https://arxiv.org/abs/2504.01342

PDF

https://arxiv.org/pdf/2504.01342.pdf

