Abstract
This memoir explores two fundamental aspects of Natural Language Processing (NLP): the creation of linguistic resources and the evaluation of NLP system performance. Over the past decade, my work has focused on developing a morpheme-based annotation scheme for the Korean language that captures linguistic properties from morphology to semantics. This approach has achieved state-of-the-art results in various NLP tasks, including part-of-speech tagging, dependency parsing, and named entity recognition. Additionally, this work provides a comprehensive analysis of segmentation granularity and its critical impact on NLP system performance. In parallel with linguistic resource development, I have proposed a novel evaluation framework, the jp-algorithm, which introduces an alignment-based method to address challenges in preprocessing tasks like tokenization and sentence boundary detection (SBD). Traditional evaluation methods assume identical tokenization and sentence lengths between gold standards and system outputs, limiting their applicability to real-world data. The jp-algorithm overcomes these limitations, enabling robust end-to-end evaluations across a variety of NLP tasks. It enhances accuracy and flexibility by incorporating linear-time alignment while preserving the complexity of traditional evaluation metrics. This memoir provides key insights into the processing of morphologically rich languages, such as Korean, while offering a generalizable framework for evaluating diverse end-to-end NLP systems. My contributions lay the foundation for future developments, with broader implications for multilingual resource development and system evaluation.
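To illustrate why mismatched tokenization matters for evaluation, here is a minimal sketch of an alignment-based token score. It is not the jp-algorithm itself, whose details are not given in this abstract. The function names `token_spans` and `aligned_token_f1` are hypothetical. The sketch assumes that both tokenizations are non-empty strings covering exactly the same character sequence once whitespace is dropped. It aligns the gold and system tokens by character offsets with a single two-pointer pass, so the alignment runs in time linear in the number of tokens.

```python
def token_spans(tokens):
    """Return (start, end) character offsets for each token, ignoring whitespace."""
    spans, pos = [], 0
    for tok in tokens:
        spans.append((pos, pos + len(tok)))
        pos += len(tok)
    return spans


def aligned_token_f1(gold_tokens, system_tokens):
    """Token-level precision, recall and F1 for two tokenizations of one text.

    Assumption (illustrative only): tokens are non-empty and both lists
    concatenate to the same character sequence.
    """
    if "".join(gold_tokens) != "".join(system_tokens):
        raise ValueError("tokenizations must cover the same character sequence")

    gold = token_spans(gold_tokens)
    system = token_spans(system_tokens)

    i = j = matched = 0
    # Two-pointer walk: every step advances at least one pointer,
    # so the loop runs at most len(gold) + len(system) times.
    while i < len(gold) and j < len(system):
        if gold[i] == system[j]:
            matched += 1          # identical span: the token boundaries agree
            i += 1
            j += 1
        elif gold[i][1] <= system[j][1]:
            i += 1                # gold token ends first; it cannot match later
        else:
            j += 1                # system token ends first; it cannot match later

    precision = matched / len(system) if system else 0.0
    recall = matched / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    gold = ["New", "York", "-", "based"]
    system = ["New", "York-based"]
    print(aligned_token_f1(gold, system))
```

In this example the two lists have different lengths (four gold tokens, two system tokens), so a position-by-position comparison is undefined, while the offset-based alignment still gives a score: only `New` has matching boundaries in both tokenizations.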
URL
https://arxiv.org/abs/2504.01342