Abstract
Automatic legal text classification systems have been proposed in the literature to extract knowledge from judgments and to detect their aspects. However, most of these systems are black boxes even when their models are interpretable, which may raise concerns about their trustworthiness. Accordingly, this work contributes a system that combines Natural Language Processing (NLP) with Machine Learning (ML) to classify legal texts in an explainable manner. We analyze the features involved in each decision and the threshold values at the branching points of the decision paths of tree structures, and we present this information to users in natural language. This is the first work on automatic analysis of legal texts that combines NLP and ML with Explainable Artificial Intelligence techniques to automatically make the models' decisions understandable to end users. Furthermore, legal experts have validated our solution, and their knowledge has been incorporated into the explanation process as "expert-in-the-loop" dictionaries. Experimental results on a data set annotated with law categories by jurisdiction demonstrate that our system yields competitive classification performance, with accuracy values well above 90%, and that its automatic explanations are easily understandable even to non-expert users.
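As a rough illustration of turning a decision path into a natural-language explanation, the following minimal sketch walks a hand-built binary tree and verbalizes each threshold test. It is not the authors' implementation: the `Node` class, the `explain` and `describe` helpers, the terms, the thresholds, and the category labels are all hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Node:
    """Node of a hypothetical binary decision tree over term-weight features."""
    feature: Optional[str] = None   # term tested at this node (None for a leaf)
    threshold: float = 0.0          # go left if the term weight is <= threshold
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    label: Optional[str] = None     # predicted category (set on leaves only)


def explain(tree: Node, features: Dict[str, float]) -> Tuple[str, List[str]]:
    """Follow the decision path and describe every threshold test in plain English."""
    steps: List[str] = []
    node = tree
    while node.label is None:
        value = features.get(node.feature, 0.0)  # absent terms get weight 0
        if value <= node.threshold:
            steps.append(
                f"the weight of '{node.feature}' ({value:.2f}) "
                f"is at most {node.threshold:.2f}"
            )
            node = node.left
        else:
            steps.append(
                f"the weight of '{node.feature}' ({value:.2f}) "
                f"is above {node.threshold:.2f}"
            )
            node = node.right
    return node.label, steps


# Toy tree: terms, thresholds, and categories are invented for illustration.
TREE = Node(
    feature="tax",
    threshold=0.10,
    left=Node(
        feature="custody",
        threshold=0.05,
        left=Node(label="civil"),
        right=Node(label="family"),
    ),
    right=Node(label="fiscal"),
)


def describe(features: Dict[str, float]) -> str:
    """Build a one-sentence explanation for a single text's feature weights."""
    label, steps = explain(TREE, features)
    return f"The text was classified as '{label}' because " + " and ".join(steps) + "."


if __name__ == "__main__":
    print(describe({"tax": 0.02, "custody": 0.30}))
```

In a real pipeline the tree would be learned from data and the wording could be refined with the expert-validated dictionaries mentioned above; the sketch only shows how features and thresholds along one path can be rendered as text.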
URL
https://arxiv.org/abs/2404.00437