Empowering Prior to Court Legal Analysis: A Transparent and Accessible Dataset for Defensive Statement Classification and Interpretation

2024-05-17 11:22:27
Yannis Spyridis, Jean-Paul, Haneen Deeb, Vasileios Argyriou

Abstract

The classification of statements provided by individuals during police interviews is a complex and significant task within the domain of natural language processing (NLP) and legal informatics. The lack of extensive domain-specific datasets poses challenges to the advancement of NLP methods in the field. This paper aims to address some of these challenges by introducing a novel dataset tailored for the classification of statements made during police interviews, prior to court proceedings. Utilising the curated dataset for training and evaluation, we introduce a fine-tuned DistilBERT model that achieves state-of-the-art performance in distinguishing truthful from deceptive statements. To enhance interpretability, we employ explainable artificial intelligence (XAI) methods that produce saliency maps to interpret the model's decision-making process. Lastly, we present an XAI interface that empowers both legal professionals and non-specialists to interact with and benefit from our system. Our model achieves an accuracy of 86% and is shown to outperform a custom transformer architecture in a comparative study. This holistic approach advances the accessibility, transparency, and effectiveness of statement analysis, with promising implications for both legal practice and research.

URL

https://arxiv.org/abs/2405.10702

PDF

https://arxiv.org/pdf/2405.10702.pdf

