LegalPro-BERT: Classification of Legal Provisions by fine-tuning BERT Large Language Model

2024-04-15 19:08:48
Amit Tewari

Abstract

A contract is a type of legal document commonly used in organizations. Contract review is an integral and repetitive process for avoiding business risk and liability. Contract analysis requires the identification and classification of key provisions and paragraphs within an agreement. Identifying and validating contract clauses can be a time-consuming and challenging task that demands the services of trained and expensive lawyers, paralegals, or other legal assistants. Classifying legal provisions in contracts with artificial intelligence and natural language processing is complex because model training requires domain-specialized legal language and labeled data is scarce in the legal domain. General-purpose models are not effective in this context because contracts use specialized legal vocabulary that a general model may not recognize. To address this problem, we propose using a pre-trained large language model that is subsequently calibrated on a legal taxonomy. We propose LegalPro-BERT, a BERT transformer architecture model that we fine-tune to efficiently handle the classification task for legal provisions. We conducted experiments to measure and compare metrics against current benchmark results and found that LegalPro-BERT outperforms the previous benchmark used for comparison in this research.
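
For illustration, the sketch below shows the kind of fine-tuning setup the abstract describes: a pre-trained BERT checkpoint adapted for multi-class classification of contract clauses. It assumes the Hugging Face transformers and PyTorch libraries; the checkpoint name, provision labels, example clause, and hyperparameters are illustrative assumptions, not the paper's actual dataset or configuration.

```python
# Minimal sketch of fine-tuning a pre-trained BERT model for legal-provision
# classification. All names below (checkpoint, labels, clause, learning rate)
# are illustrative assumptions, not taken from the paper.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Hypothetical provision taxonomy; the paper calibrates on its own legal taxonomy.
LABELS = ["governing_law", "termination", "confidentiality", "indemnification"]

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=len(LABELS)
)

# One illustrative training pair: a clause and its provision label.
clause = "This Agreement shall be governed by the laws of the State of New York."
inputs = tokenizer(clause, truncation=True, max_length=512, return_tensors="pt")
target = torch.tensor([LABELS.index("governing_law")])

# A single gradient step; in practice this loops over the full labeled corpus.
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
loss = model(**inputs, labels=target).loss
loss.backward()
optimizer.step()

# Inference: pick the highest-scoring provision label for a clause.
model.eval()
with torch.no_grad():
    logits = model(**inputs).logits
print(LABELS[logits.argmax(dim=-1).item()])
```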

URL

https://arxiv.org/abs/2404.10097

PDF

https://arxiv.org/pdf/2404.10097.pdf

