
Fine-tuning ClimateBert transformer with ClimaText for the disclosure analysis of climate-related financial risks

2023-03-21 07:25:36
Eduardo C. Garrido-Merchán, Cristina González-Barthe, María Coronado Vaca

Abstract

In recent years, financial agents, especially individual and institutional investors, have increasingly demanded that companies report on climate-related financial risks. Firms can be expected to disclose a vast amount of textual information in the short term to identify these risks in their financial and non-financial reports, particularly in response to the growing regulation on the matter. To this end, this paper applies state-of-the-art NLP techniques to detect climate-related text in corpora. We use transfer learning to fine-tune two transformer models: BERT and ClimateBert, a recently published DistilRoBERTa-based model specifically tailored for climate text classification. Both models are based on the transformer architecture, which enables learning the contextual relationships between words in a text. We fine-tune both models on the novel ClimaText database, which consists of data collected from Wikipedia, 10-K filings and web-based claims. The text classification model obtained by fine-tuning ClimateBert on ClimaText outperforms both the models created with BERT and the current state-of-the-art transformer on this particular problem. Our study is the first to apply the recently published ClimateBert model to the ClimaText database. Based on our results, ClimateBert fine-tuned on ClimaText is an outstanding tool among pre-trained NLP transformer models, one that can and should be used by investors, institutional agents and companies themselves to monitor the disclosure of climate risk in financial reports. In addition, our transfer learning methodology is computationally cheap, so any organization can perform it.
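The transfer-learning idea in the abstract (reuse a pre-trained encoder, then train a classifier on labeled sentences) can be illustrated with a minimal sketch. This is not the authors' pipeline: the `encode` function below is a hypothetical frozen stand-in (hashed bag-of-words) for a pre-trained encoder such as ClimateBert, only a logistic-regression head is trained, and the six sentences are invented for illustration rather than taken from ClimaText.

```python
import math
import zlib

DIM = 256  # size of the stand-in feature vector (arbitrary choice)


def encode(text):
    """Frozen stand-in encoder: hashed bag-of-words, L2-normalised.

    Assumption: this replaces a real pre-trained transformer encoder.
    """
    vec = [0.0] * DIM
    for token in text.lower().split():
        vec[zlib.crc32(token.encode("utf-8")) % DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def predict_proba(weights, bias, features):
    """Probability that the sentence is climate-related (label 1)."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def train_head(examples, epochs=200, lr=0.5):
    """Train only the classification head with plain SGD on log-loss."""
    weights, bias = [0.0] * DIM, 0.0
    encoded = [(encode(text), label) for text, label in examples]
    for _ in range(epochs):
        for features, label in encoded:
            error = predict_proba(weights, bias, features) - label
            weights = [w - lr * error * x for w, x in zip(weights, features)]
            bias -= lr * error
    return weights, bias


# Invented toy sentences: 1 = climate-related, 0 = not climate-related.
TRAIN = [
    ("rising sea levels threaten our coastal facilities", 1),
    ("carbon emissions regulation may increase operating costs", 1),
    ("extreme weather events could disrupt our supply chain", 1),
    ("the board approved a quarterly dividend", 0),
    ("revenue grew due to strong smartphone sales", 0),
    ("we opened three new retail stores this year", 0),
]

weights, bias = train_head(TRAIN)
train_accuracy = sum(
    (predict_proba(weights, bias, encode(text)) >= 0.5) == bool(label)
    for text, label in TRAIN
) / len(TRAIN)
```

In the paper's setting the encoder weights are also updated during fine-tuning, so this sketch shows only the cheaper "frozen encoder plus trained head" variant of transfer learning.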


URL

https://arxiv.org/abs/2303.13373

PDF

https://arxiv.org/pdf/2303.13373.pdf

