Paper Reading AI Learner

LLMParser: An Exploratory Study on Using Large Language Models for Log Parsing

2024-04-27 20:34:29
Zeyang Ma, An Ran Chen, Dong Jae Kim, Tse-Hsun Chen, Shaowei Wang

Abstract

Logs are essential in modern software development, as they record runtime information. Log parsing, which extracts structured information from unstructured log data, is the first step in many log-based analyses. Traditional log parsers face challenges in accurately parsing logs due to the diversity of log formats, which directly impacts the performance of downstream log-analysis tasks. In this paper, we explore the potential of using Large Language Models (LLMs) for log parsing and propose LLMParser, an LLM-based log parser built on generative LLMs and few-shot tuning. We leverage four LLMs in LLMParser: Flan-T5-small, Flan-T5-base, LLaMA-7B, and ChatGLM-6B. Our evaluation on 16 open-source systems shows that LLMParser achieves statistically significantly higher parsing accuracy than state-of-the-art parsers (a 96% average parsing accuracy). We further conduct a comprehensive empirical analysis of the effect of training size, model size, and pre-training on log parsing accuracy. We find that smaller LLMs may be more effective than more complex ones; for instance, Flan-T5-base achieves results comparable to LLaMA-7B with a shorter inference time. We also find that using LLMs pre-trained on logs from other systems does not always improve parsing accuracy: while pre-trained Flan-T5-base shows an accuracy improvement, pre-trained LLaMA results in a decrease of almost 55% in group accuracy. In short, our study provides empirical evidence for using LLMs for log parsing and highlights the limitations and future research directions of LLM-based log parsers.
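To make the task concrete, the sketch below illustrates what log parsing produces: a raw log message is abstracted into a template in which variable parts (IDs, sizes, addresses) are replaced by a placeholder such as `<*>`. This is a toy rule-based illustration of the input/output format only, not LLMParser's method; LLM-based parsers like LLMParser instead learn this mapping from a few labeled examples rather than hand-crafted rules.

```python
import re

def naive_parse(log_message: str) -> str:
    """Toy rule-based parser: mask runs of digits (optionally dotted,
    e.g. IP addresses) with the template placeholder <*>."""
    return re.sub(r"\d+(\.\d+)*", "<*>", log_message)

raw = "Received block blk_3587 of size 67108864 from 10.251.42.84"
print(naive_parse(raw))
# → Received block blk_<*> of size <*> from <*>
```

Hand-crafted rules like this break down as log formats diversify across systems, which is the gap the paper's few-shot LLM approach targets.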


URL

https://arxiv.org/abs/2404.18001

PDF

https://arxiv.org/pdf/2404.18001.pdf

