Paper Reading AI Learner

LLMParser: An Exploratory Study on Using Large Language Models for Log Parsing

2024-04-27 20:34:29
Zeyang Ma, An Ran Chen, Dong Jae Kim, Tse-Hsun Chen, Shaowei Wang

Abstract

Logs are essential in modern software development, as they record runtime information. Log parsing, which extracts structured information from unstructured log data, is the first step in many log-based analyses. Traditional log parsers face challenges in accurately parsing logs due to the diversity of log formats, which directly impacts the performance of downstream log-analysis tasks. In this paper, we explore the potential of using Large Language Models (LLMs) for log parsing and propose LLMParser, an LLM-based log parser built on generative LLMs and few-shot tuning. We leverage four LLMs in LLMParser: Flan-T5-small, Flan-T5-base, LLaMA-7B, and ChatGLM-6B. Our evaluation on 16 open-source systems shows that LLMParser achieves statistically significantly higher parsing accuracy than state-of-the-art parsers (a 96% average parsing accuracy). We further conduct a comprehensive empirical analysis of the effect of training size, model size, and pre-training on log parsing accuracy. We find that smaller LLMs may be more effective than more complex ones; for instance, Flan-T5-base achieves results comparable to LLaMA-7B with a shorter inference time. We also find that using LLMs pre-trained on logs from other systems does not always improve parsing accuracy: while pre-trained Flan-T5-base shows an accuracy improvement, pre-trained LLaMA results in a decrease of almost 55% in group accuracy. In short, our study provides empirical evidence for using LLMs for log parsing and highlights the limitations and future research directions of LLM-based log parsers.
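To make the task concrete, the sketch below illustrates what log parsing produces: a raw log message is abstracted into a template in which variable parts (IDs, sizes, addresses) are replaced by a placeholder such as `<*>`. This is a toy rule-based illustration of the input/output format only, not LLMParser's method; LLM-based parsers like LLMParser instead learn this mapping from a few labeled examples rather than hand-crafted rules.

```python
import re

def naive_parse(log_message: str) -> str:
    """Toy rule-based parser: mask runs of digits (optionally dotted,
    e.g. IP addresses) with the template placeholder <*>."""
    return re.sub(r"\d+(\.\d+)*", "<*>", log_message)

raw = "Received block blk_3587 of size 67108864 from 10.251.42.84"
print(naive_parse(raw))
# → Received block blk_<*> of size <*> from <*>
```

Hand-crafted rules like this break down as log formats diversify across systems, which is the gap the paper's few-shot LLM approach targets.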


URL

https://arxiv.org/abs/2404.18001

PDF

https://arxiv.org/pdf/2404.18001.pdf

