Abstract
Template mining is one of the foundational tasks in log analysis, which in turn supports the diagnosis and troubleshooting of large-scale Web applications. This paper develops a human-in-the-loop template mining framework to support interactive log analysis, which is highly desirable in real-world diagnosis and troubleshooting of Web applications but which previous template mining algorithms fail to support. We formulate three types of lightweight user feedback and, based on them, design three atomic human-in-the-loop template mining algorithms. We derive mild conditions under which the outputs of the proposed algorithms are provably correct, and we derive upper bounds on the computational complexity and query complexity of each algorithm. We demonstrate the versatility of the proposed algorithms by combining them to improve the template mining accuracy of five representative algorithms on sixteen widely used benchmark datasets.
URL
https://arxiv.org/abs/2301.12225