Abstract
Context. Developing secure and reliable software remains a key challenge in software engineering (SE). The ever-evolving technological landscape offers both opportunities and threats, creating a dynamic space where chaos and order compete. Secure software engineering (SSE) must continuously address vulnerabilities that endanger software systems and carry broader socio-economic risks, such as compromising critical national infrastructure and causing significant financial losses. Researchers and practitioners have explored methodologies like Static Application Security Testing Tools (SASTTs) and artificial intelligence (AI) approaches, including machine learning (ML) and large language models (LLMs), to detect and mitigate these vulnerabilities. Each method has unique strengths and limitations.

Aim. This thesis seeks to bring order to the chaos in SSE by addressing domain-specific differences that impact AI accuracy.

Methodology. The research employs a mix of empirical strategies, such as evaluating effort-aware metrics, analyzing SASTTs, conducting method-level analysis, and leveraging evidence-based techniques like systematic dataset reviews. These approaches help characterize vulnerability prediction datasets.

Results. Key findings include limitations of static analysis tools in identifying vulnerabilities, gaps in SASTT coverage of vulnerability types, weak relationships among vulnerability severity scores, improved defect prediction accuracy using just-in-time modeling, and threats posed by untouched methods.

Conclusions. This thesis highlights the complexity of SSE and the importance of contextual knowledge in improving AI-driven vulnerability and defect prediction. The comprehensive analysis advances effective prediction models, benefiting both researchers and practitioners.
URL
https://arxiv.org/abs/2501.05165