Abstract
Density Functional Theory (DFT) is the most widely used electronic structure method for predicting the properties of molecules and materials. Although DFT is, in principle, an exact reformulation of the Schrödinger equation, practical applications rely on approximations to the unknown exchange-correlation (XC) functional. Most existing XC functionals are constructed using a limited set of increasingly complex, hand-crafted features that improve accuracy at the expense of computational efficiency. Yet no current approximation achieves the accuracy and generality needed for predictive modeling of laboratory experiments at chemical accuracy, typically defined as errors below 1 kcal/mol. In this work, we present Skala, a modern deep learning-based XC functional that bypasses expensive hand-designed features by learning representations directly from data. Skala achieves chemical accuracy for atomization energies of small molecules while retaining the computational efficiency typical of semi-local DFT. This performance is enabled by training on an unprecedented volume of high-accuracy reference data generated using computationally intensive wavefunction-based methods. Notably, Skala systematically improves with additional training data covering diverse chemistry. By incorporating a modest amount of additional high-accuracy data tailored to chemistry beyond atomization energies, Skala achieves accuracy competitive with the best-performing hybrid functionals across general main group chemistry, at the cost of semi-local DFT. As the training dataset continues to expand, Skala is poised to further enhance the predictive power of first-principles simulations.
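As a minimal illustration of the chemical accuracy criterion mentioned in the abstract, the sketch below computes an atomization energy from total energies in Hartree and checks whether a prediction falls within 1 kcal/mol of a reference. The function names and all numerical energies are hypothetical placeholders chosen for illustration; they are not values from the paper, and this is not the evaluation code used for Skala.

```python
# Standard conversion factor: 1 Hartree is approximately 627.5095 kcal/mol.
HARTREE_TO_KCAL_PER_MOL = 627.5095


def atomization_energy(molecule_energy, atom_energies):
    """Atomization energy in Hartree: isolated-atom energies minus the molecular energy."""
    return sum(atom_energies) - molecule_energy


def within_chemical_accuracy(predicted, reference, threshold_kcal_per_mol=1.0):
    """Return True if the absolute error (inputs in Hartree) is below the threshold."""
    error_kcal_per_mol = abs(predicted - reference) * HARTREE_TO_KCAL_PER_MOL
    return error_kcal_per_mol < threshold_kcal_per_mol


# Hypothetical total energies (Hartree) for a three-atom molecule and its atoms.
predicted = atomization_energy(-76.4010, [-75.0, -0.5, -0.5])
reference = atomization_energy(-76.4000, [-75.0, -0.5, -0.5])

# An error of 0.001 Hartree is about 0.63 kcal/mol, which is below the threshold.
print(within_chemical_accuracy(predicted, reference))
```

Note that the threshold is applied to the energy difference after conversion to kcal/mol, so an error of about 0.0016 Hartree already exceeds chemical accuracy.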
URL
https://arxiv.org/abs/2506.14665