Abstract
Chirality is a fundamental molecular property that governs stereospecific behavior in chemistry and biology. Capturing chirality in machine learning models remains challenging because stereochemical relationships are geometrically complex and traditional molecular representations often lack explicit stereochemical encoding. Existing approaches to chiral molecular representation focus primarily on central chirality, relying on handcrafted stereochemical tags or limited 3D encodings, and thus fail to generalize to more complex forms such as axial chirality. In this work, we introduce ChiDeK (Chiral Determinant Kernels), a framework that systematically integrates stereogenic information into molecular representation learning. We propose the chiral determinant kernel to encode the SE(3)-invariant chirality matrix and employ cross-attention to integrate stereochemical information from local chiral centers into the global molecular representation. This design enables explicit modeling of chirality-related features within a unified architecture that jointly encodes central and axial chirality. To support the evaluation of axial chirality, we construct a new benchmark for electronic circular dichroism (ECD) and optical rotation (OR) prediction. Across four tasks (R/S configuration classification, enantiomer ranking, ECD spectrum prediction, and OR prediction), ChiDeK achieves substantial improvements over state-of-the-art baselines, most notably over 7% higher accuracy on average on axially chiral tasks.
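As a rough illustration of why a determinant is a natural carrier of chirality information, the sketch below computes the signed volume spanned by four substituent positions around a stereocenter. This is not the ChiDeK kernel itself, whose exact form the abstract does not give; the function names `signed_volume` and `random_rotation` and the idealized tetrahedral coordinates are assumptions made for this example. The point is only that such a determinant is unchanged by rotations and translations (SE(3)) but flips sign under a mirror reflection, which is the behavior a chirality descriptor needs.

```python
import numpy as np


def signed_volume(neighbors: np.ndarray) -> float:
    """Determinant of the edge vectors from the first neighbor to the other three.

    `neighbors` is a (4, 3) array of substituent coordinates in a fixed order
    (for example, by descending priority). The order matters: swapping two
    rows flips the sign.
    """
    neighbors = np.asarray(neighbors, dtype=float)
    if neighbors.shape != (4, 3):
        raise ValueError("expected four 3D points")
    edges = neighbors[1:] - neighbors[0]  # differences remove any translation
    return float(np.linalg.det(edges))


def random_rotation(rng: np.random.Generator) -> np.ndarray:
    """Return a random proper rotation matrix (orthogonal, determinant +1)."""
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(q) < 0:
        q[:, 0] = -q[:, 0]  # flip one column to turn a reflection into a rotation
    return q


# Idealized tetrahedral arrangement of four substituents (hypothetical coordinates).
substituents = np.array(
    [[1.0, 1.0, 1.0], [1.0, -1.0, -1.0], [-1.0, 1.0, -1.0], [-1.0, -1.0, 1.0]]
)

rng = np.random.default_rng(0)
rotation = random_rotation(rng)
shift = rng.normal(size=3)

original = signed_volume(substituents)
moved = signed_volume(substituents @ rotation.T + shift)          # rigid motion
mirrored = signed_volume(substituents * np.array([-1.0, 1.0, 1.0]))  # reflect x

print(original, moved, mirrored)
```

Under these assumptions, `original` and `moved` agree up to floating-point error, while `mirrored` has the opposite sign, so the two enantiomeric arrangements are distinguished without depending on the molecule's pose.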
URL
https://arxiv.org/abs/2602.07415