Atomas: Hierarchical Alignment on Molecule-Text for Unified Molecule Understanding and Generation

Abstract

Molecule-text cross-modal representation learning has emerged as a promising direction for improving the quality of molecular representations, and thereby performance in scientific fields such as drug discovery and materials science. Existing studies adopt a global alignment approach to learn knowledge from the different modalities. These global alignment approaches fail to capture fine-grained information, such as molecular fragments and their corresponding textual descriptions, which is crucial for downstream tasks. Furthermore, such information cannot be modeled with a similar global alignment strategy, because existing datasets lack paired data with local, fragment-level annotations. In this paper, we propose Atomas, a multi-modal molecular representation learning framework that jointly learns representations from SMILES strings and text. We design a Hierarchical Adaptive Alignment model that concurrently learns the fine-grained fragment correspondence between the two modalities and aligns these fragment representations at three levels. Additionally, Atomas's end-to-end training framework incorporates both molecule understanding and molecule generation tasks, thereby supporting a wider range of downstream tasks. In the retrieval task, Atomas exhibits robust generalization ability and outperforms the baseline by 30.8% in recall@1 on average. In the generation task, Atomas achieves state-of-the-art results in both the molecule captioning task and the molecule generation task. Moreover, visualization of the Hierarchical Adaptive Alignment model further confirms the chemical significance of our approach. Our code can be found at https://anonymous.4open.science/r/Atomas-03C3.
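
The retrieval result above is reported as recall@1. As a minimal sketch of how that metric is commonly computed (this is not the authors' evaluation code; the function name `recall_at_k` and the toy scores are hypothetical), assume a similarity matrix in which row i holds the scores of text query i against every candidate molecule, and the correct molecule for query i sits at column i:

```python
from typing import Sequence


def recall_at_k(similarity: Sequence[Sequence[float]], k: int = 1) -> float:
    """Fraction of queries whose paired item (same index) ranks in the top k.

    Assumption for illustration: query i is paired with candidate i.
    """
    if not similarity:
        raise ValueError("similarity matrix is empty")
    hits = 0
    for query_idx, scores in enumerate(similarity):
        # Rank candidate indices by descending similarity to this query.
        ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
        if query_idx in ranked[:k]:
            hits += 1
    return hits / len(similarity)


# Hypothetical toy scores: rows are text queries, columns are candidate molecules.
toy_scores = [
    [0.9, 0.1, 0.3],  # query 0: best match is molecule 0 (correct)
    [0.2, 0.4, 0.8],  # query 1: best match is molecule 2 (incorrect)
    [0.1, 0.2, 0.7],  # query 2: best match is molecule 2 (correct)
]
toy_recall_at_1 = recall_at_k(toy_scores, k=1)
```

In this toy case two of the three queries retrieve their paired molecule first, so recall@1 is 2/3. Whether the reported 30.8% gain is an absolute or a relative difference in this quantity is not stated in the abstract.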

URL

https://arxiv.org/abs/2404.16880

PDF

https://arxiv.org/pdf/2404.16880.pdf
