Retrosynthetic Planning with Dual Value Networks

2023-01-31 16:43:53
Guoqing Liu, Di Xue, Shufang Xie, Yingce Xia, Austin Tripp, Krzysztof Maziarz, Marwin Segler, Tao Qin, Zongzhang Zhang, Tie-Yan Liu

Abstract

Retrosynthesis, which aims to find a route to synthesize a target molecule from commercially available starting materials, is a critical task in drug discovery and materials design. Recently, the combination of ML-based single-step reaction predictors with multi-step planners has led to promising results. However, the single-step predictors are mostly trained offline to optimize the single-step accuracy, without considering complete routes. Here, we leverage reinforcement learning (RL) to improve the single-step predictor, by using a tree-shaped MDP to optimize complete routes while retaining single-step accuracy. Desirable routes should be both synthesizable and of low cost. We propose an online training algorithm, called Planning with Dual Value Networks (PDVN), in which two value networks predict the synthesizability and cost of molecules, respectively. To maintain the single-step accuracy, we design a two-branch network structure for the single-step predictor. On the widely used USPTO dataset, our PDVN algorithm improves the search success rate of existing multi-step planners (e.g., increasing the success rate from 85.79% to 98.95% for Retro*, and reducing the number of model calls by half while solving 99.47% of molecules for RetroGraph). Furthermore, PDVN finds shorter synthesis routes (e.g., reducing the average route length from 5.76 to 4.83 for Retro*, and from 5.63 to 4.78 for RetroGraph).
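
As a rough illustration of the two quantities the abstract says the value networks predict, the toy sketch below computes synthesizability and cost exactly on a small hand-built synthesis tree. It is not the PDVN algorithm and contains no learned model. The `Node` class, the unit `step_cost`, and the rule that a route's cost is the sum of its reaction costs are all assumptions made for illustration; the abstract does not specify them.

```python
import math
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A molecule in a toy synthesis tree (hypothetical structure)."""
    name: str
    purchasable: bool = False          # commercially available starting material
    reactants: List["Node"] = field(default_factory=list)
    step_cost: float = 1.0             # assumed cost of the reaction producing this node


def synthesizable(node: Node) -> bool:
    """A molecule is synthesizable if it can be bought, or if the chosen
    reaction exists and every one of its reactants is synthesizable."""
    if node.purchasable:
        return True
    if not node.reactants:
        return False                   # dead end: no reaction and not purchasable
    return all(synthesizable(r) for r in node.reactants)


def route_cost(node: Node) -> float:
    """Assumed cost: 0 for purchasable molecules, infinity for dead ends,
    otherwise the reaction cost plus the cost of every reactant subtree."""
    if node.purchasable:
        return 0.0
    if not node.reactants:
        return math.inf
    return node.step_cost + sum(route_cost(r) for r in node.reactants)


# Target is made from A and B; B is made from C and D; A, C, D can be bought.
solved = Node("target", reactants=[
    Node("A", purchasable=True),
    Node("B", reactants=[Node("C", purchasable=True), Node("D", purchasable=True)]),
])

# Same target, but one reactant (E) is neither purchasable nor expandable.
dead_end = Node("target", reactants=[Node("A", purchasable=True), Node("E")])

print(synthesizable(solved), route_cost(solved))
print(synthesizable(dead_end), route_cost(dead_end))
```

The branching in `reactants` is what makes the problem tree-shaped: a single reaction splits one molecule into several sub-problems, and all of them must be solved for the route to count. In the setting the abstract describes, learned value networks would estimate these two quantities for molecules whose subtrees have not been fully explored.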

URL

https://arxiv.org/abs/2301.13755

PDF

https://arxiv.org/pdf/2301.13755.pdf

