Retrieval-Enhanced Mutation Mastery: Augmenting Zero-Shot Prediction of Protein Language Model

2024-10-28 15:28:51
Yang Tan, Ruilin Wang, Banghao Wu, Liang Hong, Bingxin Zhou

Abstract

Enzyme engineering enables the modification of wild-type proteins to meet industrial and research demands by enhancing catalytic activity, stability, binding affinities, and other properties. The emergence of deep learning methods for protein modeling has demonstrated superior results at lower costs compared to traditional approaches such as directed evolution and rational design. In mutation effect prediction, the key to pre-training deep learning models lies in accurately interpreting the complex relationships among protein sequence, structure, and function. This study introduces a retrieval-enhanced protein language model for comprehensive analysis of native properties from sequence and local structural interactions, as well as evolutionary properties from retrieved homologous sequences. The state-of-the-art performance of the proposed ProtREM is validated on over 2 million mutants across 217 assays from an open benchmark (ProteinGym). We also conducted post-hoc analyses of the model's ability to improve the stability and binding affinity of a VHH antibody. Additionally, we designed 10 new mutants on a DNA polymerase and conducted wet-lab experiments to evaluate their enhanced activity at higher temperatures. Both in silico and experimental evaluations confirmed that our method provides reliable predictions of mutation effects, offering an auxiliary tool for biologists aiming to evolve existing enzymes. The implementation is publicly available at this https URL.

URL

https://arxiv.org/abs/2410.21127

PDF

https://arxiv.org/pdf/2410.21127.pdf

