Paper Reading AI Learner

Simple, Efficient and Scalable Structure-aware Adapter Boosts Protein Language Models

2024-04-23 09:05:09
Yang Tan, Mingchen Li, Bingxin Zhou, Bozitao Zhong, Lirong Zheng, Pan Tan, Ziyi Zhou, Huiqun Yu, Guisheng Fan, Liang Hong

Abstract

Fine-tuning pre-trained protein language models (PLMs) has emerged as a prominent strategy for enhancing downstream prediction tasks, often outperforming traditional supervised learning approaches. Parameter-efficient fine-tuning, a powerful technique widely applied in natural language processing, could likewise enhance the performance of PLMs. However, transferring it directly to life-science tasks is non-trivial because the training strategies and data forms differ. To address this gap, we introduce SES-Adapter, a simple, efficient, and scalable adapter method for enhancing the representation learning of PLMs. SES-Adapter fuses PLM embeddings with structural sequence embeddings to create structure-aware representations. We show that the proposed method is compatible with different PLM architectures and across diverse tasks. Extensive evaluations are conducted on 2 types of folding structures with notable quality differences, 9 state-of-the-art baselines, and 9 benchmark datasets across distinct downstream tasks. Results show that, compared to vanilla PLMs, SES-Adapter improves downstream task performance by a maximum of 11% and an average of 3%, accelerates training by a maximum of 1034% and an average of 362%, and improves the convergence rate by approximately a factor of two. Moreover, positive gains are observed even with low-quality predicted structures. The source code for SES-Adapter is available at this https URL.
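The core idea described in the abstract is to combine frozen per-residue PLM embeddings with embeddings of a discretized structural sequence (for example, FoldSeek 3Di tokens) to produce structure-aware representations. The sketch below illustrates only that fusion idea; the class name StructureAwareAdapter, the use of cross-attention as the fusion mechanism, and all dimensions are assumptions made for illustration and are not taken from the paper's implementation.

import torch
import torch.nn as nn

class StructureAwareAdapter(nn.Module):
    """Minimal sketch (hypothetical, not the authors' code): fuses frozen
    per-residue PLM embeddings with embeddings of a discretized structural
    sequence (e.g., FoldSeek 3Di tokens) via cross-attention."""

    def __init__(self, plm_dim: int, struct_vocab: int,
                 hidden_dim: int = 256, num_heads: int = 4):
        super().__init__()
        self.struct_embed = nn.Embedding(struct_vocab, hidden_dim)
        self.plm_proj = nn.Linear(plm_dim, hidden_dim)
        # Residue embeddings (queries) attend to structural-token embeddings.
        self.cross_attn = nn.MultiheadAttention(hidden_dim, num_heads, batch_first=True)
        self.norm1 = nn.LayerNorm(hidden_dim)
        self.norm2 = nn.LayerNorm(hidden_dim)
        self.ffn = nn.Sequential(
            nn.Linear(hidden_dim, hidden_dim * 2),
            nn.GELU(),
            nn.Linear(hidden_dim * 2, hidden_dim),
        )

    def forward(self, plm_emb: torch.Tensor, struct_tokens: torch.Tensor) -> torch.Tensor:
        # plm_emb: (batch, length, plm_dim) residue embeddings from a frozen PLM
        # struct_tokens: (batch, length) integer structural-sequence tokens
        q = self.plm_proj(plm_emb)
        kv = self.struct_embed(struct_tokens)
        fused, _ = self.cross_attn(q, kv, kv)
        h = self.norm1(q + fused)            # residual fusion of sequence and structure
        return self.norm2(h + self.ffn(h))   # structure-aware per-residue representation

# Toy usage: 2 proteins of length 50, a 1280-dim PLM (e.g., ESM-2 650M) and a
# 20-state structural alphabet; all values are illustrative.
adapter = StructureAwareAdapter(plm_dim=1280, struct_vocab=20)
out = adapter(torch.randn(2, 50, 1280), torch.randint(0, 20, (2, 50)))
print(out.shape)  # torch.Size([2, 50, 256])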

Abstract (translated)

Fine-tuning pre-trained protein language models (PLMs) has proven to be a prominent strategy for enhancing downstream prediction tasks, often outperforming traditional supervised learning approaches. As a powerful technique widely applied in natural language processing, parameter-efficient fine-tuning could improve the performance of PLMs. However, owing to differences in training strategies and data forms, transferring it directly to life-science tasks is not straightforward. To fill this gap, we introduce SES-Adapter, a simple, efficient, and scalable adapter method for enhancing the representation learning of PLMs. SES-Adapter combines PLM embeddings with structural sequence embeddings to create structure-aware representations. We demonstrate that the proposed method is compatible with different PLM architectures and performs well across diverse tasks. Extensive evaluations are conducted on 2 types of folding structures with notable quality differences, 9 state-of-the-art baselines, and 9 benchmark datasets across distinct downstream tasks. The results show that, compared with vanilla PLMs, SES-Adapter improves downstream task performance by up to 11% (3% on average) and significantly accelerates training, by up to 1034% (362% on average). Moreover, positive gains are observed even with low-quality predicted structures. The source code for SES-Adapter is available at this https URL.

URL

https://arxiv.org/abs/2404.14850

PDF

https://arxiv.org/pdf/2404.14850.pdf

