EPEE: Towards Efficient and Effective Foundation Models in Biomedicine

2025-03-03 21:11:13
Zaifu Zhan, Shuang Zhou, Huixue Zhou, Zirui Liu, Rui Zhang

Abstract

Foundation models, including language models, e.g., GPT, and vision models, e.g., CLIP, have significantly advanced numerous biomedical tasks. Despite these advancements, the high inference latency and the "overthinking" issues in model inference impair the efficiency and effectiveness of foundation models, thus limiting their application in real-time clinical settings. To address these challenges, we proposed EPEE (Entropy- and Patience-based Early Exiting), a novel hybrid strategy designed to improve the inference efficiency of foundation models. The core idea was to leverage the strengths of entropy-based and patience-based early exiting methods to overcome their respective weaknesses. To evaluate EPEE, we conducted experiments on three core biomedical tasks (classification, relation extraction, and event extraction) using four foundation models (BERT, ALBERT, GPT-2, and ViT) across twelve datasets, including clinical notes and medical images. The results showed that EPEE significantly reduced inference time while maintaining or improving accuracy, demonstrating its adaptability to diverse datasets and tasks. EPEE addressed critical barriers to deploying foundation models in healthcare by balancing efficiency and effectiveness. It potentially provided a practical solution for real-time clinical decision-making with foundation models, supporting reliable and efficient workflows.
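
The abstract does not spell out the exact exit rule, so the following is only a minimal sketch of how an entropy criterion and a patience criterion might be combined. It assumes each layer has its own classifier producing a probability vector, and that inference stops as soon as either the prediction entropy drops below a threshold or the predicted label has stayed the same for a fixed number of consecutive layers. The function names, the default values of `entropy_threshold` and `patience`, and the "either condition" rule are illustrative assumptions, not the authors' published algorithm.

```python
import math


def entropy(probs):
    """Shannon entropy (natural log) of one probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def early_exit(layer_probs, entropy_threshold=0.3, patience=2):
    """Pick the layer at which to stop inference.

    layer_probs: list of per-layer class-probability lists, ordered from
    the shallowest to the deepest internal classifier (hypothetical input).
    Returns (exit_layer_index, predicted_label).

    Assumed rule (not taken from the paper): exit when the current layer's
    entropy is below `entropy_threshold`, OR when the same label has been
    predicted by `patience` consecutive layers, counting the current one.
    """
    if not layer_probs:
        raise ValueError("layer_probs must contain at least one layer")

    streak = 0
    prev_label = None
    for i, probs in enumerate(layer_probs):
        # Arg-max label predicted by this layer's classifier.
        label = max(range(len(probs)), key=probs.__getitem__)
        # Count how many consecutive layers agree on this label.
        streak = streak + 1 if label == prev_label else 1
        prev_label = label

        confident = entropy(probs) < entropy_threshold  # entropy criterion
        stable = streak >= patience                     # patience criterion
        if confident or stable:
            return i, label

    # No criterion fired: fall back to the final layer's prediction.
    return len(layer_probs) - 1, prev_label
```

Under these assumptions, `early_exit([[0.5, 0.5], [0.4, 0.6], [0.45, 0.55], [0.02, 0.98]])` stops at layer index 2 through the patience criterion, because two consecutive layers agree on label 1 before any layer becomes confident enough for the entropy criterion. A sharply peaked early prediction such as `[0.01, 0.99]` would instead trigger the entropy criterion immediately.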

URL

https://arxiv.org/abs/2503.02053

PDF

https://arxiv.org/pdf/2503.02053.pdf
