Paper Reading AI Learner

Frozen Language Model Helps ECG Zero-Shot Learning

2023-03-22 05:01:14
Jun Li, Che Liu, Sibo Cheng, Rossella Arcucci, Shenda Hong

Abstract

The electrocardiogram (ECG) is one of the most commonly used non-invasive and convenient medical monitoring tools for assisting in the clinical diagnosis of heart disease. Recently, deep learning (DL) techniques, particularly self-supervised learning (SSL), have demonstrated great potential in ECG classification. After fine-tuning, SSL pre-training achieves competitive performance with only a small amount of annotated data. However, current SSL methods rely on the availability of annotated data and cannot predict labels that do not exist in the fine-tuning dataset. To address this challenge, we propose Multimodal ECG-Text Self-supervised pre-training (METS), the first work to use auto-generated clinical reports to guide ECG SSL pre-training. We use a trainable ECG encoder and a frozen language model to separately embed paired ECGs and their automatically machine-generated clinical reports. The SSL objective maximizes the similarity between each ECG and its paired report while minimizing the similarity between that ECG and the other reports. On downstream classification tasks, METS achieves around a 10% performance improvement through zero-shot classification, without using any annotated data, compared to supervised and SSL baselines that rely on annotated data. Furthermore, METS achieves the highest recall and F1 scores on the MIT-BIH dataset, even though MIT-BIH contains ECG classes that differ from those in the pre-training dataset. Extensive experiments demonstrate the advantages of ECG-text multimodal self-supervised learning in terms of generalizability, effectiveness, and efficiency.
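
The pre-training objective described above is a CLIP-style symmetric contrastive loss: ECG embeddings from the trainable encoder are aligned with report embeddings from the frozen language model, treating each ECG/report pair in a batch as a positive and all other combinations as negatives. Below is a minimal PyTorch sketch of such a loss, assuming pre-computed (batch, dim) embeddings from both encoders; the function name and temperature value are illustrative and not taken from the paper.

    import torch
    import torch.nn.functional as F

    def ecg_report_contrastive_loss(ecg_emb, text_emb, temperature=0.07):
        """Symmetric InfoNCE-style loss over a batch of paired embeddings.

        ecg_emb:  (batch, dim) projected outputs of the trainable ECG encoder.
        text_emb: (batch, dim) report embeddings from the frozen language model.
        Pairs at the same batch index are positives; all other pairings are negatives.
        """
        ecg_emb = F.normalize(ecg_emb, dim=-1)
        text_emb = F.normalize(text_emb, dim=-1)
        logits = ecg_emb @ text_emb.t() / temperature       # (batch, batch) cosine similarities
        targets = torch.arange(logits.size(0), device=logits.device)
        loss_e2t = F.cross_entropy(logits, targets)         # ECG -> report direction
        loss_t2e = F.cross_entropy(logits.t(), targets)     # report -> ECG direction
        return 0.5 * (loss_e2t + loss_t2e)

At inference time, zero-shot classification can then be performed by embedding one candidate report (or textual prompt) per class with the frozen language model and assigning each ECG to the class whose text embedding it is most similar to.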

URL

https://arxiv.org/abs/2303.12311

PDF

https://arxiv.org/pdf/2303.12311.pdf

