Semantic-aware Contrastive Learning for Electroencephalography-to-Text Generation with Curriculum Learning

2023-01-23 00:54:48
Xiachong Feng, Xiaocheng Feng, Bing Qin

Abstract

Electroencephalography-to-Text generation (EEG-to-Text), which aims to directly generate natural text from EEG signals, has drawn increasing attention in recent years due to its enormous potential for brain-computer interfaces (BCIs). However, the marked discrepancy between the subject-dependent EEG representation and the semantic-dependent text representation poses a great challenge to this task. To mitigate this challenge, we devise a Curriculum Semantic-aware Contrastive Learning strategy (C-SCL), which effectively re-calibrates the subject-dependent EEG representation into a semantic-dependent EEG representation, thus reducing the discrepancy. Specifically, C-SCL pulls semantically similar EEG representations together while pushing apart dissimilar ones. In addition, to introduce more meaningful contrastive pairs, we employ curriculum learning both to craft those pairs and to make the learning proceed progressively. We conduct extensive experiments on the ZuCo benchmark; combined with diverse models and architectures, our method shows stable improvements across three types of metrics and achieves a new state of the art. Further investigation demonstrates not only its superiority in both the single-subject and low-resource settings but also its robust generalizability in the zero-shot setting.
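As a rough illustration of the core idea (not the authors' implementation), the sketch below shows a semantic-aware contrastive, InfoNCE-style loss in PyTorch: EEG representations whose paired sentences are semantically similar are treated as positives, and all other in-batch pairs as negatives. The function name, the temperature `tau`, the threshold `pos_thresh`, and the use of precomputed sentence-embedding similarities as the semantic signal are all assumptions made for this sketch.

```python
import torch
import torch.nn.functional as F

def semantic_contrastive_loss(eeg_repr: torch.Tensor,
                              sem_sim: torch.Tensor,
                              tau: float = 0.1,
                              pos_thresh: float = 0.8) -> torch.Tensor:
    """Semantic-aware contrastive (InfoNCE-style) loss -- illustrative sketch only.

    eeg_repr : (B, D) EEG representations from a subject-dependent encoder.
    sem_sim  : (B, B) pairwise semantic similarity of the paired sentences,
               e.g. cosine similarity of off-the-shelf sentence embeddings.
    Pairs with sem_sim >= pos_thresh act as positives; every other in-batch
    pair acts as a negative, pulling semantically similar EEG representations
    together and pushing dissimilar ones apart.
    """
    z = F.normalize(eeg_repr, dim=-1)
    logits = (z @ z.T) / tau                          # (B, B) scaled cosine sims
    batch = z.size(0)
    eye = torch.eye(batch, dtype=torch.bool, device=z.device)
    logits = logits.masked_fill(eye, float("-inf"))   # exclude self-pairs

    pos_mask = (sem_sim >= pos_thresh) & ~eye         # semantic positives
    log_prob = logits - torch.logsumexp(logits, dim=1, keepdim=True)

    # Mean log-likelihood over each anchor's positives; anchors without any
    # positive pair in the batch are skipped.
    pos_count = pos_mask.sum(dim=1)
    has_pos = pos_count > 0
    pos_log_prob = log_prob.masked_fill(~pos_mask, 0.0).sum(dim=1)
    return -(pos_log_prob[has_pos] / pos_count[has_pos]).mean()
```

A curriculum schedule in the spirit of C-SCL would then order training from easy pairs (clearly similar or clearly dissimilar sentences) toward hard, ambiguous ones, for instance by annealing `pos_thresh` or by sorting mini-batches by their semantic-similarity margin; the exact pacing function is the paper's contribution and is not reproduced here.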

URL

https://arxiv.org/abs/2301.09237

PDF

https://arxiv.org/pdf/2301.09237.pdf

