Abstract
Wearable EEG devices have emerged as a promising alternative to polysomnography (PSG). Because they are affordable and scalable, their widespread adoption generates massive volumes of unlabeled data that clinicians cannot analyze at scale. Meanwhile, the recent success of deep learning for sleep scoring has relied on large annotated datasets. Self-supervised learning (SSL) offers an opportunity to bridge this gap, leveraging unlabeled signals to address label scarcity and reduce annotation effort. In this paper, we present the first systematic evaluation of SSL for sleep staging using wearable EEG. We investigate a range of well-established SSL methods and evaluate them on two sleep databases acquired with the Ikon Sleep wearable EEG headband: BOAS, a high-quality benchmark containing PSG and wearable EEG recordings with consensus labels, and HOGAR, a large collection of home-based, self-recorded, and unlabeled recordings. Three evaluation scenarios are defined to study label efficiency, representation quality, and cross-dataset generalization. Results show that SSL consistently improves classification performance by up to 10% over supervised baselines, with gains most evident when labeled data is scarce. SSL achieves clinical-grade accuracy above 80% using only 5% to 10% of the labeled data, whereas the supervised approach requires twice as many labels. Additionally, SSL representations prove robust to variations in population characteristics, recording environments, and signal quality. Our findings demonstrate the potential of SSL to enable label-efficient sleep staging with wearable EEG, reducing reliance on manual annotations and advancing the development of affordable sleep monitoring systems.
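The label-efficiency scenario mentioned above can be pictured as training on progressively larger labeled subsets. The following is a minimal sketch of one way to draw such subsets, assuming stratified per-stage subsampling of scored epochs. The function name `stratified_label_subset`, the fractions, and the toy hypnogram are hypothetical illustrations and are not taken from the paper.

```python
import random
from collections import defaultdict


def stratified_label_subset(labels, fraction, seed=0):
    """Return sorted indices of a subset holding about `fraction` of each class.

    Hypothetical helper for illustration; not the authors' protocol.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    rng = random.Random(seed)

    # Group epoch indices by their sleep-stage label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    chosen = []
    for indices in by_class.values():
        # Keep at least one epoch per class so every stage stays represented.
        k = max(1, round(fraction * len(indices)))
        chosen.extend(rng.sample(indices, k))
    return sorted(chosen)


# Toy hypnogram (assumed): epochs labeled with the five AASM stages.
toy_labels = ["W"] * 200 + ["N1"] * 50 + ["N2"] * 400 + ["N3"] * 150 + ["REM"] * 200

for fraction in (0.01, 0.05, 0.10, 1.0):
    subset = stratified_label_subset(toy_labels, fraction)
    print(f"{fraction:.0%} of labels -> {len(subset)} labeled epochs")
```

In a setup like this, a classifier would be fit on each subset (on top of either a pretrained SSL encoder or a randomly initialized one) and the resulting accuracies compared across fractions. Stratifying by stage is a design choice made here so that rare stages such as N1 do not vanish at small fractions.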
URL
https://arxiv.org/abs/2510.07960