Paper Reading AI Learner

Who is Snoring? Snore Based User Recognition

2023-01-28 14:28:57
Shenghao Li, Jagmohan Chauhan

Abstract

Snoring is one of the most prominent symptoms of Obstructive Sleep Apnea-Hypopnea Syndrome (OSAH), a highly prevalent disease that causes repetitive collapse and cessation of the upper airway. Accurate snore sound monitoring and analysis is therefore crucial. However, the traditional monitoring method, polysomnography (PSG), requires patients to stay at a sleep clinic for the whole night while connected to many pieces of equipment. An alternative and less invasive approach is passive monitoring using a smartphone at home or in clinical settings. But there is a challenge: the environment may be shared, so the raw audio may contain the snoring activity of a bed partner or another person. Falsely capturing snoring activity could lead to critical false alarms and misdiagnosis of patients. To address this limitation, we propose the hypothesis that snore sound contains unique identity information which can be used for user recognition. We analyzed various machine learning models: a Gaussian Mixture Model (GMM), a GMM with a Universal Background Model (GMM-UBM), and a Deep Neural Network (DNN) on MPSSC, an open-source snoring dataset, to evaluate the validity of our hypothesis. Our results are promising, as we achieved around 90% accuracy in identification and verification tasks. This work marks the first step towards understanding the practicality of snore-based user monitoring to enable multiple healthcare applications.
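The GMM-based identification pipeline the abstract describes can be sketched as follows: fit one GMM per enrolled user on that user's acoustic features, then assign a test segment to the user whose model scores it highest. This is a minimal illustration, not the authors' implementation: the feature arrays below are synthetic stand-ins for MFCC frames that would, in practice, be extracted from MPSSC snore segments with an audio front-end, and the user names and model settings are assumptions.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Synthetic "MFCC" frames: one cluster of 13-dim feature vectors per
# hypothetical user, standing in for features of real snore audio.
train = {
    "user_a": rng.normal(loc=0.0, scale=1.0, size=(200, 13)),
    "user_b": rng.normal(loc=3.0, scale=1.0, size=(200, 13)),
}

# Enrollment: fit one GMM per user on that user's feature frames.
models = {
    user: GaussianMixture(n_components=4, random_state=0).fit(feats)
    for user, feats in train.items()
}

def identify(frames):
    """Closed-set identification: return the user whose GMM gives the
    highest average log-likelihood for the test frames."""
    scores = {user: gmm.score(frames) for user, gmm in models.items()}
    return max(scores, key=scores.get)

# A held-out segment drawn from user_b's feature distribution.
test_frames = rng.normal(loc=3.0, scale=1.0, size=(50, 13))
print(identify(test_frames))  # → user_b
```

The GMM-UBM variant mentioned in the abstract differs by first training a single background model on data pooled across users and adapting per-user models from it, which helps when enrollment data per user is scarce; the verification task would threshold the score ratio between a claimed user's model and the background model instead of taking an argmax.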

Abstract (translated)

Snoring is one of the typical symptoms of Obstructive Sleep Apnea-Hypopnea Syndrome (OSAH), a highly prevalent disease that causes repeated collapse and cessation of the upper airway. Accurate monitoring and analysis of snore sounds is therefore crucial. However, the traditional monitoring method, polysomnography (PSG), requires patients to stay in a sleep clinic for the whole night, connected to many pieces of equipment. An alternative and less invasive way is passive monitoring with a smartphone at home or in clinical settings. But there is a challenge: the environment may be shared by different people, so the raw audio may contain the snoring activity of a bed partner or another person. Falsely capturing snoring activity could lead to critical false alarms and misdiagnosis. To address this problem, we propose a hypothesis that snore sound contains unique identity information that can be used for user recognition. We analyzed various machine learning models — Gaussian Mixture Model (GMM), GMM with Universal Background Model (GMM-UBM), and Deep Neural Network (DNN) — on MPSSC, an open-source snoring dataset, to evaluate the validity of our hypothesis. Our results are promising: we achieved around 90% accuracy in identification and verification tasks. This work marks the first step towards understanding the practicality of snore-based user monitoring to support multiple healthcare applications.

URL

https://arxiv.org/abs/2301.12209

PDF

https://arxiv.org/pdf/2301.12209.pdf

