Predicting Visit Cost of Obstructive Sleep Apnea using Electronic Healthcare Records with Transformer

Abstract

Background: Obstructive sleep apnea (OSA) is growing increasingly prevalent in many countries as obesity rises. Sufficient, effective treatment of OSA entails high social and financial costs for healthcare. Objective: For treatment purposes, predicting OSA patients' visit expenses for the coming year is crucial. Reliable estimates enable healthcare decision-makers to manage finances carefully and to budget for an effective distribution of resources to hospitals. The challenges created by the scarcity of high-quality patient data are exacerbated by the fact that just a third of the data from OSA patients can be used to train analytics models: only OSA patients with more than 365 days of follow-up are relevant for predicting a year's expenditures. Methods and procedures: The authors propose a method that applies two Transformer models, one to augment the input using data from shorter visit histories and the other to predict the costs from both the enriched material and the cases with more than a year's follow-up. Results: The two-model solution puts the limited body of OSA patient data to productive use. Relative to a single-Transformer solution using only a third of the high-quality patient data, the two-model solution improved the prediction performance, measured by $R^{2}$, from 88.8% to 97.5%. Even baseline models benefited from the model-augmented data, with $R^{2}$ improving considerably, from 61.6% to 81.9%. Conclusion: The proposed method makes the most of the available high-quality data by carefully exploiting details that are not directly relevant to predicting the next year's likely expenditure.
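The one-year follow-up criterion and the $R^{2}$ metric mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the record fields (`first_visit`, `last_visit`) and the function names are hypothetical, and follow-up is assumed here to mean the number of days between a patient's first and last recorded visit. The Transformer models themselves are not reproduced.

```python
from datetime import date


def split_by_follow_up(patients, min_days=365):
    """Partition patient records by follow-up length.

    Returns (long_history, short_history). Records with MORE than
    `min_days` of follow-up go to long_history; all others go to
    short_history. The field names are assumptions for illustration.
    """
    long_history, short_history = [], []
    for record in patients:
        days = (record["last_visit"] - record["first_visit"]).days
        if days > min_days:
            long_history.append(record)
        else:
            short_history.append(record)
    return long_history, short_history


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical records, invented purely for illustration.
patients = [
    {"id": "a", "first_visit": date(2020, 1, 1), "last_visit": date(2021, 1, 1)},
    {"id": "b", "first_visit": date(2020, 1, 1), "last_visit": date(2020, 12, 31)},
    {"id": "c", "first_visit": date(2020, 6, 1), "last_visit": date(2020, 9, 1)},
]
long_history, short_history = split_by_follow_up(patients)
```

In the setting the abstract describes, only the long-history group would be directly usable for predicting a year's costs, while the short-history group is the material the first model would draw on for augmentation.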

Abstract (translated)

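Background: As obesity rises, obstructive sleep apnea (OSA) is becoming increasingly prevalent in many countries. Objective: For treatment purposes, predicting OSA patients' visit costs for the coming year is crucial. Reliable estimates enable healthcare decision-makers to carry out careful financial management and budgeting so that resources are distributed effectively to hospitals. Because high-quality patient data are scarce, only a third of the data from OSA patients can be used to train analytics models: only OSA patients with more than 365 days of follow-up are relevant for predicting a year's expenditures. Methods: The paper proposes a method that uses two Transformer models, one to augment the input with data from shorter visit histories and the other to predict costs from both the enriched material and the cases with more than a year's follow-up. Results: The two-model solution allows the limited OSA patient data to be used productively. Relative to a single-Transformer solution using only a third of the high-quality patient data, the two-model solution raised the prediction performance $R^{2}$ from 88.8% to 97.5%. Even baseline models trained with the model-augmented data improved $R^{2}$ considerably, from 61.6% to 81.9%. Conclusion: The proposed method makes predictions using as much of the available high-quality data as possible by carefully exploiting details, so that the two Transformer models make the most of the limited OSA patient data.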

URL

https://arxiv.org/abs/2301.12289

PDF

https://arxiv.org/pdf/2301.12289.pdf