Paper Reading AI Learner

Deep Learning for Video-Based Assessment of Endotracheal Intubation Skills

2024-04-17 20:28:15
Jean-Paul Ainam, Erim Yanik, Rahul Rahul, Taylor Kunkes, Lora Cavuoto, Brian Clemency, Kaori Tanaka, Matthew Hackett, Jack Norfleet, Suvranu De

Abstract

Endotracheal intubation (ETI) is an emergency procedure performed in civilian and combat casualty care settings to establish an airway. Objective and automated assessment of ETI skills is essential for the training and certification of healthcare providers. However, the current approach is based on manual feedback by an expert, which is subjective, time- and resource-intensive, and prone to poor inter-rater reliability and halo effects. This work proposes a framework to evaluate ETI skills using single- and multi-view videos. The framework consists of two stages. First, a 2D convolutional autoencoder (AE) and a pre-trained self-supervised network extract features from videos. Second, a 1D convolutional network enhanced with a cross-view attention module takes the features from the AE as input and outputs predictions for skill evaluation. The ETI datasets were collected in two phases. In the first phase, ETI is performed by two subject cohorts: Experts and Novices. In the second phase, novice subjects perform ETI under time pressure, and the outcome is either Successful or Unsuccessful. A third dataset of videos from a single head-mounted camera for Experts and Novices is also analyzed. The study achieved an accuracy of 100% in identifying Expert/Novice trials in the initial phase. In the second phase, the model showed 85% accuracy in classifying Successful/Unsuccessful procedures. Using the head-mounted camera alone, the model achieved 96% accuracy on Expert/Novice classification while maintaining 85% accuracy in classifying Successful/Unsuccessful trials. In addition, GradCAM visualizations are presented to explain the differences between Expert and Novice behavior and between Successful and Unsuccessful trials. The approach offers a reliable and objective method for automated assessment of ETI skills.
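
The abstract describes a two-stage architecture: per-frame features from a 2D convolutional autoencoder, then a 1D convolutional network over time with a cross-view attention module for classification. The sketch below is a minimal, hypothetical PyTorch illustration of that pipeline, not the authors' released code: the layer sizes, the attention formulation, and all names are illustrative assumptions, and the pre-trained self-supervised feature extractor and GradCAM step are omitted.

```python
# Hypothetical sketch of the two-stage pipeline described in the abstract.
# All layer sizes and names are assumptions, not the authors' implementation.
import torch
import torch.nn as nn


class FrameAutoencoder(nn.Module):
    """Stage 1: per-frame 2D conv autoencoder; the encoder output is the feature."""

    def __init__(self, feat_dim: int = 128):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim),
        )
        self.decoder = nn.Sequential(  # used only for reconstruction pre-training
            nn.Linear(feat_dim, 64 * 8 * 8), nn.ReLU(),
            nn.Unflatten(1, (64, 8, 8)),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, frames):            # frames: (B*T, 3, H, W)
        return self.encoder(frames)       # (B*T, feat_dim)


class CrossViewSkillClassifier(nn.Module):
    """Stage 2: cross-view attention fusion, then a 1D conv over time and a class score."""

    def __init__(self, feat_dim: int = 128, n_classes: int = 2):
        super().__init__()
        self.attn = nn.MultiheadAttention(feat_dim, num_heads=4, batch_first=True)
        self.temporal = nn.Sequential(
            nn.Conv1d(feat_dim, 64, 5, padding=2), nn.ReLU(),
            nn.Conv1d(64, 64, 5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(),
        )
        self.head = nn.Linear(64, n_classes)

    def forward(self, feats):             # feats: (B, n_views, T, feat_dim)
        b, v, t, d = feats.shape
        x = feats.permute(0, 2, 1, 3).reshape(b * t, v, d)   # attend across views per time step
        fused, _ = self.attn(x, x, x)                        # cross-view attention
        fused = fused.mean(dim=1).reshape(b, t, d)           # (B, T, feat_dim)
        logits = self.head(self.temporal(fused.transpose(1, 2)))
        return logits                                        # e.g. Expert vs. Novice


# Toy forward pass: 2 videos, 2 views, 16 frames of 64x64 RGB each.
ae = FrameAutoencoder()
clf = CrossViewSkillClassifier()
video = torch.rand(2, 2, 16, 3, 64, 64)
feats = ae(video.view(-1, 3, 64, 64)).view(2, 2, 16, -1)
print(clf(feats).shape)  # torch.Size([2, 2])
```

For the single head-mounted camera setting, the same classifier could be run with one view, in which case the attention step reduces to self-attention over a single view's features.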

Abstract (translated)

Endotracheal intubation (ETI) is an emergency procedure performed in civilian and combat casualty care settings to establish an airway. Objective and automated assessment of ETI skills is essential for the training and certification of healthcare providers. However, the current approach relies on manual feedback from an expert, which is subjective, time- and resource-intensive, and prone to poor inter-rater reliability and halo effects. This work proposes a framework for evaluating ETI skills from single- and multi-view videos. The framework consists of two stages. In the first stage, a 2D convolutional autoencoder (AE) and a pre-trained self-supervised network extract features from the videos. In the second stage, a 1D convolutional network with a cross-view attention module takes the AE features as input and outputs skill-evaluation predictions. The ETI datasets were collected in two phases. In the first phase, ETI was performed by two subject cohorts: Experts and Novices. In the second phase, novice subjects performed ETI under time pressure, with the outcome labeled Successful or Unsuccessful. A further dataset of videos from a single head-mounted camera, collected from Experts and Novices, was also analyzed. The study achieved 100% accuracy in identifying Expert/Novice trials in the first phase. In the second phase, the model achieved 85% accuracy in classifying Successful/Unsuccessful procedures. Using the head-mounted camera alone, the model achieved 96% accuracy on Expert/Novice classification while maintaining 85% accuracy on Successful/Unsuccessful classification. In addition, GradCAM visualizations are presented to explain the differences between Expert and Novice behavior and between Successful and Unsuccessful trials. The approach offers a reliable and objective method for the automated assessment of ETI skills.

URL

https://arxiv.org/abs/2404.11727

PDF

https://arxiv.org/pdf/2404.11727.pdf
