Semantic segmentation of surgical hyperspectral images under geometric domain shifts

2023-03-20 09:50:07
Jan Sellner, Silvia Seidlitz, Alexander Studier-Fischer, Alessandro Motta, Berkin Özdemir, Beat Peter Müller-Stich, Felix Nickel, Lena Maier-Hein

Abstract

Robust semantic segmentation of intraoperative image data could pave the way for automatic surgical scene understanding and autonomous robotic surgery. Geometric domain shifts, however, remain largely unaddressed in the field, although they are common in real-world open surgeries due to variations in surgical procedures or situs occlusions. To address this gap in the literature, we (1) present the first analysis of state-of-the-art (SOA) semantic segmentation networks in the presence of geometric out-of-distribution (OOD) data, and (2) address generalizability with a dedicated augmentation technique termed "Organ Transplantation" that we adapted from the general computer vision community. In a comprehensive validation on six different OOD data sets comprising 600 RGB and hyperspectral imaging (HSI) cubes from 33 pigs semantically annotated with 19 classes, we demonstrate a large performance drop of SOA organ segmentation networks applied to geometric OOD data. Surprisingly, this holds true not only for conventional RGB data (drop of Dice similarity coefficient (DSC) by 46 %) but also for HSI data (drop by 45 %), despite the latter's rich information content per pixel. Using our augmentation scheme improves on the SOA DSC by up to 67 % (RGB) and 90 % (HSI) and renders performance on real OOD test data on par with in-distribution performance. The simplicity and effectiveness of our augmentation scheme make it a valuable network-independent tool for addressing geometric domain shifts in semantic scene segmentation of intraoperative data. Our code and pre-trained models will be made publicly available.
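
The abstract does not spell out how "Organ Transplantation" works, so the following is only a rough sketch under an assumption: that, like copy-paste style augmentations in general computer vision, it copies all pixels of one annotated class (together with their labels) from a donor image into an acceptor image at the same location. The function names `transplant_organ` and `dice`, the array shapes, and the class IDs are hypothetical and are not taken from the authors' code. The `dice` helper uses the standard per-class definition of the Dice similarity coefficient, 2|P ∩ T| / (|P| + |T|).

```python
import numpy as np


def transplant_organ(donor_img, donor_seg, acceptor_img, acceptor_seg, class_id):
    """Copy every pixel labelled `class_id` from the donor into the acceptor.

    Images have shape (H, W, C) (C = 3 for RGB, more for HSI cubes) and
    segmentations have shape (H, W). This is an assumed, simplified variant,
    not the authors' implementation.
    """
    mask = donor_seg == class_id            # (H, W) boolean mask of the organ
    out_img = acceptor_img.copy()
    out_seg = acceptor_seg.copy()
    out_img[mask] = donor_img[mask]         # copy all channels of masked pixels
    out_seg[mask] = class_id                # labels move with the pixels
    return out_img, out_seg


def dice(pred, target, class_id):
    """Dice similarity coefficient for one class; NaN if the class is absent."""
    p = pred == class_id
    t = target == class_id
    denom = p.sum() + t.sum()
    if denom == 0:
        return float("nan")
    return 2.0 * np.logical_and(p, t).sum() / denom


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    donor_img = rng.random((8, 8, 100))     # toy stand-in for an HSI cube
    acceptor_img = rng.random((8, 8, 100))
    donor_seg = np.zeros((8, 8), dtype=np.int64)
    donor_seg[2:5, 2:5] = 7                 # hypothetical organ class 7
    acceptor_seg = np.ones((8, 8), dtype=np.int64)

    aug_img, aug_seg = transplant_organ(
        donor_img, donor_seg, acceptor_img, acceptor_seg, class_id=7
    )
    print("transplanted pixels:", int((aug_seg == 7).sum()))
    print("DSC of class 7 vs. donor mask:", dice(aug_seg, donor_seg, 7))
```

Because the mask is applied per pixel across all channels, the same sketch covers both RGB images and HSI cubes; the published method may differ in how donors and classes are sampled.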

URL

https://arxiv.org/abs/2303.10972

PDF

https://arxiv.org/pdf/2303.10972

