Abstract
Purpose: Automated distinct bone segmentation from CT scans is widely used in planning and navigation workflows. U-Net variants are known to provide excellent results in supervised semantic segmentation. However, distinct bone segmentation from upper-body CTs requires both a large field of view and a computationally taxing 3D architecture. This leads either to low-resolution results that lack detail, or, when high-resolution inputs are used, to localisation errors caused by missing spatial context. Methods: We propose to solve this problem with end-to-end trainable segmentation networks that combine several 3D U-Nets working at different resolutions. Our approach, which extends and generalises HookNet and MRN, captures spatial information at a lower resolution and skips the encoded information to the target network, which operates on smaller high-resolution inputs. We evaluated our proposed architecture against single-resolution networks and performed an ablation study on information concatenation and the number of context networks. Results: Our best proposed network achieves a median DSC of 0.86 taken over all 125 segmented bone classes and reduces the confusion among similar-looking bones in different locations. These results outperform our previously published 3D U-Net baseline on the task as well as distinct-bone segmentation results reported by other groups. Conclusion: The presented multi-resolution 3D U-Nets address current shortcomings in bone segmentation from upper-body CT scans by capturing a larger field of view while avoiding the cubic growth of the input voxels and intermediate computations that quickly exceeds available computational capacity in 3D. The approach thus improves the accuracy and efficiency of distinct bone segmentation from upper-body CT.
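The core mechanism the abstract describes, following HookNet-style designs, is that a context branch sees a large but low-resolution patch, and its encoded features are "hooked" (centre-cropped and concatenated) into the target branch, which works on a small high-resolution patch. The following is a minimal NumPy-only sketch of that fusion step; all shapes, pooling factors, and function names are illustrative assumptions, not taken from the paper:

```python
# Hypothetical sketch of multi-resolution feature "hooking":
# the context branch covers a large field of view at low resolution,
# and its features are cropped and concatenated onto the target branch.
import numpy as np

def avg_pool3d(vol, factor):
    """Downsample a (C, D, H, W) volume by average pooling with `factor`."""
    c, d, h, w = vol.shape
    return (vol[:, :d // factor * factor, :h // factor * factor, :w // factor * factor]
            .reshape(c, d // factor, factor, h // factor, factor, w // factor, factor)
            .mean(axis=(2, 4, 6)))

def center_crop3d(vol, shape):
    """Centre-crop a (C, D, H, W) volume to spatial `shape` = (d, h, w)."""
    _, d, h, w = vol.shape
    td, th, tw = shape
    sd, sh, sw = (d - td) // 2, (h - th) // 2, (w - tw) // 2
    return vol[:, sd:sd + td, sh:sh + th, sw:sw + tw]

def hook_concat(target_feats, context_feats):
    """Crop context features to the target's spatial size, concat on channels."""
    cropped = center_crop3d(context_feats, target_feats.shape[1:])
    return np.concatenate([target_feats, cropped], axis=0)

# Target branch: small high-resolution patch of the scan.
# Context branch: the whole volume, average-pooled by 4, so it covers far
# more anatomy at the same tensor size as the target patch.
full_scan = np.random.rand(1, 64, 64, 64)          # one-channel CT volume
target_patch = full_scan[:, 24:40, 24:40, 24:40]   # 16^3 high-res crop
context_patch = avg_pool3d(full_scan, 4)           # 16^3 low-res, full FOV

fused = hook_concat(target_patch, context_patch)
print(fused.shape)  # (2, 16, 16, 16): target + hooked context channels
```

In the actual networks, the hooked tensors would be encoder/decoder feature maps rather than raw intensities, and the concatenated result would feed the target network's subsequent convolutions; the crop-and-concatenate step shown here is the part that lets high-resolution predictions use low-resolution spatial context.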
URL
https://arxiv.org/abs/2301.13674