Abstract
The delineation of tumor targets and organs-at-risk (OARs) is critical in radiotherapy treatment planning. Automatic segmentation can reduce physician workload and improve consistency. However, quality assurance of automatic segmentation remains an unmet need in clinical practice. The patient data used in our study came from the standardized dataset of the AAPM Thoracic Auto-Segmentation Challenge. The OARs included were the left and right lungs, heart, esophagus, and spinal cord. Two groups of OAR contours were generated: a benchmark dataset manually contoured by experienced physicians, and a test dataset created automatically with the AccuContour software. A ResNet-152 network served as the feature extractor, and a one-class support vector classifier was used to label each contour as high or low quality. We evaluated model performance with balanced accuracy, F-score, sensitivity, specificity, and the area under the receiver operating characteristic curve (AUC). We randomly generated contour errors to assess the generalization of our method, explored its detection limit, and evaluated the correlations between the detection limit and metrics such as volume, Dice similarity coefficient, Hausdorff distance, and mean surface distance. The proposed one-class classifier outperformed binary classifiers on metrics including balanced accuracy and AUC, and showed significant improvement over binary classifiers in handling various types of errors. Our proposed model, which introduces a residual network and an attention mechanism into the one-class classification framework, was able to detect various types of OAR contour errors with high accuracy. The proposed method can significantly reduce the burden of physician review for contour delineation.
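The evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes that a low-quality contour is the positive class (label 1), that the F-score is the F1 variant, and that segmentation masks are given as sets of voxel index tuples; the function names classification_metrics and dice are hypothetical.

```python
def classification_metrics(y_true, y_pred):
    """Compute sensitivity, specificity, balanced accuracy, and F1.

    Assumption: label 1 marks a low-quality contour (positive class),
    label 0 marks a high-quality contour.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        # Balanced accuracy is the mean of sensitivity and specificity,
        # which keeps the score meaningful when classes are imbalanced.
        "balanced_accuracy": (sensitivity + specificity) / 2,
        "f1": f1,
    }


def dice(mask_a, mask_b):
    """Dice similarity coefficient between two masks given as voxel sets."""
    a, b = set(mask_a), set(mask_b)
    if not a and not b:
        return 1.0  # two empty masks are treated as identical
    return 2 * len(a & b) / (len(a) + len(b))


if __name__ == "__main__":
    # Toy labels: 3 erroneous contours, 5 acceptable ones.
    y_true = [1, 1, 1, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
    print(classification_metrics(y_true, y_pred))

    reference = {(0, 0), (0, 1), (1, 0), (1, 1)}
    automatic = {(0, 1), (1, 1), (2, 1)}
    print(dice(reference, automatic))
```

Balanced accuracy is a natural choice in this setting because erroneous contours are typically much rarer than acceptable ones, so plain accuracy would reward a classifier that flags nothing.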
URL
https://arxiv.org/abs/2405.11732