Abstract
Digital pathology has significantly advanced disease detection and pathologist efficiency through the analysis of gigapixel whole-slide images (WSIs). In this process, each WSI is first divided into patches; a feature extractor model is applied to each patch to obtain feature vectors, which are then processed by an aggregation model to predict the label of the WSI. With the rapid evolution of representation learning, numerous new feature extractor models, often termed foundational models, have emerged. Traditional evaluation methods, however, rely on fixed aggregation model hyperparameters, a practice we identify as potentially biasing the results. Our study uncovers a co-dependence between feature extractor models and aggregation model hyperparameters, indicating that performance comparisons can be skewed by the chosen hyperparameters. By accounting for this co-dependence, we find that the performance of many current feature extractor models is notably similar. We support this insight by evaluating seven feature extractor models across three different datasets with 162 different aggregation model configurations. This comprehensive approach provides a more nuanced understanding of the relationship between feature extractors and aggregation models, leading to a fairer and more accurate assessment of feature extractor models in digital pathology.
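To illustrate why a fixed aggregation configuration can bias a comparison, the following is a minimal sketch in Python, not the evaluation code of the study. Everything in it is an assumption made for illustration: the hyperparameter grid, the extractor names, and the `toy_score` function, which is a synthetic stand-in for "train an aggregation model on the extracted features and return a validation metric". The numbers it produces are invented and are not results from the paper.

```python
import itertools
import math

# Hypothetical hyperparameter grid for the aggregation model.
# The axes and values are assumptions, not the grid used in the study.
GRID = {
    "learning_rate": [1e-4, 1e-3, 1e-2],
    "hidden_dim": [128, 256, 512],
    "dropout": [0.0, 0.25, 0.5],
}

# Hypothetical extractors, each with an invented "preferred" learning rate.
PREFERRED_LR = {
    "extractor_a": 1e-4,
    "extractor_b": 1e-3,
    "extractor_c": 1e-2,
}


def all_configs(grid):
    """Enumerate every combination of hyperparameter values in the grid."""
    keys = list(grid)
    return [dict(zip(keys, values)) for values in itertools.product(*grid.values())]


def toy_score(extractor, config):
    """Synthetic stand-in for training an aggregator and measuring it.

    The score drops as the learning rate moves away from the extractor's
    preferred value and as dropout increases. hidden_dim is ignored here.
    """
    gap = abs(
        math.log10(config["learning_rate"]) - math.log10(PREFERRED_LR[extractor])
    )
    return round(0.90 - 0.05 * gap - 0.02 * config["dropout"], 4)


def fixed_config_scores(config):
    """Score every extractor under one shared aggregation configuration."""
    return {name: toy_score(name, config) for name in PREFERRED_LR}


def tuned_scores(grid):
    """Score every extractor under its own best configuration from the grid."""
    configs = all_configs(grid)
    return {
        name: max(toy_score(name, config) for config in configs)
        for name in PREFERRED_LR
    }


if __name__ == "__main__":
    fixed = {"learning_rate": 1e-4, "hidden_dim": 256, "dropout": 0.0}
    print("fixed configuration:", fixed_config_scores(fixed))
    print("tuned per extractor:", tuned_scores(GRID))
```

In this toy setup, the shared configuration favors whichever extractor happens to prefer that learning rate, while tuning per extractor removes the gap. This mirrors the co-dependence described above in spirit only, since the scores are constructed by hand.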
URL
https://arxiv.org/abs/2311.17804