Paper Reading AI Learner

A Novel Pathology Foundation Model by Mayo Clinic, Charit'e, and Aignostics

2025-01-09 18:06:45
Maximilian Alber, Stephan Tietz, Jonas Dippel, Timo Milbich, Timoth\'ee Lesort, Panos Korfiatis, Moritz Kr\"ugener, Beatriz Perez Cancer, Neelay Shah, Alexander M\"ollers, Philipp Seegerer, Alexandra Carpen-Amarie, Kai Standvoss, Gabriel Dernbach, Edwin de Jong, Simon Schallenberg, Andreas Kunft, Helmut Hoffer von Ankershoffen, Gavin Schaeferle, Patrick Duffy, Matt Redlon, Philipp Jurmeister, David Horst, Lukas Ruff, Klaus-Robert M\"uller, Frederick Klauschen, Andrew Norgan

Abstract

Recent advances in digital pathology have demonstrated the effectiveness of foundation models across diverse applications. In this report, we present a novel vision foundation model based on the RudolfV approach. Our model was trained on a dataset comprising 1.2 million histopathology whole slide images, collected from two medical institutions: Mayo Clinic and Charité - Universtätsmedizin Berlin. Comprehensive evaluations show that our model achieves state-of-the-art performance across twenty-one public benchmark datasets, even though it is neither the largest model by parameter count nor by training dataset size.

Abstract (translated)

最近在数字病理学领域的进展展示了基础模型在各种应用中的有效性。在这份报告中,我们提出了一种基于RudolfV方法的新型视觉基础模型。我们的模型是在一个包含120万张组织病理学全景图像的数据集上训练出来的,这些数据来自两家医疗机构:梅奥诊所和柏林夏里特大学医学院。全面的评估显示,尽管在参数数量或训练数据集规模上并非最大,但该模型在二十一个公开基准测试数据集中均达到了最先进的性能水平。

URL

https://arxiv.org/abs/2501.05409

PDF

https://arxiv.org/pdf/2501.05409.pdf


Tags
3D Action Action_Localization Action_Recognition Activity Adversarial Agent Attention Autonomous Bert Boundary_Detection Caption Chat Classification CNN Compressive_Sensing Contour Contrastive_Learning Deep_Learning Denoising Detection Dialog Diffusion Drone Dynamic_Memory_Network Edge_Detection Embedding Embodied Emotion Enhancement Face Face_Detection Face_Recognition Facial_Landmark Few-Shot Gait_Recognition GAN Gaze_Estimation Gesture Gradient_Descent Handwriting Human_Parsing Image_Caption Image_Classification Image_Compression Image_Enhancement Image_Generation Image_Matting Image_Retrieval Inference Inpainting Intelligent_Chip Knowledge Knowledge_Graph Language_Model LLM Matching Medical Memory_Networks Multi_Modal Multi_Task NAS NMT Object_Detection Object_Tracking OCR Ontology Optical_Character Optical_Flow Optimization Person_Re-identification Point_Cloud Portrait_Generation Pose Pose_Estimation Prediction QA Quantitative Quantitative_Finance Quantization Re-identification Recognition Recommendation Reconstruction Regularization Reinforcement_Learning Relation Relation_Extraction Represenation Represenation_Learning Restoration Review RNN Robot Salient Scene_Classification Scene_Generation Scene_Parsing Scene_Text Segmentation Self-Supervised Semantic_Instance_Segmentation Semantic_Segmentation Semi_Global Semi_Supervised Sence_graph Sentiment Sentiment_Classification Sketch SLAM Sparse Speech Speech_Recognition Style_Transfer Summarization Super_Resolution Surveillance Survey Text_Classification Text_Generation Time_Series Tracking Transfer_Learning Transformer Unsupervised Video_Caption Video_Classification Video_Indexing Video_Prediction Video_Retrieval Visual_Relation VQA Weakly_Supervised Zero-Shot