Exploring Visual Prompts for Whole Slide Image Classification with Multiple Instance Learning

2023-03-23 09:23:52
Yi Lin, Zhongchen Zhao, Zhengjie Zhu, Lisheng Wang, Kwang-Ting Cheng, Hao Chen

Abstract

Multiple instance learning (MIL) has emerged as a popular method for classifying histopathology whole slide images (WSIs). However, existing approaches typically rely on pre-trained models from large natural image datasets, such as ImageNet, to generate instance features, which can be sub-optimal due to the significant differences between natural images and histopathology images that lead to a domain shift. In this paper, we present a novel, simple yet effective method for learning domain-specific knowledge transformation from pre-trained models to histopathology images. Our approach entails using a prompt component to assist the pre-trained model in discerning differences between the pre-trained dataset and the target histopathology dataset, resulting in improved performance of MIL models. We validate our method on two publicly available datasets, Camelyon16 and TCGA-NSCLC. Extensive experimental results demonstrate the significant performance improvement of our method for different MIL models and backbones. Upon publication of this paper, we will release the source code for our method.
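
The idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not say how the prompt is applied, so the additive input prompt, the fixed linear map standing in for the frozen pre-trained backbone, and mean pooling as the MIL aggregator are all assumptions made for illustration. All function names here are hypothetical.

```python
from typing import List

Vector = List[float]


def add_prompt(patch: Vector, prompt: Vector) -> Vector:
    """Shift one patch by the prompt (additive form is an assumption)."""
    return [x + p for x, p in zip(patch, prompt)]


def frozen_extractor(patch: Vector, weights: List[Vector]) -> Vector:
    """Hypothetical stand-in for a frozen backbone: fixed linear map + ReLU."""
    return [max(0.0, sum(w * x for w, x in zip(row, patch))) for row in weights]


def mean_pool(features: List[Vector]) -> Vector:
    """Simplest MIL aggregator: average instance features into one bag feature."""
    n = len(features)
    return [sum(column) / n for column in zip(*features)]


def bag_feature(bag: List[Vector], prompt: Vector, weights: List[Vector]) -> Vector:
    """Prompt every instance, extract features with frozen weights, then pool."""
    return mean_pool([frozen_extractor(add_prompt(p, prompt), weights) for p in bag])


# A "slide" is a bag of patches; here two toy 2-D patches.
bag = [[1.0, 2.0], [3.0, 4.0]]
weights = [[1.0, 0.0], [0.0, -1.0]]  # frozen, never updated

without_prompt = bag_feature(bag, [0.0, 0.0], weights)
with_prompt = bag_feature(bag, [0.5, -5.0], weights)
print(without_prompt, with_prompt)
```

In this toy setup the second feature channel is zero for every patch until the prompt shifts the inputs into the range where the frozen extractor responds. In practice only the prompt (and the MIL head) would be trained, with the backbone weights left untouched.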

URL

https://arxiv.org/abs/2303.13122

PDF

https://arxiv.org/pdf/2303.13122.pdf

