Abstract
Featurizing microscopy images for use in biological research remains a significant challenge, especially for large-scale experiments spanning millions of images. This work explores the scaling properties of weakly supervised classifiers and self-supervised masked autoencoders (MAEs) when trained with increasingly large model backbones and microscopy datasets. Our results show that ViT-based MAEs outperform weakly supervised classifiers on a variety of tasks, achieving as much as an 11.5% relative improvement when recalling known biological relationships curated from public databases. Additionally, we develop a new channel-agnostic MAE architecture (CA-MAE) that accepts images with different numbers and orderings of channels at inference time. We demonstrate that CA-MAEs generalize effectively by inferring and evaluating on a microscopy image dataset (JUMP-CP) generated under different experimental conditions and with a different channel structure than our pretraining data (RPI-93M). Our findings motivate continued research into scaling self-supervised learning on microscopy data in order to create powerful foundation models of cellular biology that have the potential to catalyze advancements in drug discovery and beyond.
URL
https://arxiv.org/abs/2404.10242