Abstract
All fields of knowledge are being impacted by Artificial Intelligence. In particular, the Deep Learning paradigm enables the development of data analysis tools that support subject matter experts in a variety of sectors, from physics to the recognition of ancient languages. Palaeontology is now seeing this trend as well. This study explores the capability of Convolutional Neural Networks (CNNs), a class of Deep Learning algorithms specifically crafted for computer vision tasks, to classify images of isolated fossil shark teeth gathered from online datasets as well as from the authors' experience with Peruvian Miocene and Italian Pliocene fossil assemblages. The shark taxa included in the final, composite dataset (which consists of more than one thousand images) are representative of both extinct and extant genera, namely Carcharhinus, Carcharias, Carcharocles, Chlamydoselachus, Cosmopolitodus, Galeocerdo, Hemipristis, Notorynchus, Prionace and Squatina. We developed a CNN, named SharkNet-X, specifically tailored to our recognition task, which reaches a 5-fold cross-validated mean accuracy of 0.85 in identifying images containing a single shark tooth. Furthermore, we produced a visualization of the features extracted from the images by the last dense layer of the CNN, obtained by applying the dimensionality-reduction technique t-SNE. In addition, in order to understand and explain the behaviour of the CNN while giving a palaeontological point of view on the results, we introduced the explainability method SHAP. To the best of our knowledge, this is the first instance in which this method is applied to the field of palaeontology. The main goal of this work is to showcase how Deep Learning techniques can aid in identifying isolated fossil shark teeth, paving the way for the development of new information tools that automate the recognition and classification of fossils.
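To make the reported metric concrete, the following is a minimal sketch of how a 5-fold cross-validated mean accuracy can be computed. It is not the authors' pipeline: the helper names (`k_fold_indices`, `cross_validated_accuracy`), the nearest-centroid stand-in classifier, and the toy one-number "features" are all assumptions made for illustration, whereas the paper trains a CNN on tooth images.

```python
import random
from statistics import mean


def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle indices 0..n_samples-1 and split them into k near-equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validated_accuracy(features, labels, fit, predict, k=5, seed=0):
    """Return the mean accuracy and the per-fold accuracies over k splits."""
    folds = k_fold_indices(len(features), k, seed)
    scores = []
    for held_out in folds:
        held_out_set = set(held_out)
        train = [i for i in range(len(features)) if i not in held_out_set]
        # Fit on k-1 folds, then score on the fold that was held out.
        model = fit([features[i] for i in train], [labels[i] for i in train])
        correct = sum(predict(model, features[i]) == labels[i] for i in held_out)
        scores.append(correct / len(held_out))
    return mean(scores), scores


# Hypothetical stand-in classifier (nearest class centroid on one number);
# the paper's model is a CNN, not this.
def fit_centroids(xs, ys):
    return {c: mean(x for x, y in zip(xs, ys) if y == c) for c in set(ys)}


def predict_centroid(model, x):
    return min(model, key=lambda c: abs(model[c] - x))


# Toy data: two well-separated classes described by a single made-up feature.
toy_features = [1.0 + 0.1 * i for i in range(10)] + [5.0 + 0.1 * i for i in range(10)]
toy_labels = ["Carcharhinus"] * 10 + ["Galeocerdo"] * 10

mean_accuracy, fold_accuracies = cross_validated_accuracy(
    toy_features, toy_labels, fit_centroids, predict_centroid, k=5
)
print(f"mean accuracy over {len(fold_accuracies)} folds: {mean_accuracy:.2f}")
```

Each image is held out exactly once, so the mean over the five folds estimates performance on unseen teeth rather than on the training images. With imbalanced genera, a stratified split that preserves class proportions in every fold would likely be preferable to the plain shuffle used here.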
URL
https://arxiv.org/abs/2405.04189