Abstract
Today, ship hull inspection, including the examination of the external coating and the detection of defects and other external degradation such as corrosion and marine growth, is conducted underwater by means of Remotely Operated Vehicles (ROVs). The inspection relies on manual video analysis, which is time-consuming and labor-intensive. To address this, we propose an automatic video analysis system based on deep learning and computer vision that improves upon existing methods, which consider only the spatial information in individual frames of underwater ship hull inspection video. By exploring the benefits of adding temporal information and analyzing frame-based classifiers, we propose a multi-label video classification model that exploits the self-attention mechanism of transformers to capture spatiotemporal attention across consecutive video frames. The proposed method shows promising results and can serve as a benchmark for future research and development in underwater video inspection applications.
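To make the idea of attending across consecutive frames concrete, the following is a minimal sketch and not the authors' model. It assumes each frame has already been reduced to a small feature vector, uses a single attention head with identity query/key/value projections instead of learned ones, and ends in one independent sigmoid per label so that several labels can be active for the same clip. The label names, weights, and the 0.5 threshold are hypothetical values chosen for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(frames):
    """Scaled dot-product self-attention over a clip.

    `frames` is a list of T feature vectors of length D. Identity
    Q/K/V projections are an assumption made to keep the sketch short.
    """
    d = len(frames[0])
    attended = []
    for query in frames:
        # Similarity of this frame to every frame in the clip.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
            for key in frames
        ]
        weights = softmax(scores)
        # Each output frame is a weighted mix of all frames (temporal context).
        attended.append(
            [sum(w * f[j] for w, f in zip(weights, frames)) for j in range(d)]
        )
    return attended


def mean_pool(frames):
    """Average frame vectors into one clip-level vector."""
    t = len(frames)
    return [sum(f[j] for f in frames) / t for j in range(len(frames[0]))]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def classify_clip(frames, weights, biases, threshold=0.5):
    """Multi-label head: one independent sigmoid per label."""
    pooled = mean_pool(self_attention(frames))
    probs = {
        label: sigmoid(sum(w * x for w, x in zip(vec, pooled)) + biases[label])
        for label, vec in weights.items()
    }
    decisions = {label: p >= threshold for label, p in probs.items()}
    return decisions, probs


# Hypothetical 2-D frame features and label weights, for illustration only.
clip = [[1.0, 0.0], [0.9, 0.1], [1.0, 0.0]]
label_weights = {"corrosion": [5.0, 0.0], "marine_growth": [0.0, 5.0]}
label_biases = {"corrosion": -2.5, "marine_growth": -2.5}
decisions, probs = classify_clip(clip, label_weights, label_biases)
```

Because each label gets its own sigmoid rather than sharing a softmax, a clip can be flagged for more than one kind of degradation at once, which is what distinguishes multi-label from multi-class classification.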
URL
https://arxiv.org/abs/2305.17338