Weakly Supervised Object Detection with 2D and 3D Regression Neural Networks

Abstract

Weakly supervised detection methods can infer the location of target objects in an image without requiring location or appearance information during training. We propose a weakly supervised deep learning method for the detection of objects that appear at multiple locations in an image. The method computes attention maps from the last feature maps of an encoder-decoder network optimized only with global labels, namely the number of occurrences of the target object in an image. In contrast with previous approaches, the decoder allows attention maps to be generated at full input resolution. The proposed approach is compared to multiple state-of-the-art methods on two tasks: the detection of digits in MNIST-based datasets, and the real-life application of detecting enlarged perivascular spaces (a type of brain lesion) in four brain regions in a dataset of 2202 3D brain MRI scans. On the MNIST-based datasets, the proposed method outperforms the other methods. On the brain dataset, several weakly supervised detection methods come close to the human intra-rater agreement in each region. The proposed method reaches the lowest number of false positive detections in all brain regions at the operating point, while its average sensitivity is similar to that of the other best methods.
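
The abstract does not say how detections are read off the attention map, so the following is only a minimal sketch under stated assumptions: that the network yields a full-resolution 2D attention map plus a regressed object count, and that detections are taken as the strongest local maxima of the map, as many as the rounded count. The function names `local_maxima` and `detect`, the 8-neighbour definition of a peak, and the toy map are all hypothetical and are not taken from the paper.

```python
def local_maxima(attention):
    """Return (value, row, col) for each pixel strictly greater than all of its neighbours.

    Assumption: a peak is defined on the 8-connected neighbourhood (fewer at borders).
    """
    rows, cols = len(attention), len(attention[0])
    peaks = []
    for r in range(rows):
        for c in range(cols):
            v = attention[r][c]
            neighbours = [
                attention[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if (rr, cc) != (r, c)
            ]
            if all(v > n for n in neighbours):
                peaks.append((v, r, c))
    return peaks


def detect(attention, predicted_count):
    """Keep the strongest peaks, as many as the rounded predicted count.

    Assumption: the regressed count is used to choose the number of detections.
    """
    n = max(0, round(predicted_count))
    peaks = sorted(local_maxima(attention), reverse=True)  # strongest first
    return [(r, c) for _, r, c in peaks[:n]]


# Toy 5x5 attention map (hypothetical values) with three peaks of decreasing strength.
attention = [
    [0.0, 0.1, 0.0, 0.0, 0.0],
    [0.1, 0.9, 0.1, 0.0, 0.0],
    [0.0, 0.1, 0.0, 0.2, 0.0],
    [0.0, 0.0, 0.2, 0.7, 0.2],
    [0.3, 0.0, 0.0, 0.2, 0.0],
]

print(detect(attention, 2.2))  # [(1, 1), (3, 3)]
```

With a regressed count of 2.2, only the two strongest peaks are kept and the weak peak at (4, 0) is discarded. Because the map is at full input resolution, the returned coordinates index input pixels directly; a 3D version would apply the same idea over a 26-voxel neighbourhood.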

URL

https://arxiv.org/abs/1906.01891

PDF

https://arxiv.org/pdf/1906.01891.pdf