Abstract
General detectors follow a pipeline in which feature maps extracted from ConvNets are shared between the classification and regression tasks. However, multi-orientation object detection imposes conflicting requirements: classification should be insensitive to orientation, while regression is quite sensitive to it. To address this issue, we propose an Encoder-Decoder architecture, called Rotated Feature Network (RFN), which produces rotation-sensitive feature maps (RS) for regression and rotation-invariant feature maps (RI) for classification. Specifically, the Encoder unit assigns weights to rotated feature maps. The Decoder unit extracts RS and RI by applying a resuming operator to the rotated and reweighted feature maps, respectively. To make the rotation-invariant characteristics more reliable, we adopt a metric that quantitatively evaluates rotation invariance and add it as a constraint term in the loss, yielding promising detection performance. Compared with state-of-the-art methods, our method achieves significant improvement on the NWPU VHR-10 and RSOD datasets. We further evaluate the RFN on scene classification in remote sensing images and object detection in natural images, demonstrating its good generalization ability. The proposed RFN can be integrated into an existing framework, leading to strong performance with only a slight increase in model complexity.
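The split between rotation-sensitive and rotation-invariant features can be illustrated with a toy NumPy sketch. This is not the RFN implementation: it assumes only four 90-degree rotations, and the scoring `template`, the function names, and the squared-difference penalty are all hypothetical choices made for illustration. The abstract does not specify the resuming operator or the invariance metric, so neither is reproduced here.

```python
import numpy as np

def rotated_copies(fmap):
    """Stack the four 90-degree rotations of a (C, H, W) map with H == W."""
    return np.stack([np.rot90(fmap, k, axes=(1, 2)) for k in range(4)])

def encode_weights(copies, template):
    """Hypothetical encoder: score each rotated copy, then apply a softmax."""
    scores = np.array([(c * template).sum() for c in copies])
    e = np.exp(scores - scores.max())
    return e / e.sum()

def decode(copies, weights):
    """Return (rs, ri) for this toy setting.

    rs: rotated copies concatenated along channels. Rotating the input
        cyclically shifts the channel blocks, so rs depends on orientation.
    ri: weighted sum of the copies. Rotating the input by 90 degrees
        permutes copies and weights together, so the sum is unchanged.
    """
    rs = copies.reshape(-1, *copies.shape[2:])
    ri = np.tensordot(weights, copies, axes=1)
    return rs, ri

def invariance_penalty(ri_a, ri_b):
    """Hypothetical constraint term: mean squared gap between two RI maps."""
    return float(np.mean((ri_a - ri_b) ** 2))

rng = np.random.default_rng(0)
fmap = rng.normal(size=(2, 4, 4))       # toy feature map
template = rng.normal(size=(2, 4, 4))   # hypothetical fixed scoring template

copies = rotated_copies(fmap)
rs, ri = decode(copies, encode_weights(copies, template))

# Same pipeline on the input rotated by 90 degrees.
copies_rot = rotated_copies(np.rot90(fmap, 1, axes=(1, 2)))
rs_rot, ri_rot = decode(copies_rot, encode_weights(copies_rot, template))

print("RI gap:", invariance_penalty(ri, ri_rot))
print("RS changed:", not np.allclose(rs, rs_rot))
```

In this toy setting the invariance holds exactly only for multiples of 90 degrees. For arbitrary angles, interpolation breaks the exact permutation argument, which is one plausible reason to add an explicit invariance term to the loss.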
Abstract (translated)
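General detectors follow a pipeline in which the feature maps extracted from ConvNets are shared between the classification and regression tasks. In multi-orientation object detection, however, the requirements clearly conflict: classification is insensitive to orientation, whereas regression is quite sensitive to it. To solve this problem, we provide an encoder-decoder architecture, called Rotated Feature Network (RFN), which generates rotation-sensitive feature maps (RS) for regression and rotation-invariant feature maps (RI) for classification. Specifically, the encoder unit assigns weights to the rotated feature maps. The decoder unit extracts RS and RI by performing a resuming operation on the rotated and the reweighted feature maps, respectively. To make the rotation-invariant characteristics more reliable, we adopt a metric that quantitatively evaluates rotation invariance by adding a constraint term to the loss, which yields promising detection performance. Compared with current state-of-the-art methods, our method achieves significant improvement on the NWPU VHR-10 and RSOD datasets. We further evaluate the RFN on scene classification in remote sensing images and object detection in natural images, which demonstrates its good generalization ability. The proposed RFN can be integrated into existing frameworks, achieving good performance with only a slight increase in model complexity.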
URL
https://arxiv.org/abs/1903.09839