Paper Reading AI Learner

MTSGL: Multi-Task Structure Guided Learning for Robust and Interpretable SAR Aircraft Recognition

2025-04-23 07:27:08
Qishan He, Lingjun Zhao, Ru Luo, Siqian Zhang, Lin Lei, Kefeng Ji, Gangyao Kuang

Abstract

Aircraft recognition in synthetic aperture radar (SAR) imagery is a fundamental task in both military and civilian applications. Recently, deep learning (DL) has emerged as a dominant paradigm owing to its strong performance in extracting discriminative features. However, current classification algorithms focus primarily on learning a decision hyperplane, without sufficient comprehension of aircraft structural knowledge. Inspired by fine-grained aircraft annotation methods for optical remote sensing images (RSI), we first introduce a structure-based SAR aircraft annotation approach to provide supplementary structural and compositional information. On this basis, we propose a multi-task structure guided learning (MTSGL) network for robust and interpretable SAR aircraft recognition. Besides the classification task, MTSGL includes a structural semantic awareness (SSA) module and a structural consistency regularization (SCR) module. The SSA module is designed to capture structural semantic information, which helps the network gain a human-like comprehension of aircraft knowledge. The SCR module helps maintain geometric consistency between the aircraft structure in SAR imagery and the proposed annotation. In this process, structural attributes can be disentangled in a geometrically meaningful manner. In summary, MTSGL combines expert-level aircraft prior knowledge with a structure guided learning paradigm, aiming to comprehend the aircraft concept in a way analogous to the human cognitive process. Extensive experiments are conducted on a self-constructed multi-task SAR aircraft recognition dataset (MT-SARD), and the results demonstrate the superior robustness and interpretability of the proposed MTSGL.
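The abstract describes three jointly trained objectives: classification, SSA, and SCR. It does not give their loss functions, so the following is only a minimal sketch of how such a multi-task objective might be combined. Every concrete choice here is an assumption made for illustration and is not taken from the paper: cross-entropy for classification, per-pixel cross-entropy over structure labels for SSA, mean squared keypoint distance for SCR, the weights `w_ssa` and `w_scr`, and all function names.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Negative log-likelihood of the target index under softmax(logits)."""
    return -math.log(softmax(logits)[target])


def ssa_loss(pixel_logits, pixel_labels):
    """Assumed SSA stand-in: mean per-pixel cross-entropy over structure labels."""
    losses = [cross_entropy(l, t) for l, t in zip(pixel_logits, pixel_labels)]
    return sum(losses) / len(losses)


def scr_loss(pred_points, annotated_points):
    """Assumed SCR stand-in: mean squared distance between predicted and
    annotated structure keypoints, given as (x, y) pairs."""
    sq = [
        (px - ax) ** 2 + (py - ay) ** 2
        for (px, py), (ax, ay) in zip(pred_points, annotated_points)
    ]
    return sum(sq) / len(sq)


def multi_task_loss(class_logits, class_label,
                    pixel_logits, pixel_labels,
                    pred_points, annotated_points,
                    w_ssa=1.0, w_scr=1.0):
    """Weighted sum of the three task losses (weights are hypothetical)."""
    cls = cross_entropy(class_logits, class_label)
    ssa = ssa_loss(pixel_logits, pixel_labels)
    scr = scr_loss(pred_points, annotated_points)
    return cls + w_ssa * ssa + w_scr * scr


# Toy example: 2 aircraft classes, 2 pixels with 2 structure labels, 2 keypoints.
total = multi_task_loss(
    class_logits=[0.0, 0.0], class_label=0,
    pixel_logits=[[0.0, 0.0], [0.0, 0.0]], pixel_labels=[0, 1],
    pred_points=[(1.0, 2.0), (3.0, 4.0)],
    annotated_points=[(1.0, 2.0), (3.0, 4.0)],
)
```

In this sketch the auxiliary terms act as regularizers on a shared representation: with `w_ssa` and `w_scr` set to zero the objective reduces to plain classification.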


URL

https://arxiv.org/abs/2504.16467

PDF

https://arxiv.org/pdf/2504.16467.pdf

