Finding Strong Gravitational Lenses Through Self-Attention

2021-10-18 11:40:48
Hareesh Thuruthipilly, Adam Zadrozny, Agnieszka Pollo

Abstract

The upcoming large-scale surveys are expected to find approximately $10^5$ strong gravitational lensing systems by analyzing data volumes many orders of magnitude larger than those of the current era. In this scenario, non-automated techniques will be highly challenging and time-consuming. We propose a new automated architecture based on the principle of self-attention to find strong gravitational lenses. The advantages of self-attention based encoder models over convolutional neural networks (CNNs) are investigated, and the encoder models are analyzed to optimize performance. We constructed 21 self-attention based encoder models and four CNNs, trained to identify gravitational lenses from the Bologna Lens Challenge. Each model was trained separately using 18,000 simulated images, cross-validated using 2,000 images, and then applied to a test set of 100,000 images. We used four different metrics for evaluation: classification accuracy, the area under the receiver operating characteristic curve (AUROC), the $TPR_0$ score, and the $TPR_{10}$ score. The performance of the self-attention based encoder models is compared with that of the CNNs that participated in the challenge. The encoder models performed better than the CNNs and surpassed the CNN models that participated in the Bologna Lens Challenge by a large margin for $TPR_0$ and $TPR_{10}$. In terms of the AUROC, the encoder models matched the top CNN model while using only one-sixth of its parameters. Self-attention based models thus have a clear advantage over simpler CNNs: their low computational cost and complexity make them a highly competitive architecture relative to the currently used residual neural networks. Moreover, introducing the encoder layers can also tackle the over-fitting problem present in CNNs by acting as effective filters.
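
The abstract does not detail the encoder architecture, so the following is only a minimal sketch, assuming a ViT-style design: the image is split into patches, each patch is embedded as a token, and a stack of self-attention encoder layers feeds a binary lens/non-lens head. All names and hyperparameters here (LensEncoder, patch size, embedding width, depth) are illustrative assumptions, not taken from the paper.

    import torch
    import torch.nn as nn

    class LensEncoder(nn.Module):
        """Self-attention (transformer-encoder) lens/non-lens classifier sketch."""
        def __init__(self, img_size=100, patch=10, bands=1, dim=64, heads=4, depth=3):
            super().__init__()
            n_patches = (img_size // patch) ** 2
            # Patchify the image with a strided convolution: one token per patch.
            self.embed = nn.Conv2d(bands, dim, kernel_size=patch, stride=patch)
            self.pos = nn.Parameter(torch.zeros(1, n_patches, dim))  # learned positions
            layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads,
                                               dim_feedforward=128, batch_first=True)
            self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
            self.head = nn.Linear(dim, 1)

        def forward(self, x):                              # x: (B, bands, H, W)
            z = self.embed(x).flatten(2).transpose(1, 2)   # (B, n_patches, dim)
            z = self.encoder(z + self.pos).mean(dim=1)     # average-pool the tokens
            return torch.sigmoid(self.head(z)).squeeze(-1) # probability of lens

As for the metrics, $TPR_0$ and $TPR_{10}$ are, in the Bologna Lens Challenge convention, the true-positive rate reached at the strictest classification threshold that still admits zero (respectively ten) false positives on the test set. Assuming that convention, a small helper built on scikit-learn's ROC utilities might look like this:

    import numpy as np
    from sklearn.metrics import roc_auc_score, roc_curve

    def tpr_at_max_fp(y_true, scores, max_fp):
        """TPR at the strictest threshold admitting at most `max_fp` false positives."""
        fpr, tpr, _ = roc_curve(y_true, scores)
        n_neg = np.sum(np.asarray(y_true) == 0)
        return tpr[fpr * n_neg <= max_fp].max()

    # auroc = roc_auc_score(y_true, scores)
    # tpr0  = tpr_at_max_fp(y_true, scores, 0)
    # tpr10 = tpr_at_max_fp(y_true, scores, 10)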

URL

https://arxiv.org/abs/2110.09202

PDF

https://arxiv.org/pdf/2110.09202.pdf

