Deep Learning and LLM-based Methods Applied to Stellar Lightcurve Classification

2024-04-16 17:35:25
Yu-Yang Li, Yu Bai, Cunshi Wang, Mengwei Qu, Ziteng Lu, Roberto Soria, Jifeng Liu

Abstract

Light curves serve as a valuable source of information on stellar formation and evolution. With the rapid advancement of machine learning techniques, they can be effectively processed to extract astronomical patterns and information. In this study, we present a comprehensive evaluation of deep-learning and large language model (LLM) based models for the automatic classification of variable star light curves, based on large datasets from the Kepler and K2 missions. Special emphasis is placed on Cepheids, RR Lyrae, and eclipsing binaries, examining the influence of observational cadence and phase distribution on classification precision. Employing AutoDL optimization, we achieve striking performance with the 1D-Convolution+BiLSTM architecture and the Swin Transformer, reaching accuracies of 94% and 99% respectively, with the latter demonstrating a notable 83% accuracy in discerning the elusive Type II Cepheids, which comprise merely 0.02% of the total dataset. We unveil StarWhisper LightCurve (LC), an innovative series comprising three LLM-based models: an LLM, a multimodal large language model (MLLM), and a Large Audio Language Model (LALM). Each model is fine-tuned with strategic prompt engineering and customized training methods to explore the emergent abilities of these models for astronomical data. Remarkably, the StarWhisper LC series exhibits high accuracies of around 90%, significantly reducing the need for explicit feature engineering, thereby paving the way for streamlined parallel data processing and the progression of multifaceted multimodal models in astronomical applications. The study furnishes two detailed catalogs illustrating the impacts of phase and sampling intervals on deep learning classification accuracy, showing that a substantial decrease of up to 14% in observation duration and 21% in sampling points can be realized without compromising accuracy by more than 10%.
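
To make the closing claim about observation duration and sampling points concrete, the following is a minimal Python sketch of the kind of preprocessing it implies: shortening a light curve, thinning its samples, and folding it on a period. It is not the authors' pipeline; the abstract does not say how the reductions were performed. The function names, the synthetic sinusoidal light curve, the 0.02-day cadence, and the 0.5-day period are all assumptions made for illustration.

```python
import math


def truncate_duration(times, fluxes, keep_fraction):
    """Keep only the earliest `keep_fraction` of the observed time span."""
    t0, t1 = times[0], times[-1]
    cutoff = t0 + keep_fraction * (t1 - t0)
    kept = [(t, f) for t, f in zip(times, fluxes) if t <= cutoff]
    return [t for t, _ in kept], [f for _, f in kept]


def thin_samples(times, fluxes, keep_fraction):
    """Keep roughly `keep_fraction` of the points, evenly spaced by index."""
    n = len(times)
    n_keep = max(1, round(n * keep_fraction))
    idx = sorted({min(n - 1, int(i * n / n_keep)) for i in range(n_keep)})
    return [times[i] for i in idx], [fluxes[i] for i in idx]


def phase_fold(times, fluxes, period, epoch=0.0):
    """Map each time to a phase in [0, 1) and return points sorted by phase."""
    folded = sorted(
        (((t - epoch) / period) % 1.0, f) for t, f in zip(times, fluxes)
    )
    return [p for p, _ in folded], [f for _, f in folded]


# Synthetic example (assumed values, not Kepler/K2 data).
PERIOD = 0.5  # days, hypothetical pulsation period
times = [i * 0.02 for i in range(1000)]  # hypothetical 0.02-day cadence
fluxes = [1.0 + 0.1 * math.sin(2 * math.pi * t / PERIOD) for t in times]

# A 14% shorter baseline, then 21% fewer sampling points.
t_short, f_short = truncate_duration(times, fluxes, keep_fraction=0.86)
t_thin, f_thin = thin_samples(t_short, f_short, keep_fraction=0.79)
phases, folded_flux = phase_fold(t_thin, f_thin, PERIOD)
```

The folded, reduced series (`phases`, `folded_flux`) is the sort of input one could pass to a classifier to measure how accuracy changes as the two `keep_fraction` values are lowered.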

URL

https://arxiv.org/abs/2404.10757

PDF

https://arxiv.org/pdf/2404.10757.pdf

