Deep Learning and Methods Based on Large Language Models Applied to Stellar Light Curve Classification

Light curves serve as a valuable source of information on stellar formation and evolution. With the rapid advancement of machine learning techniques, they can be effectively processed to extract astronomical patterns and information. In this study, we present a comprehensive evaluation of models bas...

Bibliographic Details
Main Authors:	Yu-Yang Li, Yu Bai, Cunshi Wang, Mengwei Qu, Ziteng Lu, Roberto Soria, Jifeng Liu
Format:	Article
Language:	English
Published:	American Association for the Advancement of Science (AAAS) 2025-01-01
Series:	Intelligent Computing
Online Access:	https://spj.science.org/doi/10.34133/icomputing.0110

author	Yu-Yang Li Yu Bai Cunshi Wang Mengwei Qu Ziteng Lu Roberto Soria Jifeng Liu
collection	DOAJ
description	Light curves serve as a valuable source of information on stellar formation and evolution. With the rapid advancement of machine learning techniques, they can be effectively processed to extract astronomical patterns and information. In this study, we present a comprehensive evaluation of models based on deep learning and large language models (LLMs) for the automatic classification of variable star light curves, using large datasets from the Kepler and K2 missions. Special emphasis is placed on Cepheids, RR Lyrae, and eclipsing binaries, examining the influence of observational cadence and phase distribution on classification precision. Employing automated deep learning optimization, we achieve striking performance using 2 architectures: one that combines one-dimensional convolution (Conv1D) with bidirectional long short-term memory (BiLSTM) and another called the Swin Transformer. These achieved accuracies of 94% and 99%, respectively, with the latter demonstrating a notable 83% accuracy in discerning the elusive type II Cepheids that comprise merely 0.02% of the total dataset. We unveil StarWhisper LightCurve (LC), a series of 3 LLM models based on an LLM, a multimodal large language model (MLLM), and a large audio language model (LALM). Each model is fine-tuned with strategic prompt engineering and customized training methods to explore the emergent abilities of these models for astronomical data. Remarkably, StarWhisper LC series models exhibit high accuracies of around 90%, considerably reducing the need for explicit feature engineering, thereby paving the way for streamlined parallel data processing and the progression of multifaceted multimodal models in astronomical applications. The study furnishes 2 detailed catalogs illustrating the impacts of phase and sampling intervals on deep learning classification accuracy, showing that a substantial decrease of up to 14% in observation duration and 21% in sampling points can be realized without compromising accuracy by more than 10%.
format	Article
id	doaj-art-a656fc6b1b9e4c099556890ecbb9c80d
institution	OA Journals
issn	2771-5892
language	English
publishDate	2025-01-01
publisher	American Association for the Advancement of Science (AAAS)
record_format	Article
series	Intelligent Computing
spelling	doaj-art-a656fc6b1b9e4c099556890ecbb9c80d; 2025-08-20T02:29:29Z; eng; American Association for the Advancement of Science (AAAS); Intelligent Computing; 2771-5892; 2025-01-01; 4; 10.34133/icomputing.0110; Deep Learning and Methods Based on Large Language Models Applied to Stellar Light Curve Classification; Yu-Yang Li (0), Yu Bai (1), Cunshi Wang (2), Mengwei Qu (3), Ziteng Lu (4), Roberto Soria (5), Jifeng Liu (6); (0) Key Laboratory of Optical Astronomy, National Astronomical Observatories, Chinese Academy of Sciences, Beijing 100101, China; (1) Key Laboratory of Optical Astronomy, National Astronomical Observatories, Chinese Academy of Sciences, Beijing 100101, China; (2) Key Laboratory of Optical Astronomy, National Astronomical Observatories, Chinese Academy of Sciences, Beijing 100101, China; (3) State Key Laboratory of Isotope Geochemistry, Guangzhou Institute of Geochemistry, Chinese Academy of Sciences, Guangzhou 510640, China; (4) School of Foreign Studies, Tongling University, Tongling, Anhui 244061, China; (5) College of Astronomy and Space Sciences, University of Chinese Academy of Sciences, Beijing 100049, China; (6) Key Laboratory of Optical Astronomy, National Astronomical Observatories, Chinese Academy of Sciences, Beijing 100101, China; https://spj.science.org/doi/10.34133/icomputing.0110
title	Deep Learning and Methods Based on Large Language Models Applied to Stellar Light Curve Classification
url	https://spj.science.org/doi/10.34133/icomputing.0110

