LEAVES: An Expandable Light-curve Data Set for Automatic Classification of Variable Stars

Bibliographic Details
Main Authors: Ya Fei, Ce Yu, Kun Li, Xiaodian Chen, Yajie Zhang, Chenzhou Cui, Jian Xiao, Yunfei Xu, Yihan Tao
Format: Article
Language: English
Published: IOP Publishing 2024-01-01
Series: The Astrophysical Journal Supplement Series
Subjects: Variable stars; Light curve classification; Astronomy databases
Online Access: https://doi.org/10.3847/1538-4365/ad785b
author Ya Fei
Ce Yu
Kun Li
Xiaodian Chen
Yajie Zhang
Chenzhou Cui
Jian Xiao
Yunfei Xu
Yihan Tao
author_sort Ya Fei
collection DOAJ
description As the volume of astronomical observation data grows, applying artificial intelligence methods to the automatic analysis and identification of light curves across full samples is becoming inevitable. However, no data set yet covers all known classes of variable stars and meets all research needs. Standard training data sets designed for any type of light-curve classification are still lacking, and existing light-curve training sets or data sets cannot be directly merged into a large collection. Based on the open data sets of the All-Sky Automated Survey for SuperNovae, Gaia, and the Zwicky Transient Facility, we construct a compatible light-curve data set named LEAVES for the automated recognition of variable stars, which can be used to train and test new classification algorithms. The data set contains 977,953 variable and 134,592 nonvariable light curves, with the supported variables divided into six superclasses and nine subclasses. We validate the compatibility of the data set through experiments and use it to train a hierarchical random forest classifier, which achieves a weighted average F1-score of 0.95 for seven-class classification and 0.93 for 10-class classification. The experimental results show that this classifier is more compatible than one built on a single band and a single survey, and that it has wider applicability while maintaining classification accuracy: it can be applied directly to different data types with only a relatively small loss in performance compared to a dedicated model.
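The description reports a weighted average F1-score. As a minimal sketch of how such a figure is commonly computed (per-class F1 averaged with weights proportional to each class's number of true samples), the following pure-Python function may help. The function name `weighted_f1` and the class labels in the usage line are hypothetical and are not taken from the LEAVES paper, and the authors' exact evaluation procedure is not given in this record.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 scores (illustrative sketch)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    support = Counter(y_true)      # true samples per class
    predicted = Counter(y_pred)    # predictions per class
    true_pos = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    total = 0.0
    for label, n_true in support.items():
        # F1 = 2*TP / (2*TP + FP + FN) = 2*TP / (n_true + n_predicted)
        f1 = 2 * true_pos[label] / (n_true + predicted[label])
        total += n_true * f1       # weight each class by its support
    return total / len(y_true)


# Hypothetical labels, for illustration only.
y_true = ["RR", "RR", "EB", "EB", "EB", "NonVar"]
y_pred = ["RR", "EB", "EB", "EB", "EB", "NonVar"]
score = weighted_f1(y_true, y_pred)
```

Because the weights follow class support, populous classes dominate the average, which matters for an imbalanced collection such as one with far more variable than nonvariable light curves.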
format Article
id doaj-art-e85d18e06f52440fa39ed133dc3b11b6
institution OA Journals
issn 0067-0049
language English
publishDate 2024-01-01
publisher IOP Publishing
record_format Article
series The Astrophysical Journal Supplement Series
spelling doaj-art-e85d18e06f52440fa39ed133dc3b11b6 2025-08-20T02:19:07Z eng
IOP Publishing, The Astrophysical Journal Supplement Series, ISSN 0067-0049, 2024-01-01, volume 275, issue 1, article 10
10.3847/1538-4365/ad785b
LEAVES: An Expandable Light-curve Data Set for Automatic Classification of Variable Stars
Ya Fei https://orcid.org/0009-0009-6603-416X
Ce Yu https://orcid.org/0000-0003-2416-4547
Kun Li https://orcid.org/0000-0003-0324-0344
Xiaodian Chen https://orcid.org/0000-0001-7084-0484
Yajie Zhang https://orcid.org/0000-0003-2976-8198
Chenzhou Cui https://orcid.org/0000-0002-7456-1826
Jian Xiao https://orcid.org/0000-0003-0978-1280
Yunfei Xu https://orcid.org/0000-0002-7397-811X
Yihan Tao https://orcid.org/0000-0002-3143-9337
Affiliations:
Ya Fei, Ce Yu, Kun Li, Yajie Zhang, Jian Xiao: College of Intelligence and Computing, Tianjin University, No. 135 Yaguan Road, Haihe Education Park, Tianjin 300350, People's Republic of China; yuce@tju.edu.cn; Technical R&D Innovation Center, National Astronomical Data Center, No. 135 Yaguan Road, Haihe Education Park, Tianjin 300350, People's Republic of China
Xiaodian Chen: National Astronomical Observatories, Chinese Academy of Sciences, No. 20 Datun Road, Chaoyang District, Beijing 100012, People's Republic of China
Chenzhou Cui, Yunfei Xu, Yihan Tao: Technical R&D Innovation Center, National Astronomical Data Center, No. 135 Yaguan Road, Haihe Education Park, Tianjin 300350, People's Republic of China; National Astronomical Observatories, Chinese Academy of Sciences, No. 20 Datun Road, Chaoyang District, Beijing 100012, People's Republic of China
https://doi.org/10.3847/1538-4365/ad785b
Variable stars
Light curve classification
Astronomy databases
title LEAVES: An Expandable Light-curve Data Set for Automatic Classification of Variable Stars
topic Variable stars
Light curve classification
Astronomy databases
url https://doi.org/10.3847/1538-4365/ad785b