Dusty Stellar Source Classification by Implementing Machine Learning Methods Based on Spectroscopic Observations in the Magellanic Clouds

Bibliographic Details
Main Authors: Sepideh Ghaziasgar, Mahdi Abdollahi, Atefeh Javadi, Jacco Th. van Loon, Iain McDonald, Joana Oliveira, Amirhossein Masoudnezhad, Habib G. Khosroshahi, Bernard H. Foing, Fatemeh Fazel Hesar
Format: Article
Language: English
Published: IOP Publishing 2025-01-01
Series: The Astrophysical Journal
Online Access: https://doi.org/10.3847/1538-4357/adceeb
Description
Summary: Dusty stellar point sources mark a significant stage in stellar evolution and contribute to the metal enrichment of galaxies. These objects can be classified from photometric and spectroscopic observations, using color–magnitude diagrams and infrared excesses in spectral energy distributions. We employed supervised machine learning to classify dusty stellar point sources in the Large and Small Magellanic Clouds, including young stellar objects (YSOs) and evolved stars: oxygen- and carbon-rich asymptotic giant branch stars (OAGBs and CAGBs), red supergiants (RSGs), and post-AGB stars (PAGBs). The training labels come from spectroscopically classified data from the Surveying the Agents of Galaxy Evolution (SAGE) project, which involved 12 multiwavelength filters and 618 stellar objects. Despite missing values and uncertainties in the SAGE spectral data sets, we achieved accurate classifications of these sources. To address the challenge of small and imbalanced spectral catalogs, we used the Synthetic Minority Oversampling Technique (SMOTE), which generates synthetic data points. Among all the models applied before and after data augmentation, the probabilistic random forest (PRF) classifier, a tuned random forest, achieved the highest total accuracy, reaching 89% based on the recall metric. In this study, SMOTE does not improve the accuracy of the best model for the CAGB, PAGB, and RSG classes, which stays at 100%, 100%, and 88%, respectively; the OAGB and YSO classes, however, show variations. We then collected photometrically labeled data with properties similar to the training data set and classified them using the top four PRF models with an accuracy of more than 87%. We also collected multiwavelength data from several studies and classified them using our consensus model, which integrates the four top models to present common labels as the final prediction.
ISSN: 1538-4357
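
The consensus step described in the summary (four top models, with common labels taken as the final prediction) can be illustrated with a minimal sketch. This is not the authors' code. It assumes that "common labels" means all four models agree on a source, and that sources without full agreement are left unlabeled. The function names and the example predictions are hypothetical.

```python
from typing import List, Optional, Sequence


def consensus_label(predictions: Sequence[str]) -> Optional[str]:
    """Return the label shared by every model, or None if they disagree.

    Assumption: "consensus" is read here as unanimous agreement.
    """
    if not predictions:
        return None
    first = predictions[0]
    return first if all(p == first for p in predictions) else None


def consensus_catalog(per_model: Sequence[Sequence[str]]) -> List[Optional[str]]:
    """Combine per-model label lists (one label per source) source by source."""
    return [consensus_label(labels) for labels in zip(*per_model)]


# Hypothetical predictions from four models for three sources.
model_predictions = [
    ["YSO", "CAGB", "RSG"],
    ["YSO", "CAGB", "OAGB"],
    ["YSO", "CAGB", "RSG"],
    ["YSO", "CAGB", "RSG"],
]

final_labels = consensus_catalog(model_predictions)
print(final_labels)  # ['YSO', 'CAGB', None]
```

Under this reading, the third source receives no final label because one model disagrees. A majority-vote rule would be a different design choice and would label it RSG.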