AstroM3: A Self-supervised Multimodal Model for Astronomy

Bibliographic Details
Main Authors: M. Rizhko, J. S. Bloom
Format: Article
Language:English
Published: IOP Publishing 2025-01-01
Series:The Astronomical Journal
Subjects: Astrostatistics techniques; Variable stars
Online Access:https://doi.org/10.3847/1538-3881/adcbad
author M. Rizhko
J. S. Bloom
author_facet M. Rizhko
J. S. Bloom
author_sort M. Rizhko
collection DOAJ
description While machine-learned models are now routinely employed to facilitate astronomical inquiry, model inputs tend to be limited to a primary data source (namely images or time series) and, in the more advanced approaches, some metadata. Yet with the growing use of wide-field, multiplexed observational resources, individual sources of interest often have a broad range of observational modes available. Here we construct an astronomical multimodal dataset and propose AstroM^3, a self-supervised pretraining approach that enables a model to learn from multiple modalities simultaneously. We extend the Contrastive Language-Image Pretraining (CLIP) model to a trimodal setting, allowing the integration of time-series photometry data, spectra, and astrophysical metadata. In a fine-tuning supervised setting, CLIP pretraining improves classification accuracy, particularly when labeled data is limited, with increases of up to 14.29% in spectra classification, 2.27% in metadata, and 10.20% in photometry. Furthermore, we show that combining photometry, spectra, and metadata improves classification accuracy over single-modality models. In addition to fine-tuned classification, we can use the trained model in other downstream tasks that are not explicitly contemplated during the construction of the self-supervised model. In particular we show the efficacy of using the learned embeddings to identify misclassifications, for similarity search, and for anomaly detection. One surprising highlight is the “rediscovery” of Mira subtypes and two rotational variable subclasses using manifold learning and dimensionality reduction algorithms. To our knowledge this is the first construction of an n > 2 mode model in astronomy. Extensions to n > 3 modes are naturally anticipated with this approach.
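
The description above states that CLIP is extended to a trimodal setting over photometry, spectra, and metadata. The following is a minimal illustrative sketch only, not the authors' implementation: it shows one common way such an objective can be built, by applying the symmetric CLIP (InfoNCE) loss to each of the three modality pairs and averaging. The encoder architecture, input dimensions, embedding size, and temperature below are all placeholder assumptions.

# Illustrative sketch of a trimodal CLIP-style contrastive objective (PyTorch).
# All names and dimensions here are hypothetical placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyEncoder(nn.Module):
    """Stand-in encoder: a small MLP projecting one modality into a shared space."""
    def __init__(self, in_dim: int, embed_dim: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 256), nn.ReLU(), nn.Linear(256, embed_dim)
        )

    def forward(self, x):
        # Unit-normalize so dot products are cosine similarities.
        return F.normalize(self.net(x), dim=-1)

def pairwise_clip_loss(za, zb, temperature: float = 0.07):
    """Symmetric InfoNCE loss between two batches of aligned embeddings."""
    logits = za @ zb.t() / temperature               # (B, B) similarity matrix
    targets = torch.arange(za.size(0), device=za.device)  # matched pairs lie on the diagonal
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))

def trimodal_loss(z_phot, z_spec, z_meta):
    """Average the pairwise CLIP loss over the three modality pairs."""
    return (pairwise_clip_loss(z_phot, z_spec) +
            pairwise_clip_loss(z_phot, z_meta) +
            pairwise_clip_loss(z_spec, z_meta)) / 3.0

# Usage with random stand-in data (batch of 8; feature sizes are arbitrary):
phot_enc, spec_enc, meta_enc = ToyEncoder(200), ToyEncoder(400), ToyEncoder(34)
x_phot, x_spec, x_meta = torch.randn(8, 200), torch.randn(8, 400), torch.randn(8, 34)
loss = trimodal_loss(phot_enc(x_phot), spec_enc(x_spec), meta_enc(x_meta))
loss.backward()

After pretraining with such an objective, the per-modality encoders yield embeddings in a shared space, which is what enables the downstream uses the abstract lists: fine-tuned classification, misclassification identification, similarity search, and anomaly detection.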
format Article
id doaj-art-0ea38591e1a7461f8574f06db5541434
institution OA Journals
issn 1538-3881
language English
publishDate 2025-01-01
publisher IOP Publishing
record_format Article
series The Astronomical Journal
spelling doaj-art-0ea38591e1a7461f8574f06db5541434; 2025-08-20T02:34:28Z; eng; IOP Publishing; The Astronomical Journal; ISSN 1538-3881; 2025-01-01; 170(1), 28; 10.3847/1538-3881/adcbad
AstroM3: A Self-supervised Multimodal Model for Astronomy
M. Rizhko (https://orcid.org/0000-0003-3885-4661), University of California, Berkeley, Department of Astronomy, Berkeley, CA, USA
J. S. Bloom (https://orcid.org/0000-0002-7777-216X), University of California, Berkeley, Department of Astronomy, Berkeley, CA, USA; Lawrence Berkeley National Laboratory, Berkeley, CA, USA
(Abstract text identical to the description field above.)
https://doi.org/10.3847/1538-3881/adcbad
Astrostatistics techniques; Variable stars
spellingShingle M. Rizhko
J. S. Bloom
AstroM3: A Self-supervised Multimodal Model for Astronomy
The Astronomical Journal
Astrostatistics techniques
Variable stars
title AstroM3: A Self-supervised Multimodal Model for Astronomy
title_full AstroM3: A Self-supervised Multimodal Model for Astronomy
title_fullStr AstroM3: A Self-supervised Multimodal Model for Astronomy
title_full_unstemmed AstroM3: A Self-supervised Multimodal Model for Astronomy
title_short AstroM3: A Self-supervised Multimodal Model for Astronomy
title_sort astrom3 a self supervised multimodal model for astronomy
topic Astrostatistics techniques
Variable stars
url https://doi.org/10.3847/1538-3881/adcbad
work_keys_str_mv AT mrizhko astrom3aselfsupervisedmultimodalmodelforastronomy
AT jsbloom astrom3aselfsupervisedmultimodalmodelforastronomy