Positive matrix factorization of large real-time atmospheric mass spectrometry datasets using error-weighted randomized hierarchical alternating least squares


Bibliographic Details
Main Authors: B. C. Sapper, S. Youn, D. K. Henze, M. Canagaratna, H. Stark, J. L. Jimenez
Format: Article
Language: English
Published: Copernicus Publications 2025-05-01
Series: Geoscientific Model Development
Online Access: https://gmd.copernicus.org/articles/18/2891/2025/gmd-18-2891-2025.pdf
Description
Summary: Weighted positive matrix factorization (PMF) has been used by scientists to find small sets of underlying factors in environmental data. However, as datasets have grown, increasing computational costs have made traditional methods for this factorization impractical. In this paper, we present a new external weighting method that dramatically decreases computational costs for these traditional algorithms. The external weighting scheme, along with the randomized hierarchical alternating least squares (RHALS) algorithm, was applied to the Southern Oxidant and Aerosol Study (SOAS 2013) dataset of gaseous highly oxidized multifunctional molecules (HOMs). The modified RHALS algorithm successfully reproduced six previously identified interpretable factors, with the total computation time of the nonoptimized code showing potential improvements of 1 to 2 orders of magnitude compared to competing algorithms. We also investigate rotational ambiguity in the solution and present a simple “pulling” method to rotate a set of factors. This method is shown to find alternative solutions and, in some cases, to lower the weighted residual error of the algorithm.
ISSN: 1991-959X, 1991-9603