High-Resolution Spatiotemporal Forecasting with Missing Observations Including an Application to Daily Particulate Matter 2.5 Concentrations in Jakarta Province, Indonesia

Accurate forecasting of high-resolution particulate matter 2.5 (PM<sub>2.5</sub>) levels is essential for the development of public health policy. However, datasets used for this purpose often contain missing observations. This study presents a two-stage approach to handle this problem....

Bibliographic Details
Main Authors: I Gede Nyoman Mindra Jaya, Henk Folmer
Format: Article
Language: English
Published: MDPI AG 2024-09-01
Series: Mathematics
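Subjects: multivariate spatial time series model; Gaussian Markov random field (GMRF); high-resolution forecasting; Bayesian statistics; integrated nested Laplace approximation (INLA); PM<sub>2.5</sub>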
Online Access: https://www.mdpi.com/2227-7390/12/18/2899
_version_ 1850260466200215552
author I Gede Nyoman Mindra Jaya
Henk Folmer
author_facet I Gede Nyoman Mindra Jaya
Henk Folmer
author_sort I Gede Nyoman Mindra Jaya
collection DOAJ
description Accurate forecasting of high-resolution particulate matter 2.5 (PM<sub>2.5</sub>) levels is essential for the development of public health policy. However, datasets used for this purpose often contain missing observations. This study presents a two-stage approach to handle this problem. The first stage is a multivariate spatial time series (MSTS) model, used to generate forecasts for the sampled spatial units and to impute missing observations. The MSTS model utilizes the similarities between the temporal patterns of the time series of the spatial units to impute the missing data across space. The second stage is the high-resolution prediction model, which generates predictions that cover the entire study domain. The second stage faces the big N problem, which gives rise to complex memory and computational problems. As a solution to the big N problem, we propose a Gaussian Markov random field (GMRF) for innovations with the Matérn covariance matrix obtained from the corresponding Gaussian field (GF) matrix by means of the stochastic partial differential equation (SPDE) method and the finite element method (FEM). For inference, we propose Bayesian statistics and integrated nested Laplace approximation (INLA) in the R-INLA package. The above approach is demonstrated using daily data collected from 13 PM<sub>2.5</sub> monitoring stations in Jakarta Province, Indonesia, for 1 January–31 December 2022. The first stage of the model generates PM<sub>2.5</sub> forecasts for the 13 monitoring stations for the period 1–31 January 2023, imputing missing data by means of the MSTS model. To capture temporal trends in the PM<sub>2.5</sub> concentrations, the model applies a first-order autoregressive process and a seasonal process. The second stage involves creating a high-resolution map for the period 1–31 January 2023, for sampled and non-sampled spatiotemporal units. It uses the MSTS-generated PM<sub>2.5</sub> predictions for the sampled spatiotemporal units and observations of the covariates altitude, population density, and rainfall for sampled and non-sampled spatiotemporal units. For the spatially correlated random effects, we apply a first-order random walk process. The validation of out-of-sample forecasts indicates a strong model fit with low mean squared error (0.001), mean absolute error (0.037), and mean absolute percentage error (0.041), and a high R² value (0.855). The analysis reveals that altitude and precipitation negatively impact PM<sub>2.5</sub> concentrations, while population density has a positive effect. Specifically, a one-meter increase in altitude is linked to a 7.8% decrease in PM<sub>2.5</sub>, while a one-person increase in population density leads to a 7.0% rise in PM<sub>2.5</sub>. Additionally, a one-millimeter increase in rainfall corresponds to a 3.9% decrease in PM<sub>2.5</sub>. The paper makes a valuable contribution to the field of forecasting high-resolution PM<sub>2.5</sub> levels, which is essential for providing detailed, accurate information for public health policy. The approach presents a new and innovative method for addressing the problem of missing data and high-resolution forecasting.
format Article
id doaj-art-67e77d9d1e4140548c0d06a4196fabbd
institution OA Journals
issn 2227-7390
language English
publishDate 2024-09-01
publisher MDPI AG
record_format Article
series Mathematics
spelling doaj-art-67e77d9d1e4140548c0d06a4196fabbd
2025-08-20T01:55:38Z
eng
MDPI AG
Mathematics
2227-7390
2024-09-01
12
18
2899
10.3390/math12182899
High-Resolution Spatiotemporal Forecasting with Missing Observations Including an Application to Daily Particulate Matter 2.5 Concentrations in Jakarta Province, Indonesia
I Gede Nyoman Mindra Jaya
Henk Folmer
Department of Statistics, Universitas Padjadjaran, Jl. Raya Bandung Sumedang km 21 Jatinangor, Sumedang 45363, Indonesia
Department of Statistics, Universitas Padjadjaran, Jl. Raya Bandung Sumedang km 21 Jatinangor, Sumedang 45363, Indonesia
https://www.mdpi.com/2227-7390/12/18/2899
multivariate spatial time series model
Gaussian Markov random field (GMRF)
high-resolution forecasting
Bayesian statistics
integrated nested Laplace approximation (INLA)
PM<sub>2.5</sub>
spellingShingle I Gede Nyoman Mindra Jaya
Henk Folmer
High-Resolution Spatiotemporal Forecasting with Missing Observations Including an Application to Daily Particulate Matter 2.5 Concentrations in Jakarta Province, Indonesia
Mathematics
multivariate spatial time series model
Gaussian Markov random field (GMRF)
high-resolution forecasting
Bayesian statistics
integrated nested Laplace approximation (INLA)
PM<sub>2.5</sub>
title High-Resolution Spatiotemporal Forecasting with Missing Observations Including an Application to Daily Particulate Matter 2.5 Concentrations in Jakarta Province, Indonesia
title_sort high resolution spatiotemporal forecasting with missing observations including an application to daily particulate matter 2 5 concentrations in jakarta province indonesia
topic multivariate spatial time series model
Gaussian Markov random field (GMRF)
high-resolution forecasting
Bayesian statistics
integrated nested Laplace approximation (INLA)
PM<sub>2.5</sub>
url https://www.mdpi.com/2227-7390/12/18/2899
work_keys_str_mv AT igedenyomanmindrajaya highresolutionspatiotemporalforecastingwithmissingobservationsincludinganapplicationtodailyparticulatematter25concentrationsinjakartaprovinceindonesia
AT henkfolmer highresolutionspatiotemporalforecastingwithmissingobservationsincludinganapplicationtodailyparticulatematter25concentrationsinjakartaprovinceindonesia