A statistical and machine learning approach for monthly precipitation forecasting in an Amazon city

Introduction: City-scale rainfall prediction is crucial for various essential services, such as transportation, supply chain logistics, and leisure activities, as well as for preventing risks associated with high volumes of rain. Belém is a city located in northern Brazil with distinct periods of prec...

Bibliographic Details
Main Authors: Ewerton Cristhian Lima de Oliveira, Eduardo Costa de Carvalho, Edmir dos Santos Jesus, Rafael de Lima Rocha, Helder Moreira Arruda, Ronnie Cley de Oliveira Alves, Renata Gonçalves Tedeschi
Format: Article
Language: English
Published: Frontiers Media S.A. 2025-05-01
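Series: Frontiers in Earth Science
Subjects: monthly precipitation; statistical learning; machine learning; variable correlation; Amazon region
Online Access: https://www.frontiersin.org/articles/10.3389/feart.2025.1589753/full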
author Ewerton Cristhian Lima de Oliveira
Eduardo Costa de Carvalho
Edmir dos Santos Jesus
Rafael de Lima Rocha
Helder Moreira Arruda
Ronnie Cley de Oliveira Alves
Renata Gonçalves Tedeschi
collection DOAJ
description Introduction: City-scale rainfall prediction is crucial for essential services such as transportation, supply chain logistics, and leisure activities, as well as for preventing risks associated with high volumes of rain. Belém is a city in northern Brazil with distinct periods of precipitation, including a rainy season that directly affects the city’s dynamics and the quality of life of its citizens, often resulting in flooding and infrastructure accidents in several city zones. Methods: Meteorological studies generally use large volumes of data; our study, however, uses a data source covering fewer years to predict precipitation. We use meteorological data from a set of sensors installed at a meteorological station in Belém to train multivariate statistical and machine learning (ML) models to predict precipitation. In addition to comparing algorithms, we evaluated feature composition using statistical methods to investigate the impact of each variable on the prediction. Results: The vector autoregressive moving average with exogenous regressors (VARMAX) model achieved the best performance in rainfall forecasting, with an average root mean square error (RMSE) of 9.1833 in time series cross-validation, outperforming the other models. Discussion: Climate-driven patterns directly influenced the performance of the rainfall forecasting models evaluated in this study. As noted above, VARMAX had the lowest average RMSE, obtained using lag-1 values of the exogenous variables. This is particularly noteworthy because the same configuration not only produced the lowest RMSE for forecasts in 2022 but also highlighted the importance of relative humidity and solar radiation in enhancing predictive accuracy, even in the presence of data anomalies in the solar radiation measurements.
format Article
id doaj-art-7fe2eb92da43493f96da1aec3a0e2dc6
institution OA Journals
issn 2296-6463
language English
publishDate 2025-05-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Earth Science
spelling doaj-art-7fe2eb92da43493f96da1aec3a0e2dc6; 2025-08-20T01:49:23Z; eng; Frontiers Media S.A.; Frontiers in Earth Science; ISSN 2296-6463; 2025-05-01; vol. 13; DOI 10.3389/feart.2025.1589753; article 1589753
A statistical and machine learning approach for monthly precipitation forecasting in an Amazon city
Ewerton Cristhian Lima de Oliveira (Instituto Tecnologico Vale Desenvolvimento Sustentável, Belém, Brazil)
Eduardo Costa de Carvalho (Instituto Tecnologico Vale Desenvolvimento Sustentável, Belém, Brazil)
Edmir dos Santos Jesus (Instituto Tecnologico Vale Desenvolvimento Sustentável, Belém, Brazil)
Rafael de Lima Rocha (Instituto Tecnologico Vale Desenvolvimento Sustentável, Belém, Brazil; Universidade Federal do Pará, Computer Science Graduate Program, Belém, Brazil)
Helder Moreira Arruda (Instituto Tecnologico Vale Desenvolvimento Sustentável, Belém, Brazil)
Ronnie Cley de Oliveira Alves (Instituto Tecnologico Vale Desenvolvimento Sustentável, Belém, Brazil; Universidade Federal do Pará, Computer Science Graduate Program, Belém, Brazil)
Renata Gonçalves Tedeschi (Instituto Tecnologico Vale Desenvolvimento Sustentável, Belém, Brazil)
title A statistical and machine learning approach for monthly precipitation forecasting in an Amazon city
topic monthly precipitation
statistical learning
machine learning
variable correlation
Amazon region
url https://www.frontiersin.org/articles/10.3389/feart.2025.1589753/full