Optimizing Air Pollution Forecasting Across Temporal Scales: A Case Study in Salamanca, Mexico

Bibliographic Details
Main Authors: Francisco-Javier Moreno-Vazquez, Felipe Trujillo-Romero, Amanda Enriqueta Violante Gavira
Format: Article
Language: English
Published: MDPI AG 2025-02-01
Series: Earth
Online Access: https://www.mdpi.com/2673-4834/6/1/9
Description
Summary: Air pollution forecasting is essential for understanding environmental patterns and mitigating health risks, especially in urban areas. This study investigates the forecasting of criterion pollutants ($\mathrm{CO}$, $\mathrm{O_3}$, $\mathrm{SO_2}$, $\mathrm{NO_2}$, $\mathrm{PM_{2.5}}$, and $\mathrm{PM_{10}}$) across multiple temporal frames (hourly, daily, weekly, monthly) in Salamanca, Mexico, utilizing temporal, meteorological, and pollutant data from local monitoring stations. The primary objective is to identify robust models capable of short- and mid-term predictions, despite challenges related to data inconsistencies and missing values. Leveraging the low-code PyCaret framework, a benchmark analysis was conducted to identify the best-performing models for each pollutant. Statistical evaluations, including ANOVA and Tukey HSD tests, were employed to compare model performance across different time frames. The results reveal significant variations in prediction accuracy depending on both the pollutant and temporal windows, with stronger predictive performance observed in the weekly and monthly frames. The research indicates that the incorporation of temporal and environmental variables enhances forecast accuracy and highlights the value of low-code AutoML tools, such as PyCaret, in streamlining model selection and improving overall forecasting efficiency.
ISSN: 2673-4834
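
The summary mentions ANOVA as one of the tests used to compare model performance across time frames. As a rough illustration only, the sketch below computes a one-way ANOVA F statistic in plain Python. The function name `one_way_anova_f`, the choice of score metric, and every number in `scores_by_frame` are hypothetical placeholders, not values or code from the article, and the article's actual procedure (including the Tukey HSD follow-up) may differ.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of score groups.

    Each group is a list of numeric scores (for example, one list of
    model scores per temporal frame). Illustrative helper, not taken
    from the article.
    """
    k = len(groups)                               # number of groups
    n_total = sum(len(g) for g in groups)         # total observations
    grand_mean = mean(x for g in groups for x in g)

    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


# Hypothetical scores per temporal frame (made-up numbers, for shape only).
scores_by_frame = {
    "hourly": [0.61, 0.58, 0.64],
    "daily": [0.66, 0.69, 0.63],
    "weekly": [0.74, 0.78, 0.71],
    "monthly": [0.80, 0.76, 0.83],
}

f_stat = one_way_anova_f(list(scores_by_frame.values()))
print(f"F statistic across frames: {f_stat:.2f}")
```

A large F value suggests that at least one frame's mean score differs from the others; a post-hoc test such as Tukey HSD would then be needed to say which pairs of frames differ.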