Prediction of Monthly Temperature Over China Based on a Machine Learning Method
| Main Authors: | , , , , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Wiley 2025-01-01 |
| Series: | Advances in Meteorology |
| Online Access: | http://dx.doi.org/10.1155/adme/6917682 |
| Summary: | Machine learning has achieved significant success in many statistical applications but has yet to be fully successful in monthly and seasonal prediction. We identify three statistical challenges in climate prediction: instability of statistical models, complexity of feature factors, and nonlinearity of the relationship between predictors and predictands. These characteristics limit both traditional empirical forecasting and machine learning methods. This paper proposes a novel method called dynamically modeled machine learning to predict monthly temperature anomalies over China. The core idea of dynamic modeling is that the machine learning model is trained on a sliding time window, so that the relationship between predictors and predictands is optimized for a specific, recent period rather than for the entire time span. One hundred thirty indices related to atmospheric and oceanic circulation and other climatic events, obtained from the Beijing Climate Center, are used as the feature set. After feature engineering, including feature selection and dimensionality reduction, the predictors are generated and input into a regressor. Five machine learning algorithms are employed in turn as regressors: linear regression (LR), ridge regression (RR), random forest (RF), support vector machine (SVM), and gradient boosting decision trees (GBDTs). The method performs reforecasting for 2012–2021, and the results are compared with the output of operational climate models from ECMWF, NCEP, and the Beijing Climate Center. Three quantitative evaluation metrics, namely predictive score (PS), anomaly correlation coefficient (ACC), and anomaly sign agreement rate, are used to assess the prediction performance of each machine learning regressor, the ensemble model, and the three operational dynamical models. The results show that the method using GBDTs as the regressor outperforms the other methods and the operational models, with a monthly average PS of 84, an ACC of 0.27, and an anomaly sign agreement rate of 74%. |
| ISSN: | 1687-9317 |
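
The sliding-window training idea described in the summary can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a single predictor and ordinary least squares in place of the paper's feature engineering and five regressors, and all function names (`fit_line`, `sliding_window_forecast`, `sign_agreement_rate`, `anomaly_correlation`) are hypothetical. The anomaly correlation shown is the uncentered form, which may differ from the definition used in the paper. The predictive score (PS) is omitted because this record does not give its formula.

```python
from math import sqrt

def fit_line(xs, ys):
    """Ordinary least squares for one predictor: y is approximated by a + b * x."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx if sxx else 0.0
    return my - b * mx, b

def sliding_window_forecast(xs, ys, window):
    """For each step t >= window, refit on the previous `window` samples only,
    then predict ys[t] from xs[t]. Older samples never enter the fit."""
    preds = []
    for t in range(window, len(xs)):
        a, b = fit_line(xs[t - window:t], ys[t - window:t])
        preds.append(a + b * xs[t])
    return preds

def sign_agreement_rate(pred, obs):
    """Fraction of cases where predicted and observed anomalies share a sign."""
    hits = sum(1 for p, o in zip(pred, obs) if p * o > 0)
    return hits / len(obs)

def anomaly_correlation(pred, obs):
    """Uncentered correlation between predicted and observed anomalies
    (an assumed definition, chosen for illustration)."""
    num = sum(p * o for p, o in zip(pred, obs))
    den = sqrt(sum(p * p for p in pred) * sum(o * o for o in obs))
    return num / den if den else 0.0

# Toy anomalies (made up): the predictor-predictand relationship flips sign at index 6.
xs = [1, -1, 2, -2, 1, -1, 2, -2, 1, -1, 2, -2]
ys = [2 * x for x in xs[:6]] + [-2 * x for x in xs[6:]]

window = 4
preds = sliding_window_forecast(xs, ys, window)
obs = ys[window:]
rate = sign_agreement_rate(preds, obs)
acc = anomaly_correlation(preds, obs)
```

In this toy series, once the window contains only samples from after the flip, the refitted slope follows the new relationship. A model fitted once over the whole span would instead average the two regimes, which is the instability the sliding window is meant to address.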