Quantifying the Impact of Multiple Factors on Air Quality Model Simulation Biases Using Machine Learning

Accurate air pollutant prediction is essential for addressing environmental and public health concerns. Air quality models like WRF-CMAQ provide simulations, but often show significant errors compared to observed concentrations. To identify the sources of these model biases, we applied the XGBoost m...

Bibliographic Details
Main Authors: Chunying Fan, Ruilin Wang, Ge Song, Mengfan Teng, Maolin Zhang, Huangchuan Liu, Zhujun Li, Siwei Li, Jia Xing
Format: Article
Language: English
Published: MDPI AG 2024-11-01
Series: Atmosphere
Subjects: air quality; simulation; bias; machine learning; prediction
Online Access: https://www.mdpi.com/2073-4433/15/11/1337
author Chunying Fan
Ruilin Wang
Ge Song
Mengfan Teng
Maolin Zhang
Huangchuan Liu
Zhujun Li
Siwei Li
Jia Xing
author_sort Chunying Fan
collection DOAJ
description Accurate air pollutant prediction is essential for addressing environmental and public health concerns. Air quality models like WRF-CMAQ provide simulations, but often show significant errors compared to observed concentrations. To identify the sources of these model biases, we applied the XGBoost machine learning algorithm to assess the performance of WRF-CMAQ in predicting air pollutants across two regions in China. XGBoost models trained with observations achieved high accuracy (R > 0.95), indicating that the selected features effectively capture pollutant variations. When trained on WRF-CMAQ inputs, XGBoost still improved performance but revealed biases linked to both model inputs (10–60%) and mechanisms (1–30%). Analysis identified previous-hour pollutant levels as the largest bias contributor, followed by meteorological variables. The study highlights the need for improving both model inputs and mechanisms to enhance future air quality predictions and support pollution control strategies.
format Article
id doaj-art-0f1c7c94c4cb46b9b1167c1f36480c0b
institution Kabale University
issn 2073-4433
language English
publishDate 2024-11-01
publisher MDPI AG
record_format Article
series Atmosphere
spelling doaj-art-0f1c7c94c4cb46b9b1167c1f36480c0b
Record updated: 2024-11-26T17:50:25Z
Language code: eng
Publisher: MDPI AG
Journal: Atmosphere
ISSN: 2073-4433
Publication date: 2024-11-01
Volume 15, Issue 11, Article 1337
DOI: 10.3390/atmos15111337
Title: Quantifying the Impact of Multiple Factors on Air Quality Model Simulation Biases Using Machine Learning
Authors (with record index): Chunying Fan (0), Ruilin Wang (1), Ge Song (2), Mengfan Teng (3), Maolin Zhang (4), Huangchuan Liu (5), Zhujun Li (6), Siwei Li (7), Jia Xing (8)
Affiliations, in the order listed in the record:
0: Hubei Key Laboratory of Quantitative Remote Sensing of Land and Atmosphere, School of Remote Sensing and Information Engineering, Wuhan University, Wuhan 430079, China
1: Institute of Software, Chinese Academy of Sciences, Beijing 100864, China
2-7: Hubei Key Laboratory of Quantitative Remote Sensing of Land and Atmosphere, School of Remote Sensing and Information Engineering, Wuhan University, Wuhan 430079, China
8: Department of Civil and Environmental Engineering, The University of Tennessee, Knoxville, TN 37996, USA
title Quantifying the Impact of Multiple Factors on Air Quality Model Simulation Biases Using Machine Learning
topic air quality
simulation
bias
machine learning
prediction
url https://www.mdpi.com/2073-4433/15/11/1337