A general methodological framework for predicting and assessing heavy metal pollution in paddy soils using machine learning models

Bibliographic Details
Main Authors: Unurnyam Jugnee, Le Jiao, Sainbayar Dalantai, Lili Huo, Yi An, Bayartungalag Batsaikhan, Undrakhtsetseg Tsogtbaatar, Munguntuul Ulziibaatar, Boldbaatar Natsagdorj
Format: Article
Language: English
Published: Elsevier 2025-02-01
Series: Heliyon
Online Access: http://www.sciencedirect.com/science/article/pii/S2405844025009995
Description
Summary: Heavy metal contamination in paddies poses a serious threat to ecological and human health. Current research on heavy metal pollution focuses mainly on source apportionment, while robust and accurate predictions of its spatial distribution and driving mechanisms are still lacking. Herein, we developed a general methodological framework to predict and assess heavy metal pollution of paddies in Hunan province, China, by employing Random Forest (RF), Extra Trees Regressor (ETR), Extreme Gradient Boost Regression (XGBR), and Gradient Boosting Regression Tree (GBRT). Results demonstrated that RF performed best in predicting As (R² = 0.706), Cr (R² = 0.746), Cu (R² = 0.705), and Hg (R² = 0.73), while ETR showed good performance in predicting Cd (R² = 0.521), Zn (R² = 0.404), and Pb (R² = 0.625). GBRT performed well in predicting Ni (R² = 0.61). Shapley additive explanations (SHAP) suggested significant differences in the driving factors and their contributions to the prediction models for each heavy metal. Climate variables were potentially valuable predictors of heavy metal content. The visualized spatial distribution of the Pollution Load Index showed that 79.8% of the study area was moderately polluted and the remaining 20.2% was severely polluted. Soil pollution worsened from the west to the east of the study area. These findings provide valuable information for effective soil pollution control and soil conservation efforts.
ISSN: 2405-8440
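
The summary reports results in terms of a Pollution Load Index but does not say how it was computed. As a minimal sketch, assuming the commonly used definition (the geometric mean of per-metal contamination factors, each being a measured concentration divided by a background value), the index for one sample could be calculated as below. The function name, metal list, concentrations, and background values are all hypothetical and are not taken from the article.

```python
import math


def pollution_load_index(concentrations, backgrounds):
    """Geometric mean of contamination factors (concentration / background).

    Both arguments are dicts keyed by metal symbol. This follows the
    commonly used PLI definition, which is assumed here; the article's
    exact procedure is not given in the record.
    """
    factors = [concentrations[m] / backgrounds[m] for m in concentrations]
    # Compute the geometric mean in log space for numerical stability.
    return math.exp(sum(math.log(f) for f in factors) / len(factors))


# Hypothetical sample and background values in mg/kg (illustration only).
sample = {"Cd": 0.6, "Pb": 40.0, "Zn": 200.0}
background = {"Cd": 0.3, "Pb": 20.0, "Zn": 100.0}

pli = pollution_load_index(sample, background)
print(f"PLI = {pli:.2f}")
```

Applying such a function to every predicted grid cell would give a map of the kind the summary describes; the class boundaries used to label cells as moderately or severely polluted are not stated in the record and are left out here.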