Identification of the primary pollution sources and dominant influencing factors of soil heavy metals using a random forest model optimized by genetic algorithm coupled with geodetector
Identifying and quantifying the dominant factors influencing heavy metal (HM) pollution sources are essential for maintaining soil ecological health and implementing effective pollution control measures. This study analyzed soil HM samples from 53 different land use types in Jiaozuo City, Henan Prov...
Main Authors: | Tong Liu, Mingshi Wang, Mingya Wang, Qinqing Xiong, Luhao Jia, Wanqi Ma, Shaobo Sui, Wei Wu, Xiaoming Guo |
---|---|
Format: | Article |
Language: | English |
Published: | Elsevier, 2025-01-01 |
Series: | Ecotoxicology and Environmental Safety |
Subjects: | APCS-MLR, GA-RF, Geodetector, Heavy metals, Soil |
Online Access: | http://www.sciencedirect.com/science/article/pii/S0147651325000673 |
_version_ | 1823856941944799232 |
---|---|
author | Tong Liu, Mingshi Wang, Mingya Wang, Qinqing Xiong, Luhao Jia, Wanqi Ma, Shaobo Sui, Wei Wu, Xiaoming Guo |
author_sort | Tong Liu |
collection | DOAJ |
description | Identifying and quantifying the dominant factors influencing heavy metal (HM) pollution sources are essential for maintaining soil ecological health and implementing effective pollution control measures. This study analyzed soil HM samples from 53 different land use types in Jiaozuo City, Henan Province, China. Pollution sources were identified using Absolute Principal Component Score (APCS), with 8 anthropogenic factors, 9 natural factors, and 4 soil physicochemical properties mapped using Geographic Information System (GIS) kernel density estimation. Geodetector and a genetic-algorithm-optimized random forest model (GA-RF) were employed to quantify the dominant factors and precisely identify pollution sources. A Monte Carlo model was further applied to assess source-oriented health risk probabilities across age groups in the study area. The results revealed three principal components representing pollution sources, with contribution rates of 47.2 %, 33.3 %, and 19.5 %, respectively. For pollution source 1, industrial activities were dominant, with factory density (27.7 %) and distance from the factory (36.3 %) identified as the main factors. Cr, Cu, Mn, and Ni had high loads in this source. Pollution source 2, a combination of natural and transportation influences, was primarily affected by the normalized difference vegetation index (NDVI, 37.8 %), road network density (16.8 %), and proximity to roads (15.3 %). Pollution source 3 was linked to agricultural activities, with cultivated land density (CLD) contributing 39.1 %. Arsenic (As) exhibited a high load (91.1 %) in this source, with an exceedance rate of 93 % in cultivated soil, a moderate enrichment factor of 2.33, and a strong ecological risk index of 615.72, making it the most polluted metal in the area. The source-oriented Health Risk Assessment (HRA) showed that agricultural activities contributed 88.7 % to the carcinogenic risk from As in cultivated land. Overall, 99.3 % of the population faced an acceptable cancer risk level. Unlike traditional source apportionment methods, the GA-RF model effectively quantified the contributions of specific influencing factors (e.g., factory density) to pollution sources, rather than merely estimating the percentage contributions of the sources themselves. This approach provides a novel perspective for HM source apportionment under complex environmental conditions. |
format | Article |
id | doaj-art-dd4edc0e059e49febb4b0e154e52e15e |
institution | Kabale University |
issn | 0147-6513 |
language | English |
publishDate | 2025-01-01 |
publisher | Elsevier |
record_format | Article |
series | Ecotoxicology and Environmental Safety |
spelling | doaj-art-dd4edc0e059e49febb4b0e154e52e15e; 2025-02-12T05:30:07Z; eng; Elsevier; Ecotoxicology and Environmental Safety; 0147-6513; 2025-01-01; 290; 117731; Tong Liu (College of Resource and Environment, Henan Polytechnic University, Jiaozuo 454003, China); Mingshi Wang (College of Resource and Environment, Henan Polytechnic University, Jiaozuo 454003, China; corresponding author); Mingya Wang (College of Resource and Environment, Henan Polytechnic University, Jiaozuo 454003, China); Qinqing Xiong (College of Atmospheric Physics, Nanjing University of Information Science and Technology, Nanjing 210044, China); Luhao Jia (College of Resource and Environment, Henan Polytechnic University, Jiaozuo 454003, China); Wanqi Ma (College of Resource and Environment, Henan Polytechnic University, Jiaozuo 454003, China); Shaobo Sui (College of Resource and Environment, Henan Polytechnic University, Jiaozuo 454003, China); Wei Wu (College of Resource and Environment, Henan Polytechnic University, Jiaozuo 454003, China); Xiaoming Guo (College of Resource and Environment, Henan Polytechnic University, Jiaozuo 454003, China); http://www.sciencedirect.com/science/article/pii/S0147651325000673; APCS-MLR; GA-RF; Geodetector; Heavy metals; Soil |
title | Identification of the primary pollution sources and dominant influencing factors of soil heavy metals using a random forest model optimized by genetic algorithm coupled with geodetector |
topic | APCS-MLR, GA-RF, Geodetector, Heavy metals, Soil |
url | http://www.sciencedirect.com/science/article/pii/S0147651325000673 |