An Adaptive Graph Convolutional Network with Spatial Autocorrelation for Enhancing 3D Soil Pollutant Mapping Precision from Sparse Borehole Data

Bibliographic Details
Main Authors: Huan Tao, Ziyang Li, Shengdong Nie, Hengkai Li, Dan Zhao
Format: Article
Language: English
Published: MDPI AG 2025-06-01
Series: Land
Online Access: https://www.mdpi.com/2073-445X/14/7/1348
Description
Summary: Borehole sampling at contaminated sites yields sparse and unevenly distributed data on soil pollutants. Traditional interpolation methods may obscure local variations in soil contamination when applied to such data, reducing interpolation accuracy. We propose an adaptive graph convolutional network with spatial autocorrelation (ASI-GCN) to overcome this challenge. The ASI-GCN model constrains pollutant concentration transfer while capturing subtle spatial variations, improving the accuracy of soil pollution characterization. We tested the model at a coking plant using 215 soil samples from 15 boreholes, evaluating its robustness with three pollutants of varying volatility: arsenic (As, non-volatile), benzo(a)pyrene (BaP, semi-volatile), and benzene (Ben, volatile). Leave-one-out cross-validation shows that the ASI-GCN_RC_G model (ASI-GCN with residual connections) achieves the highest prediction accuracy. Specifically, the R values for As, BaP, and Ben are 0.728, 0.825, and 0.781, respectively, outperforming traditional models by 58.8% (vs. IDW), 45.82% (vs. OK), and 53.78% (vs. IDW). The corresponding RMSE values drop by 36.56% (vs. Bayesian_K), 38.02% (vs. Bayesian_K), and 35.96% (vs. IDW), further confirming the model’s superior precision. Beyond accuracy, Monte Carlo uncertainty analysis reveals that most predicted areas exhibit low uncertainty, with only a few high-pollution hotspots showing relatively high uncertainty. Further analysis revealed a significant influence of pollutant volatility on vertical migration patterns: non-volatile As was primarily distributed in the fill and silty sand layers, semi-volatile BaP was concentrated in the silty sand layer, and volatile Ben was predominantly found in the clay and fine sand layers.
By integrating spatial autocorrelation with deep graph representation, ASI-GCN offers a tool for precise 3D mapping from sparse data, in support of environmental governance and human health assessment.
ISSN: 2073-445X
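
The summary reports leave-one-out cross-validation scored by R and RMSE, with inverse distance weighting (IDW) as one of the baselines. The following is a minimal sketch of how such an evaluation could be run for an IDW baseline in 3D. It is not the authors' code: the function names, the power parameter of 2, the choice to leave out one sample (rather than one borehole) at a time, the reading of R as the Pearson correlation coefficient, and the synthetic coordinates and concentrations are all assumptions made for illustration.

```python
import numpy as np


def idw_predict(xyz_known, values_known, xyz_query, power=2.0, eps=1e-12):
    """Inverse distance weighting estimate at one 3D query point."""
    dist = np.linalg.norm(xyz_known - xyz_query, axis=1)
    # If the query coincides with a known sample, return that sample's value.
    if np.any(dist < eps):
        return float(values_known[np.argmin(dist)])
    weights = 1.0 / dist**power
    return float(np.sum(weights * values_known) / np.sum(weights))


def loocv_idw(xyz, values, power=2.0):
    """Predict each sample from all the others (leave-one-sample-out)."""
    n = len(values)
    preds = np.empty(n)
    mask = np.ones(n, dtype=bool)
    for i in range(n):
        mask[i] = False  # hold out sample i
        preds[i] = idw_predict(xyz[mask], values[mask], xyz[i], power)
        mask[i] = True   # restore it for the next fold
    return preds


def rmse(obs, pred):
    """Root mean square error between observations and predictions."""
    return float(np.sqrt(np.mean((obs - pred) ** 2)))


def pearson_r(obs, pred):
    """Pearson correlation coefficient between observations and predictions."""
    return float(np.corrcoef(obs, pred)[0, 1])


# Synthetic stand-in data (NOT the study's data): 15 hypothetical boreholes,
# each sampled at 5 depths, with a smooth trend plus noise as "concentration".
rng = np.random.default_rng(0)
xy = rng.uniform(0.0, 100.0, size=(15, 2))
depths = np.linspace(0.5, 8.5, 5)
xyz = np.array([[x, y, -d] for x, y in xy for d in depths])
values = 10.0 + 0.05 * xyz[:, 0] + 0.5 * xyz[:, 2] + rng.normal(0.0, 0.5, len(xyz))

preds = loocv_idw(xyz, values)
print("IDW LOOCV R:", round(pearson_r(values, preds), 3))
print("IDW LOOCV RMSE:", round(rmse(values, preds), 3))
```

The same loop could score any other interpolator by swapping out `idw_predict`, which is how a comparison across models on identical held-out samples would typically be set up.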