A Novel Strategy Coupling Optimised Sampling with Heterogeneous Ensemble Machine-Learning to Predict Landslide Susceptibility

Bibliographic Details
Main Authors: Yongxing Lu, Honggen Xu, Can Wang, Guanxi Yan, Zhitao Huo, Zuwu Peng, Bo Liu, Chong Xu
Format: Article
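Language: English
Published: MDPI AG 2024-10-01
Series: Remote Sensing
Subjects: landslide susceptibility prediction; optimised sampling; stacking ensemble machine-learning algorithm; reliability of non-landslide samples
Online Access: https://www.mdpi.com/2072-4292/16/19/3663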
_version_ 1850284328604401664
author Yongxing Lu
Honggen Xu
Can Wang
Guanxi Yan
Zhitao Huo
Zuwu Peng
Bo Liu
Chong Xu
collection DOAJ
description The accuracy of data-driven landslide susceptibility prediction depends heavily on the quality of non-landslide samples and on the choice of machine-learning algorithm. Current methods rely on human prior knowledge to draw negative samples quickly and at random from landslide-free regions or from outside landslide buffer zones, but they often ignore the reliability of those samples. This creates a serious risk that potential landslides are included as negatives, which introduces errors into the training data. Furthermore, different machine-learning models have distinct classification capabilities, and relying on a single model can readily lead to over-fitting and add uncertainty to the predictions. To address these problems, this research takes Chenxi County, a hilly and mountainous area in southern China, as an example and proposes a strategy coupling optimised sampling with heterogeneous ensemble machine learning to improve the accuracy of landslide susceptibility prediction. First, 21 landslide impact factors were derived from five aspects: geology, hydrology, topography, meteorology, human activities, and geographical environment. These factors were then screened through correlation analysis and collinearity diagnosis. Next, an optimised sampling (OS) method selected negative samples by fusing the reliability of non-landslide samples with certainty factor values, on the basis of environmental similarity and a statistical model. The selected non-landslide samples were then combined with historical landslides to create the machine-learning datasets. Finally, three baseline models (support vector machine, random forest, and back propagation neural network) and a stacking ensemble model were used to predict susceptibility. The findings indicated that the OS method, by considering the reliability of non-landslide samples, produced higher-quality negative samples than the sampling methods in wide use today. The stacking ensemble model outperformed all three baseline models. Notably, the hybrid OS–Stacking model was the most accurate, reaching up to 97.1%. The integrated strategy significantly improves landslide susceptibility prediction and makes it reliable and effective for assessing regional geohazard risk.
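The certainty factor mentioned in the description is usually computed per factor class from landslide frequencies. The sketch below assumes the formulation commonly used in the landslide literature, which compares the landslide probability inside a class with the prior probability over the whole study area. The record does not state which form the authors use, nor how the value is fused with sample reliability, so the function name, its arguments, and the example cell counts are illustrative assumptions only.

```python
def certainty_factor(landslide_cells_in_class, cells_in_class,
                     landslide_cells_total, cells_total):
    """Return the certainty factor (CF) of one factor class, in [-1, 1].

    Assumed (commonly used) form, not confirmed by the record:
      CF = (pp_a - pp_s) / (pp_a * (1 - pp_s))   if pp_a >= pp_s
      CF = (pp_a - pp_s) / (pp_s * (1 - pp_a))   if pp_a <  pp_s
    where pp_a is the landslide probability within the class and
    pp_s is the prior landslide probability over the study area.
    """
    if cells_in_class <= 0 or cells_total <= 0:
        raise ValueError("cell counts must be positive")

    pp_a = landslide_cells_in_class / cells_in_class  # conditional probability
    pp_s = landslide_cells_total / cells_total        # prior probability

    if pp_a >= pp_s:
        denom = pp_a * (1.0 - pp_s)
    else:
        denom = pp_s * (1.0 - pp_a)

    # Degenerate case (e.g. no landslides anywhere): treat as no evidence.
    if denom == 0.0:
        return 0.0
    return (pp_a - pp_s) / denom


if __name__ == "__main__":
    # Hypothetical counts: a slope class with 30 landslide cells out of 1000,
    # in a study area with 100 landslide cells out of 10000.
    print(certainty_factor(30, 1000, 100, 10000))
    # A class with no recorded landslides gives the minimum value.
    print(certainty_factor(0, 500, 100, 10000))
```

Under this formulation, positive values mark classes where landslides are more frequent than the area-wide average and negative values mark classes where they are less frequent, which is why such values are a plausible input when ranking candidate non-landslide locations.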
format Article
id doaj-art-e42b87f180984c7ab618bde35ded4928
institution OA Journals
issn 2072-4292
language English
publishDate 2024-10-01
publisher MDPI AG
record_format Article
series Remote Sensing
spelling doaj-art-e42b87f180984c7ab618bde35ded4928
2025-08-20T01:47:36Z
eng
MDPI AG
Remote Sensing
2072-4292
2024-10-01
volume 16, issue 19, article 3663
10.3390/rs16193663
Yongxing Lu: Changsha General Survey of Natural Resources Center, China Geological Survey, Changsha 410600, China
Honggen Xu: Changsha General Survey of Natural Resources Center, China Geological Survey, Changsha 410600, China
Can Wang: Hunan Institute of Geological Disaster Investigation and Monitoring, Changsha 410004, China
Guanxi Yan: School of Civil Engineering, University of Queensland, St. Lucia, QLD 4067, Australia
Zhitao Huo: Changsha General Survey of Natural Resources Center, China Geological Survey, Changsha 410600, China
Zuwu Peng: Geological Survey Institute of Hunan Province, Changsha 410014, China
Bo Liu: School of Civil Engineering, University of Queensland, St. Lucia, QLD 4067, Australia
Chong Xu: National Institute of Natural Hazards, Ministry of Emergency Management of China, Beijing 100085, China
title A Novel Strategy Coupling Optimised Sampling with Heterogeneous Ensemble Machine-Learning to Predict Landslide Susceptibility
topic landslide susceptibility prediction
optimised sampling
stacking ensemble machine-learning algorithm
reliability of non-landslide samples
url https://www.mdpi.com/2072-4292/16/19/3663