A Novel Strategy Coupling Optimised Sampling with Heterogeneous Ensemble Machine-Learning to Predict Landslide Susceptibility
The accuracy of data-driven landslide susceptibility prediction depends heavily on the quality of non-landslide samples and the selection of machine-learning algorithms. Current methods rely on artificial prior knowledge to obtain negative samples from landslide-free regions or outside the landslide...
| Main Authors: | Yongxing Lu, Honggen Xu, Can Wang, Guanxi Yan, Zhitao Huo, Zuwu Peng, Bo Liu, Chong Xu |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2024-10-01 |
| Series: | Remote Sensing |
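| Subjects: | landslide susceptibility prediction; optimised sampling; stacking ensemble machine-learning algorithm; reliability of non-landslide samples |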
| Online Access: | https://www.mdpi.com/2072-4292/16/19/3663 |
| _version_ | 1850284328604401664 |
|---|---|
| author | Yongxing Lu; Honggen Xu; Can Wang; Guanxi Yan; Zhitao Huo; Zuwu Peng; Bo Liu; Chong Xu |
| author_sort | Yongxing Lu |
| collection | DOAJ |
| description | The accuracy of data-driven landslide susceptibility prediction depends heavily on the quality of non-landslide samples and the selection of machine-learning algorithms. Current methods rely on prior knowledge to draw negative samples quickly and at random from landslide-free regions or from outside landslide buffer zones, but they often ignore the reliability of those samples, which risks including potential landslides and introducing errors into the training data. Furthermore, different machine-learning models have distinct classification capabilities, and applying a single model can readily over-fit the dataset and introduce uncertainty into the predictions. To address these problems, taking Chenxi County, a hilly and mountainous area in southern China, as an example, this research proposes a strategy coupling optimised sampling with heterogeneous ensemble machine learning to enhance the accuracy of landslide susceptibility prediction. Initially, 21 landslide impact factors were derived from five aspects: geology, hydrology, topography, meteorology, human activities, and geographical environment. Then, these factors were screened through a correlation analysis and collinearity diagnosis. Afterwards, an optimised sampling (OS) method was used to select negative samples by fusing the reliability of non-landslide samples with certainty factor values, on the basis of environmental similarity and a statistical model. Subsequently, the selected non-landslide samples and historical landslides were combined to create machine-learning datasets. Finally, baseline models (support vector machine, random forest, and back propagation neural network) and a stacking ensemble model were employed to predict susceptibility. The findings indicated that the OS method, by considering the reliability of non-landslide samples, yielded higher-quality negative samples than the sampling methods currently in wide use. The stacking ensemble machine-learning model outperformed the three baseline models. Notably, the hybrid OS–Stacking model was the most accurate, reaching up to 97.1%. The integrated strategy significantly improves the prediction of landslide susceptibility and makes it reliable and effective for assessing regional geohazard risk. |
| format | Article |
| id | doaj-art-e42b87f180984c7ab618bde35ded4928 |
| institution | OA Journals |
| issn | 2072-4292 |
| language | English |
| publishDate | 2024-10-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Remote Sensing |
| spelling | doaj-art-e42b87f180984c7ab618bde35ded4928; 2025-08-20T01:47:36Z; eng; MDPI AG; Remote Sensing; 2072-4292; 2024-10-01; 16; 19; 3663; 10.3390/rs16193663; A Novel Strategy Coupling Optimised Sampling with Heterogeneous Ensemble Machine-Learning to Predict Landslide Susceptibility; Yongxing Lu (0), Honggen Xu (1), Can Wang (2), Guanxi Yan (3), Zhitao Huo (4), Zuwu Peng (5), Bo Liu (6), Chong Xu (7); (0) Changsha General Survey of Natural Resources Center, China Geological Survey, Changsha 410600, China; (1) Changsha General Survey of Natural Resources Center, China Geological Survey, Changsha 410600, China; (2) Hunan Institute of Geological Disaster Investigation and Monitoring, Changsha 410004, China; (3) School of Civil Engineering, University of Queensland, St. Lucia, QLD 4067, Australia; (4) Changsha General Survey of Natural Resources Center, China Geological Survey, Changsha 410600, China; (5) Geological Survey Institute of Hunan Province, Changsha 410014, China; (6) School of Civil Engineering, University of Queensland, St. Lucia, QLD 4067, Australia; (7) National Institute of Natural Hazards, Ministry of Emergency Management of China, Beijing 100085, China; https://www.mdpi.com/2072-4292/16/19/3663; landslide susceptibility prediction; optimised sampling; stacking ensemble machine-learning algorithm; reliability of non-landslide samples |
| title | A Novel Strategy Coupling Optimised Sampling with Heterogeneous Ensemble Machine-Learning to Predict Landslide Susceptibility |
| topic | landslide susceptibility prediction; optimised sampling; stacking ensemble machine-learning algorithm; reliability of non-landslide samples |
| url | https://www.mdpi.com/2072-4292/16/19/3663 |
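
The description above mentions certainty factor (CF) values as one input to the optimised sampling step, but this record does not give the formula. As a rough illustration only, the sketch below uses the piecewise CF definition commonly found in the landslide susceptibility literature; whether the article uses exactly this form is an assumption. The function name `certainty_factor`, the slope classes, and all cell counts are hypothetical and are not taken from the article.

```python
def certainty_factor(ppa: float, pps: float) -> float:
    """Certainty factor of one factor class (commonly used piecewise form).

    ppa: landslide density inside the class (landslide cells / class cells)
    pps: landslide density over the whole study area
    Returns a value in [-1, 1]; negative values mean the class is less
    landslide-prone than the study area as a whole.
    """
    if not (0.0 <= ppa <= 1.0 and 0.0 < pps < 1.0):
        raise ValueError("ppa must be in [0, 1] and pps in (0, 1)")
    if ppa >= pps:
        return (ppa - pps) / (ppa * (1.0 - pps))
    return (ppa - pps) / (pps * (1.0 - ppa))


# Hypothetical slope classes: (cells in class, landslide cells in class).
slope_classes = {
    "0-10 deg": (5000, 5),
    "10-25 deg": (3000, 30),
    ">25 deg": (2000, 65),
}

total_cells = sum(cells for cells, _ in slope_classes.values())
total_slides = sum(slides for _, slides in slope_classes.values())
pps = total_slides / total_cells  # prior landslide density of the area

cf = {
    name: certainty_factor(slides / cells, pps)
    for name, (cells, slides) in slope_classes.items()
}

for name, value in cf.items():
    print(f"{name}: CF = {value:+.3f}")
```

In this toy data the gentlest slope class receives a strongly negative CF, so under the stated assumption it would be a more plausible place to look for reliable non-landslide samples than the steepest class. How the article fuses CF with sample reliability is not described in this record and is not modelled here.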