Landslide and Collapse Susceptibility Analysis in Wenchuan Earthquake-damaged Area Based on Ensemble Learning Methods

Bibliographic Details
Main Authors: DING Jiawei, WANG Xiekang
Format: Article
Language: English
Published: Editorial Department of Journal of Sichuan University (Engineering Science Edition) 2025-07-01
Series: 工程科学与技术
Online Access: http://jsuese.scu.edu.cn/thesisDetails#10.12454/j.jsuese.202400244
Description
Summary: Objective: The 5·12 Wenchuan earthquake triggered extensive secondary geological disasters and cascading effects. Wenchuan County, which was severely affected by the earthquake, has widespread unstable slopes and areas prone to landslides and collapses. In mountainous regions, extreme rainfall events trigger extensive landslides and collapses. The abundant loose material they produce is a substantial sediment source that, through the coupled movement of water and sediment, increases the magnitude of flash flood disasters and particularly heightens the risk of debris flows and debris floods. Assessment models for landslide and collapse susceptibility are therefore needed to support early prevention of compound flash flood disasters in Wenchuan County. Conventional susceptibility assessment approaches often rely on expert experience and subjective judgment, or they struggle to fit high-dimensional, complex data adequately. As a result, precisely delineating the actual spatial distribution of areas susceptible to landslides and collapses remains difficult. Recent advances in data science and machine learning offer promising solutions. Two state-of-the-art ensemble learning algorithms, eXtreme Gradient Boosting (XGBoost) and Light Gradient Boosting Machine (LightGBM), are introduced to build dependable models for assessing landslide and collapse susceptibility in Wenchuan County. Methods: Topographic, geological, meteorological, and hydrological factors were comprehensively evaluated, and ten evaluation factors were selected: elevation, slope, aspect, terrain relief, distance to rivers, distance to faults, normalized difference vegetation index (NDVI), land cover type, average annual precipitation, and lithology.
Data preprocessing was carried out to ensure effective and stable model training. The data were standardized to mitigate the impact of differing scales among the evaluation factors on the model. Factors displaying significant multicollinearity were identified and excluded using the Variance Inflation Factor (VIF), so that the retained features were not strongly collinear. In addition, the Information Gain Ratio (InGR) was used to evaluate the importance of each factor, supporting the preliminary selection of explanatory variables. Then, two advanced ensemble learning algorithms (XGBoost and LightGBM) were applied alongside two traditional algorithms (logistic regression and random forest) to construct landslide and collapse susceptibility assessment models for Wenchuan County. Accuracy, precision, recall, F<sub>1</sub> score, and receiver operating characteristic (ROC) curves were used to compare the performance of the models. The models were then used to predict the probability of landslide and collapse occurrence across the study area. The natural breaks method was used to delineate susceptibility zones, producing a map of areas susceptible to landslides and collapses. The resulting susceptibility maps were further analyzed qualitatively and quantitatively, with particular attention to the correspondence between predicted results and actual landslide and collapse events, in order to evaluate the predictive reliability of the proposed models. Results and Discussions: The results indicated that both ensemble learning models demonstrated better classification performance than the traditional models. XGBoost and LightGBM each achieved an accuracy of 0.903, surpassing random forest (0.900) and logistic regression (0.864).
In terms of precision, LightGBM (0.887) slightly outperformed XGBoost (0.882), and both outperformed random forest (0.872) and logistic regression (0.802). XGBoost had the highest F<sub>1</sub> score (0.899), closely followed by LightGBM (0.898) and random forest (0.897), while logistic regression had the lowest (0.866). XGBoost and LightGBM achieved nearly identical areas under the ROC curve (AUC, 0.904), outperforming random forest (0.902), with logistic regression lowest (0.869). Examination of the susceptibility zoning maps, together with quantitative analysis of the area proportion of each zone, revealed differences between the zoning produced by the XGBoost and LightGBM models and that produced by the logistic regression and random forest models. These differences were attributed primarily to the different data processing strategies of each algorithm. To verify the reliability of the models’ predictions, the density of landslide and collapse points within each susceptibility zone was analyzed quantitatively. The XGBoost, LightGBM, and random forest models consistently showed landslide and collapse point density increasing with susceptibility level, in line with the typical pattern of disaster susceptibility. LightGBM performed best in identifying high and extremely high susceptibility areas, with landslide and collapse point density ratios of 1.844 and 3.079, respectively, the highest among all models evaluated. In contrast, logistic regression did not follow this increasing trend: its ratio of 0.588 in very low susceptibility zones exceeded its ratio in high susceptibility zones (0.528).
This anomaly indicated prediction bias in the logistic regression model, potentially attributable to the limitations of the logistic regression algorithm and a lack of representative data. Conclusions: The advanced ensemble learning models predicted landslide and collapse susceptibility in Wenchuan County better than the two traditional models, outperforming them in accuracy, precision, F<sub>1</sub> score, and area under the receiver operating characteristic (ROC) curve. LightGBM demonstrated higher precision, while XGBoost yielded a higher F<sub>1</sub> score. In terms of reliability, both ensemble learning models, particularly LightGBM, showed advantages in identifying high and very high susceptibility areas, reinforcing their suitability for landslide and collapse susceptibility assessment. The findings provide a more accurate tool for evaluating landslide and collapse susceptibility in Wenchuan County and similar earthquake-affected areas, supporting the development of disaster prevention and mitigation measures. Future research can draw on more comprehensive data collection and investigate broader applications of ensemble learning models, improving the reliability and practical use of predictions in disaster management.
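The reliability check described in the summary compares the density of landslide and collapse points across susceptibility zones. The record does not state how the density ratio is computed, so the following is only a minimal sketch under one common assumption: the ratio is a zone's share of all hazard points divided by its share of the total area. The zone labels, the function names `density_ratios` and `is_monotonic_increasing`, and all counts and areas are hypothetical and are not taken from the study.

```python
from typing import Dict, List

# Susceptibility zones ordered from lowest to highest (labels are illustrative).
ZONES: List[str] = ["very low", "low", "moderate", "high", "very high"]


def density_ratios(points: Dict[str, int], area_km2: Dict[str, float]) -> Dict[str, float]:
    """Assumed definition: share of hazard points in a zone / share of area in that zone."""
    total_points = sum(points.values())
    total_area = sum(area_km2.values())
    return {
        zone: (points[zone] / total_points) / (area_km2[zone] / total_area)
        for zone in ZONES
    }


def is_monotonic_increasing(ratios: Dict[str, float]) -> bool:
    """True if the ratio rises strictly from the lowest zone to the highest."""
    values = [ratios[zone] for zone in ZONES]
    return all(a < b for a, b in zip(values, values[1:]))


# Hypothetical point counts and zone areas, NOT data from the article.
example_points = {"very low": 10, "low": 20, "moderate": 40, "high": 60, "very high": 70}
example_area = {"very low": 400.0, "low": 300.0, "moderate": 150.0, "high": 100.0, "very high": 50.0}

ratios = density_ratios(example_points, example_area)
for zone in ZONES:
    print(f"{zone:>9}: {ratios[zone]:.3f}")
print("monotonic:", is_monotonic_increasing(ratios))
```

Under this assumed definition, a ratio above 1 means a zone contains more hazard points than its area share alone would suggest. A model whose ratio for a low zone exceeds its ratio for a higher zone, as reported for logistic regression (0.588 versus 0.528), would fail the monotonicity check.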
ISSN:2096-3246