Tunnel squeezing prediction based on partially missing dataset and optimized machine learning models

Bibliographic Details
Main Authors: Peng Guan, Guangzhao Ou, Feng Liang, Weibang Luo, Qingyong Wang, Chengyuan Pei, Xuan Che
Format: Article
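Language: English
Published: Frontiers Media S.A. 2025-01-01
Series: Frontiers in Earth Science
Subjects: tunnel squeezing prediction; machine learning; whale optimization algorithm; model interpretation; missing dataset
Online Access: https://www.frontiersin.org/articles/10.3389/feart.2025.1511413/full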
collection DOAJ
description Accurate prediction of tunnel squeezing, a common geological hazard during tunnel construction, is important for ensuring construction safety and reducing economic losses. This study built six machine learning (ML) classification models to predict tunnel squeezing: Support Vector Machine (SVM), Random Forest (RF), Decision Tree (DT), Extreme Gradient Boosting (XGBoost), Light Gradient Boosting Machine (LGBM), and K-Nearest Neighbors (KNN). The parameters of the six models were optimized using the Whale Optimization Algorithm (WOA) in conjunction with five-fold cross-validation. A total of 305 tunnel squeezing samples were collected to train and test the models. KNN imputation and the Synthetic Minority Over-sampling Technique (SMOTE) were used to handle the missing values and the class imbalance in the dataset. The input features for tunnel squeezing prediction were tunnel burial depth (H), tunnel diameter (D), strength-to-stress ratio (SSR), and support stiffness (K). The WOA-optimized XGBoost model achieved the highest prediction accuracy, 0.9681. The SHAP method was used to interpret the XGBoost model and ranked the contribution of the input features as SSR > K > D > H, with average SHAP values of 2.93, 1.49, 0.82, and 0.69, respectively. The XGBoost model was then applied to predict tunnel squeezing in 10 sections of the Qinghai Huzhu Beishan Tunnel, and the predictions were highly consistent with the actual outcomes.
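The description names KNN as the method for filling missing values but gives no further detail. The following is a minimal plain-Python sketch of one common reading of KNN imputation, not the authors' implementation: each missing entry is replaced by the mean of that feature over the k nearest rows, with distance measured on the features both rows have observed. The function name knn_impute, the choice k=2, and the sample values for H, D, SSR, and K are all hypothetical; in practice the features would usually be scaled before computing distances.

```python
import math


def knn_impute(rows, k=3):
    """Return a copy of rows with each None replaced by the mean of that
    feature over the k nearest rows that have the feature observed."""
    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for m, other in enumerate(rows):
                # A donor must be a different row with feature j observed.
                if m == i or other[j] is None:
                    continue
                shared = [(a, b) for a, b in zip(row, other)
                          if a is not None and b is not None]
                if not shared:
                    continue
                # Root mean squared difference over the shared features,
                # so rows with different amounts of overlap stay comparable.
                dist = math.sqrt(
                    sum((a - b) ** 2 for a, b in shared) / len(shared)
                )
                candidates.append((dist, other[j]))
            candidates.sort(key=lambda c: c[0])
            nearest = [v for _, v in candidates[:k]]
            if nearest:
                filled[i][j] = sum(nearest) / len(nearest)
    return filled


# Hypothetical samples in the order [H, D, SSR, K]; the last row lacks SSR.
samples = [
    [100.0, 5.0, 0.5, 10.0],
    [110.0, 5.0, 0.6, 12.0],
    [500.0, 10.0, 0.1, 50.0],
    [105.0, 5.0, None, 11.0],
]
completed = knn_impute(samples, k=2)
print(completed[3])
```

The input list is left unchanged, so the imputed table can be compared against the original before any resampling or model training.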
id doaj-art-6325bed7898945469164913b96e5f904
institution Kabale University
issn 2296-6463
spelling doaj-art-6325bed7898945469164913b96e5f904, 2025-01-29T06:45:47Z, eng
Frontiers Media S.A., Frontiers in Earth Science, ISSN 2296-6463, 2025-01-01, vol. 13, doi:10.3389/feart.2025.1511413, article 1511413
Peng Guan: Faculty of Engineering, China University of Geosciences, Wuhan, China
Guangzhao Ou: School of Engineering Management, Hunan University of Finance and Economics, Changsha, China
Feng Liang: Engineering Economics and Immigration Branch, Xinjiang Water Conservancy Development and Construction Group Co., Ltd., Urumqi, China
Weibang Luo: Engineering Economics and Immigration Branch, Xinjiang Water Conservancy Development and Construction Group Co., Ltd., Urumqi, China
Qingyong Wang: Xinjiang Water Conservancy Development and Construction Group Co., Ltd., Urumqi, China
Chengyuan Pei: Xinjiang Water Conservancy Development and Construction Group Co., Ltd., Urumqi, China
Xuan Che: Faculty of Engineering, China University of Geosciences, Wuhan, China
topic tunnel squeezing prediction
machine learning
whale optimization algorithm
model interpretation
missing dataset