Prediction of Total Organic Carbon Content in Shale Based on PCA-PSO-XGBoost
Total organic carbon (TOC) content is an important parameter for evaluating the abundance of organic matter in shale and its hydrocarbon production capacity. Currently, no prediction method is applicable to all geological conditions, so exploring an efficient and accurate prediction method suit...
| Main Authors: | Yingjie Meng, Chengwu Xu, Tingting Li, Tianyong Liu, Lu Tang, Jinyou Zhang |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG 2025-03-01 |
| Series: | Applied Sciences |
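| Subjects: | shale; total organic carbon; principal component analysis; particle swarm optimization; machine learning |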
| Online Access: | https://www.mdpi.com/2076-3417/15/7/3447 |
| _version_ | 1850184596613758976 |
|---|---|
| author | Yingjie Meng, Chengwu Xu, Tingting Li, Tianyong Liu, Lu Tang, Jinyou Zhang |
| author_sort | Yingjie Meng |
| collection | DOAJ |
| description | Total organic carbon (TOC) content is an important parameter for evaluating the abundance of organic matter in shale and its hydrocarbon production capacity. No current prediction method applies to all geological conditions, so finding an efficient and accurate method suited to the study area is of great significance. In this study, for the shale of the Qingshankou Formation of the Gulong Sag in the Songliao Basin, TOC content prediction models using various machine learning algorithms are established and compared based on measured data, principal component analysis (PCA), and the particle swarm optimization (PSO) algorithm. Pearson correlation analysis showed that GR, AC, DEN, CNL, LLS, and LLD are the most sensitive parameters. PCA processing then identified four principal components as input features. The XGBoost prediction model, established after selecting its parameters through PSO, had the highest accuracy, with an R² of 0.90 and an RMSE of 0.1545, both superior to the values of the other models. This model is suitable for predicting TOC content and provides effective technical support for shale oil exploration and development in the study area. |
| format | Article |
| id | doaj-art-c4c7cc5e9e7d4e68b8ac81d52e35d5aa |
| institution | OA Journals |
| issn | 2076-3417 |
| language | English |
| publishDate | 2025-03-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Applied Sciences |
| spelling | doaj-art-c4c7cc5e9e7d4e68b8ac81d52e35d5aa; 2025-08-20T02:17:00Z; eng; MDPI AG; Applied Sciences; 2076-3417; 2025-03-01; 15(7), 3447; 10.3390/app15073447; Prediction of Total Organic Carbon Content in Shale Based on PCA-PSO-XGBoost; Yingjie Meng (State Key Laboratory of Continental Shale Oil, Daqing 163318, China); Chengwu Xu (State Key Laboratory of Continental Shale Oil, Daqing 163318, China); Tingting Li (School of Earth Sciences, Northeast Petroleum University, Daqing 163318, China); Tianyong Liu (State Key Laboratory of Continental Shale Oil, Daqing 163318, China); Lu Tang (School of Earth Sciences, Northeast Petroleum University, Daqing 163318, China); Jinyou Zhang (State Key Laboratory of Continental Shale Oil, Daqing 163318, China); https://www.mdpi.com/2076-3417/15/7/3447; shale; total organic carbon; principal component analysis; particle swarm optimization; machine learning |
| title | Prediction of Total Organic Carbon Content in Shale Based on PCA-PSO-XGBoost |
| topic | shale; total organic carbon; principal component analysis; particle swarm optimization; machine learning |
| url | https://www.mdpi.com/2076-3417/15/7/3447 |
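
The description above says the XGBoost model's parameters were selected with particle swarm optimization (PSO). As a rough illustration of that search step only, the following is a minimal PSO sketch in plain Python. It is not the authors' implementation: the function names, the swarm settings (`inertia`, `c1`, `c2`), and the toy objective are assumptions made for this example. In a setup like the one described, the objective would presumably be a validation error (such as RMSE) of a model trained with the candidate parameters; here a simple quadratic stands in for it, and the two parameter names are hypothetical placeholders.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=100,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize objective(list of floats) inside box bounds with a basic PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Random starting positions inside the bounds, zero starting velocities.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # Each particle remembers its own best position; the swarm shares one global best.
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity = inertia term + pull toward personal best + pull toward global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Move the particle and clip it back into the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def toy_validation_error(params):
    # Hypothetical stand-in for "validation error as a function of two model
    # parameters"; the minimum is placed at (0.1, 6.0) purely for illustration.
    learning_rate, max_depth = params
    return (learning_rate - 0.1) ** 2 + 0.01 * (max_depth - 6.0) ** 2


best_params, best_err = pso_minimize(
    toy_validation_error, bounds=[(0.01, 0.5), (2.0, 12.0)]
)
print(best_params, best_err)
```

To tune a real model, `toy_validation_error` would be replaced by a function that trains the model with the candidate parameters and returns its error on held-out data. Integer-valued parameters would need rounding inside that function.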