Drilling Condition Identification Method for Imbalanced Datasets
| Main Authors: | , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-03-01 |
| Series: | Applied Sciences |
| Subjects: | |
| Online Access: | https://www.mdpi.com/2076-3417/15/6/3362 |
| Summary: | To address the challenges posed by class imbalance and temporal dependency in drilling condition data and to improve the accuracy of condition identification, this study proposes an integrated method combining feature engineering, data resampling, and deep learning model optimization. First, a feature selection strategy based on weighted symmetrical uncertainty is employed, assigning higher weights to critical features that distinguish minority classes, thereby enhancing class contrast and improving the classification capability of the model. Second, a sliding-window-based Synthetic Minority Oversampling Technique (SMOTE) algorithm is developed, which generates new minority-class samples while preserving temporal dependencies, achieving a balanced data distribution among classes. Finally, a coupled model integrating bidirectional long short-term memory (BiLSTM) networks and gated recurrent units (GRUs) is constructed. The BiLSTM component captures global contextual information, while the GRU efficiently learns features from complex sequential data. The proposed approach was validated using logging data from 14 wells and compared against existing models, including RNN, CNN, FCN, and LSTM. The experimental results demonstrated that the proposed method achieved classification F1 score improvements of 8.95%, 9.58%, 10.25%, and 8.59%, respectively, over these traditional models. Additionally, classification loss values were reduced by 0.32, 0.3315, 0.2893, and 0.2246, respectively. These findings underscore the improvements in both accuracy and class balance achieved by the proposed method for drilling condition identification. The results indicate that the proposed approach effectively addresses class imbalance and temporal dependency issues in drilling condition data, substantially enhancing classification performance for complex sequential data. This work provides a practical and efficient solution for drilling condition recognition. |
| ISSN: | 2076-3417 |
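
The summary describes a sliding-window-based SMOTE that creates minority-class samples while preserving temporal dependencies, but this record does not give the algorithm itself. The following is only a minimal sketch of one plausible reading, not the authors' implementation: the series is cut into fixed-width windows, and each synthetic sample is an interpolation between a whole minority window and one of its nearest minority neighbours, so the time order inside a window is kept. The function names `make_windows` and `window_smote`, the rule of labelling a window by its last time step, and the parameters `width`, `step`, and `k` are all assumptions made for illustration.

```python
import numpy as np


def make_windows(series, labels, width, step=1):
    """Cut a (T, F) series into (width, F) windows.

    Assumption: each window takes the label of its last time step.
    """
    starts = range(0, len(series) - width + 1, step)
    X = np.stack([series[s:s + width] for s in starts])
    y = np.array([labels[s + width - 1] for s in starts])
    return X, y


def window_smote(X, y, minority, n_new, k=3, seed=0):
    """Append n_new synthetic minority windows to (X, y).

    Each synthetic window is a convex combination of a minority window
    and one of its k nearest minority neighbours (Euclidean distance on
    the flattened windows), as in standard SMOTE but applied per window.
    """
    rng = np.random.default_rng(seed)
    W = X[y == minority]
    if len(W) < 2:
        raise ValueError("need at least two minority windows to interpolate")

    # Pairwise distances between flattened minority windows.
    flat = W.reshape(len(W), -1)
    dist = np.linalg.norm(flat[:, None, :] - flat[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)  # a window is not its own neighbour

    k = min(k, len(W) - 1)
    neighbours = np.argsort(dist, axis=1)[:, :k]

    # Pick a base window, one of its neighbours, and a mixing factor.
    base = rng.integers(0, len(W), size=n_new)
    chosen = neighbours[base, rng.integers(0, k, size=n_new)]
    lam = rng.random((n_new, 1, 1))

    synthetic = W[base] + lam * (W[chosen] - W[base])
    X_out = np.concatenate([X, synthetic])
    y_out = np.concatenate([y, np.full(n_new, minority)])
    return X_out, y_out
```

A typical call would build windows with `make_windows`, count the windows per class, and pass the difference between the majority and minority counts as `n_new`. Interpolating whole windows rather than individual time steps is the design choice that keeps each synthetic sample a coherent short sequence; whether the published method does the same is not stated in this record.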