Early Parkinson’s disease identification via hybrid feature selection from multi-feature subsets and optimized CatBoost with SMOTE
| Main Authors: | , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Taylor & Francis Group, 2025-12-01 |
| Series: | Systems Science & Control Engineering |
| Subjects: | |
| Online Access: | https://www.tandfonline.com/doi/10.1080/21642583.2025.2498909 |
| Summary: | Achieving high accuracy, efficiency and robustness remains a primary challenge in Parkinson's disease (PD) detection, and existing methods often fall short on one or more of these criteria. Data imbalance in medical datasets further limits the reliability of current models. Given the critical role of precise PD classification in medical diagnostics, this study proposes a novel framework to enhance detection accuracy. The framework uses a categorical boosting (CatBoost) algorithm optimized using Grid Search Optimization (GSO). The analysis is conducted on a PD dataset derived from speech recordings. To address the data imbalance, the synthetic minority oversampling technique (SMOTE) is applied as a pre-processing step to improve the robustness and reliability of the model. The framework is tested on several feature subsets as well as their combined set, and RReliefF feature selection is applied to identify the optimal subset of features. For the selected feature subset, the proposed approach achieves an accuracy of 0.9261, precision of 0.9633, sensitivity of 0.9375, F1-score of 0.9502, specificity of 0.8947, AUC of 0.9549 and a testing time of 0.012 s. These results underscore the effectiveness of the proposed framework in accurately diagnosing PD, providing a reliable and efficient tool for future medical diagnostics. |
|---|---|
| ISSN: | 2164-2583 |
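
The summary reports precision, sensitivity, specificity, F1-score and accuracy. As a reading aid, the sketch below shows how these quantities follow from the four cells of a binary confusion matrix. The record does not give the study's confusion matrix, so the counts used here are hypothetical and chosen only for illustration; the names `ConfusionCounts` and `binary_metrics` are likewise assumptions and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ConfusionCounts:
    """Cells of a binary confusion matrix, with PD as the positive class."""
    tp: int  # PD samples predicted as PD
    fn: int  # PD samples predicted as healthy
    tn: int  # healthy samples predicted as healthy
    fp: int  # healthy samples predicted as PD


def binary_metrics(c: ConfusionCounts) -> Dict[str, float]:
    """Compute the standard threshold-based metrics from confusion counts."""
    precision = c.tp / (c.tp + c.fp)
    sensitivity = c.tp / (c.tp + c.fn)   # also called recall
    specificity = c.tn / (c.tn + c.fp)
    accuracy = (c.tp + c.tn) / (c.tp + c.fn + c.tn + c.fp)
    # F1 is the harmonic mean of precision and sensitivity.
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical counts (NOT taken from the paper), used only to exercise the formulas.
example = ConfusionCounts(tp=105, fn=7, tn=34, fp=4)
metrics = binary_metrics(example)

for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With these illustrative counts, precision, sensitivity, specificity and F1 round to 0.9633, 0.9375, 0.8947 and 0.9502, while accuracy comes out at 0.9267 rather than the reported 0.9261, so the counts should not be read as the study's actual confusion matrix. AUC is omitted because it depends on the ranking of predicted scores, not on a single set of counts.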