Early Parkinson’s disease identification via hybrid feature selection from multi-feature subsets and optimized CatBoost with SMOTE

Bibliographic Details
Main Authors: Subhashree Mohapatra, Bhanja Kishor Swain, Manohar Mishra
Format: Article
Language: English
Published: Taylor & Francis Group 2025-12-01
Series: Systems Science & Control Engineering
Online Access: https://www.tandfonline.com/doi/10.1080/21642583.2025.2498909
Description
Summary: Achieving high accuracy, efficiency, and robustness remains a primary challenge in Parkinson's disease (PD) detection, and class imbalance in medical datasets further limits the reliability of current models. Given the critical role of precise PD classification in medical diagnostics, this study proposes a novel framework to improve detection accuracy. The framework uses a categorical boosting (CatBoost) classifier whose hyperparameters are tuned with Grid Search Optimization (GSO). The analysis is conducted on a PD dataset derived from speech recordings. To address the class imbalance, the synthetic minority oversampling technique (SMOTE) is applied as a pre-processing step, which improves the robustness and reliability of the model. The framework is tested on several feature subsets as well as on their combined set, and RReliefF feature selection is applied to identify the optimal subset of features. For the selected feature subset, the proposed approach achieves an accuracy of 0.9261, precision of 0.9633, sensitivity of 0.9375, F1-score of 0.9502, specificity of 0.8947, AUC of 0.9549, and a testing time of 0.012 s. These results underscore the effectiveness of the proposed framework in detecting PD accurately, making it a reliable and efficient candidate tool for future medical diagnostics.
ISSN:2164-2583
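
The summary reports several metrics without defining them. As a minimal sketch, the Python snippet below shows how accuracy, precision, sensitivity, specificity, and F1-score follow from a binary confusion matrix. The names `ConfusionCounts` and `classification_metrics` and the four counts are hypothetical: the counts are not taken from the article and were chosen only so that the derived values are of similar magnitude to the reported ones. AUC is omitted because it depends on the classifier's ranked scores and cannot be recovered from a single confusion matrix.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion matrix, with PD treated as the positive class."""
    tp: int  # PD samples predicted as PD
    fn: int  # PD samples predicted as healthy
    fp: int  # healthy samples predicted as PD
    tn: int  # healthy samples predicted as healthy


def classification_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Derive the standard threshold-based metrics from the four counts."""
    total = c.tp + c.fn + c.fp + c.tn
    precision = c.tp / (c.tp + c.fp)
    sensitivity = c.tp / (c.tp + c.fn)   # also called recall
    specificity = c.tn / (c.tn + c.fp)
    # F1 is the harmonic mean of precision and sensitivity.
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": (c.tp + c.tn) / total,
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical counts for illustration only (not from the article).
counts = ConfusionCounts(tp=105, fn=7, fp=4, tn=34)
metrics = classification_metrics(counts)

for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With an imbalanced test set such as this hypothetical one (112 positives, 38 negatives), specificity is the metric most sensitive to a few misclassified minority-class samples, which is one reason to report it alongside accuracy.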