Feature selection based on Mahalanobis distance for early Parkinson disease classification
Standard classifiers struggle with high-dimensional datasets due to increased computational complexity, difficulty in visualization and interpretation, and challenges in handling redundant or irrelevant features. This paper proposes a novel feature selection method based on the Mahalanobis distance...
Main Authors: | Mustafa Noaman Kadhim, Dhiah Al-Shammary, Ahmed M. Mahdi, Ayman Ibaida |
---|---|
Format: | Article |
Language: | English |
Published: | Elsevier, 2025-01-01 |
Series: | Computer Methods and Programs in Biomedicine Update |
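Subjects: | Parkinson disease classification; Mahalanobis distance; Feature selection; Voice classification; Machine learning classifiers |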
Online Access: | http://www.sciencedirect.com/science/article/pii/S2666990025000011 |
_version_ | 1832595252278460416 |
---|---|
author | Mustafa Noaman Kadhim; Dhiah Al-Shammary; Ahmed M. Mahdi; Ayman Ibaida |
author_facet | Mustafa Noaman Kadhim; Dhiah Al-Shammary; Ahmed M. Mahdi; Ayman Ibaida |
author_sort | Mustafa Noaman Kadhim |
collection | DOAJ |
description | Standard classifiers struggle with high-dimensional datasets because of increased computational complexity, difficulty in visualization and interpretation, and redundant or irrelevant features. This paper proposes a novel feature selection method based on the Mahalanobis distance for Parkinson's disease (PD) classification. The method identifies relevant features by measuring their distance from the dataset's mean vector while accounting for the covariance structure. Features with larger Mahalanobis distances are deemed more relevant because they exhibit greater discriminative power relative to the dataset's distribution, which guides the selection of an effective feature subset. Significant improvements in classification performance were observed across all models. On the "Parkinson Disease Classification Dataset", the feature set was reduced from 22 to 11 features, with accuracy improvements ranging from 10.17% to 20.34%; the K-Nearest Neighbors (KNN) classifier achieved the highest accuracy, 98.31%. Similarly, on the "Parkinson Dataset with Replicated Acoustic Features", the feature set was reduced from 45 to 18 features, with accuracy improvements ranging from 1.38% to 13.88%; the Random Forest (RF) classifier achieved the best accuracy, 95.83%. By identifying convergence features and eliminating divergence features, the proposed method reduces dimensionality while maintaining or improving classifier performance. The method also significantly reduces execution time, making it well suited to real-time applications in medical diagnostics, where timely and accurate disease identification is critical for improving patient outcomes. |
format | Article |
id | doaj-art-5a20c7e6301e4b7e8bd5383cf5195f51 |
institution | Kabale University |
issn | 2666-9900 |
language | English |
publishDate | 2025-01-01 |
publisher | Elsevier |
record_format | Article |
series | Computer Methods and Programs in Biomedicine Update |
spelling | doaj-art-5a20c7e6301e4b7e8bd5383cf5195f51; 2025-01-19T06:26:50Z; eng; Elsevier; Computer Methods and Programs in Biomedicine Update; 2666-9900; 2025-01-01; 7; 100177; Feature selection based on Mahalanobis distance for early Parkinson disease classification; Mustafa Noaman Kadhim (0), Dhiah Al-Shammary (1), Ahmed M. Mahdi (2), Ayman Ibaida (3); (0, 1, 2) College of Computer Science and Information Technology, University of Al-Qadisiyah, Dewaniyah, Iraq; (3) Intelligent Technology Innovation Lab, Victoria University, Melbourne 3011, Australia; Corresponding author; http://www.sciencedirect.com/science/article/pii/S2666990025000011; Parkinson disease classification; Mahalanobis distance; Feature selection; Voice classification; Machine learning classifiers |
spellingShingle | Mustafa Noaman Kadhim; Dhiah Al-Shammary; Ahmed M. Mahdi; Ayman Ibaida; Feature selection based on Mahalanobis distance for early Parkinson disease classification; Computer Methods and Programs in Biomedicine Update; Parkinson disease classification; Mahalanobis distance; Feature selection; Voice classification; Machine learning classifiers |
title | Feature selection based on Mahalanobis distance for early Parkinson disease classification |
title_full | Feature selection based on Mahalanobis distance for early Parkinson disease classification |
title_fullStr | Feature selection based on Mahalanobis distance for early Parkinson disease classification |
title_full_unstemmed | Feature selection based on Mahalanobis distance for early Parkinson disease classification |
title_short | Feature selection based on Mahalanobis distance for early Parkinson disease classification |
title_sort | feature selection based on mahalanobis distance for early parkinson disease classification |
topic | Parkinson disease classification; Mahalanobis distance; Feature selection; Voice classification; Machine learning classifiers |
url | http://www.sciencedirect.com/science/article/pii/S2666990025000011 |
work_keys_str_mv | AT mustafanoamankadhim featureselectionbasedonmahalanobisdistanceforearlyparkinsondiseaseclassification AT dhiahalshammary featureselectionbasedonmahalanobisdistanceforearlyparkinsondiseaseclassification AT ahmedmmahdi featureselectionbasedonmahalanobisdistanceforearlyparkinsondiseaseclassification AT aymanibaida featureselectionbasedonmahalanobisdistanceforearlyparkinsondiseaseclassification |
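
The description above says that features are ranked by a Mahalanobis distance that accounts for the covariance structure, but this record does not give the exact scoring rule. The sketch below is one possible reading and should not be taken as the authors' algorithm: it assumes a two-class problem, computes the squared Mahalanobis distance between the two class means under a pooled covariance matrix, and scores each feature by the size of its contribution to that distance. The function names `mahalanobis_feature_scores` and `select_top_k` and the scoring rule itself are hypothetical.

```python
import numpy as np


def mahalanobis_feature_scores(X, y):
    """Score each feature by its contribution to the squared Mahalanobis
    distance between the two class means.

    Assumption: this per-feature decomposition is an illustrative choice,
    not the scoring rule used in the paper.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)
    if classes.size != 2:
        raise ValueError("this sketch assumes exactly two classes")

    X0, X1 = X[y == classes[0]], X[y == classes[1]]
    n0, n1 = len(X0), len(X1)

    # Difference between the class mean vectors.
    delta = X1.mean(axis=0) - X0.mean(axis=0)

    # Pooled within-class covariance matrix (features are columns).
    pooled = (
        (n0 - 1) * np.cov(X0, rowvar=False)
        + (n1 - 1) * np.cov(X1, rowvar=False)
    ) / (n0 + n1 - 2)
    pooled = np.atleast_2d(pooled)

    # The pseudo-inverse tolerates collinear (redundant) features.
    weights = np.linalg.pinv(pooled) @ delta

    # delta * weights sums to the squared Mahalanobis distance; the absolute
    # value of each term is used as that feature's score.
    return np.abs(delta * weights)


def select_top_k(X, y, k):
    """Return the sorted column indices of the k highest-scoring features."""
    scores = mahalanobis_feature_scores(X, y)
    top = np.argsort(scores)[::-1][:k]
    return np.sort(top)
```

Under these assumptions, `select_top_k(X, y, 11)` would keep 11 columns of a 22-feature matrix, mirroring the reduction reported in the description. The choice of `k` and the classifier trained on the reduced matrix are left to the caller.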