Feature selection based on Mahalanobis distance for early Parkinson disease classification


Bibliographic Details
Main Authors: Mustafa Noaman Kadhim, Dhiah Al-Shammary, Ahmed M. Mahdi, Ayman Ibaida
Format: Article
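Language: English
Published: Elsevier 2025-01-01
Series: Computer Methods and Programs in Biomedicine Update
Subjects: Parkinson disease classification; Mahalanobis distance; Feature selection; Voice classification; Machine learning classifiers
Online Access: http://www.sciencedirect.com/science/article/pii/S2666990025000011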
collection DOAJ
description Standard classifiers struggle with high-dimensional datasets because of increased computational complexity, difficulty in visualization and interpretation, and the presence of redundant or irrelevant features. This paper proposes a novel feature selection method based on the Mahalanobis distance for Parkinson's disease (PD) classification. The method identifies relevant features by measuring their distance from the dataset's mean vector while accounting for the covariance structure. Features with larger Mahalanobis distances are deemed more relevant because they exhibit greater discriminative power relative to the dataset's distribution. Classification performance improved across all models. On the "Parkinson Disease Classification Dataset", the feature set was reduced from 22 to 11 features, with accuracy improvements ranging from 10.17% to 20.34%; the K-Nearest Neighbors (KNN) classifier achieved the highest accuracy, 98.31%. Similarly, on the "Parkinson Dataset with Replicated Acoustic Features", the feature set was reduced from 45 to 18 features, with accuracy improvements ranging from 1.38% to 13.88%; the Random Forest (RF) classifier achieved the best accuracy, 95.83%. By identifying convergence features and eliminating divergence features, the method reduces dimensionality while maintaining or improving classifier performance. It also significantly reduces execution time, which makes it well suited to real-time applications in medical diagnostics, where timely and accurate disease identification is critical for improving patient outcomes.
id doaj-art-5a20c7e6301e4b7e8bd5383cf5195f51
institution Kabale University
issn 2666-9900
spelling doaj-art-5a20c7e6301e4b7e8bd5383cf5195f51
2025-01-19T06:26:50Z
eng
Elsevier
Computer Methods and Programs in Biomedicine Update
2666-9900
2025-01-01
Volume 7, Article 100177
Feature selection based on Mahalanobis distance for early Parkinson disease classification
Mustafa Noaman Kadhim (College of Computer Science and Information Technology, University of Al-Qadisiyah, Dewaniyah, Iraq)
Dhiah Al-Shammary (College of Computer Science and Information Technology, University of Al-Qadisiyah, Dewaniyah, Iraq)
Ahmed M. Mahdi (College of Computer Science and Information Technology, University of Al-Qadisiyah, Dewaniyah, Iraq)
Ayman Ibaida (Intelligent Technology Innovation Lab, Victoria University, Melbourne 3011, Australia; corresponding author)
http://www.sciencedirect.com/science/article/pii/S2666990025000011
Parkinson disease classification
Mahalanobis distance
Feature selection
Voice classification
Machine learning classifiers
topic Parkinson disease classification
Mahalanobis distance
Feature selection
Voice classification
Machine learning classifiers
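
The description in this record says that features are ranked by a Mahalanobis distance from the dataset's mean vector, taking the covariance structure into account, but it does not give the exact scoring rule. The Python sketch below is therefore only one possible reading and not the authors' implementation. It assumes that each feature is scored by the absolute value of its contribution to the squared Mahalanobis distance d^2 = delta^T S^-1 delta, where delta is the positive-class mean minus the overall mean and S is the sample covariance matrix. The function names, the binary 0/1 labels, the use of a pseudo-inverse, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def mahalanobis_feature_scores(X, y):
    """Score features by their share of a squared Mahalanobis distance.

    Assumed scoring rule (not taken from the paper): with delta the
    positive-class mean minus the overall mean and S the sample covariance,
    d^2 = delta^T S^-1 delta splits into per-feature terms
    delta_j * (S^-1 delta)_j. The absolute value of each term is the score.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    mean_all = X.mean(axis=0)                      # dataset mean vector
    cov = np.atleast_2d(np.cov(X, rowvar=False))   # feature covariance matrix
    cov_inv = np.linalg.pinv(cov)                  # pseudo-inverse tolerates collinear features
    delta = X[y == 1].mean(axis=0) - mean_all      # class mean offset from dataset mean
    contributions = delta * (cov_inv @ delta)      # per-feature terms that sum to d^2
    return np.abs(contributions)


def select_top_k(X, y, k):
    """Return the sorted indices of the k highest-scoring features and all scores."""
    scores = mahalanobis_feature_scores(X, y)
    top = np.argsort(scores)[::-1][:k]
    return np.sort(top), scores


# Synthetic illustration: 6 features, of which only features 0 and 3 carry class signal.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 100)
X = rng.normal(size=(200, 6))
X[:, 0] += 3.0 * y
X[:, 3] += 3.0 * y

selected, scores = select_top_k(X, y, k=2)
print("selected feature indices:", selected)
```

On real data, the reduced matrix `X[:, selected]` would then be passed to a classifier such as KNN or Random Forest, and the selection should be computed on training folds only so that test labels do not leak into the feature ranking.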