Application of Open-Source, Low-Code Machine-Learning Library in Python to Diagnose Parkinson's Disease Using Voice Signal Features

Bibliographic Details
Main Authors: Daniel Hilário da Silva, Caio Tonus Ribeiro, Leandro Rodrigues da Silva Souza, Adriano Alves Pereira
Format: Article
Language: English
Published: Instituto de Tecnologia do Paraná (Tecpar) 2025-03-01
Series: Brazilian Archives of Biology and Technology
Subjects: machine learning; medical diagnosis; Parkinson’s Disease; PyCaret; voice signal
Online Access: http://www.scielo.br/scielo.php?script=sci_arttext&pid=S1516-89132025000100602&lng=en&tlng=en
author Daniel Hilário da Silva
Caio Tonus Ribeiro
Leandro Rodrigues da Silva Souza
Adriano Alves Pereira
collection DOAJ
description Abstract: Parkinson's disease (PD), the second most prevalent neurodegenerative disorder after Alzheimer's disease, affects approximately 10 million people worldwide. It is characterized by both motor and non-motor symptoms, and clinical assessment is pivotal for diagnosis. Vocal abnormalities can be identified in about 90% of PD patients in the early stages of the condition. Machine Learning (ML), a prominent subfield of Artificial Intelligence (AI), holds significant promise in the medical domain, particularly for early disease detection, which enables effective preventive measures and treatments. In this paper, we considered the unique characteristics of each ML algorithm. Seventeen ML algorithms were applied to a dataset of voice recordings from healthy control and PD individuals, sourced from a publicly available repository. We used the algorithms and functions of the PyCaret Python library, which this article introduces, to demonstrate their simplicity and effectiveness on real-world data. Of these algorithms, the Extra Trees Classifier (ETC), Gradient Boosting Classifier (GBC), and K Neighbors Classifier (KNN) performed best on this dataset. To further improve model performance, we employed several techniques, including the Synthetic Minority Over-sampling Technique (SMOTE) to address class imbalance, correlation-based feature selection, and hyperparameter tuning. Our findings highlight the potential of the PyCaret library as a valuable tool for applying ML to the classification of Parkinson's disease through voice analysis. Applying ML in this context can support clinical decision-making, leading to more informed and precise interventions.
format Article
id doaj-art-c1ac04454bb94e48abe3088e16b7f6f3
institution DOAJ
issn 1678-4324
language English
publishDate 2025-03-01
publisher Instituto de Tecnologia do Paraná (Tecpar)
record_format Article
series Brazilian Archives of Biology and Technology
spelling doaj-art-c1ac04454bb94e48abe3088e16b7f6f3
2025-08-20T03:17:13Z
eng
Instituto de Tecnologia do Paraná (Tecpar)
Brazilian Archives of Biology and Technology
1678-4324
2025-03-01
68
10.1590/1678-4324-2025230860
Application of Open-Source, Low-Code Machine-Learning Library in Python to Diagnose Parkinson's Disease Using Voice Signal Features
Daniel Hilário da Silva https://orcid.org/0000-0002-0800-065X
Caio Tonus Ribeiro https://orcid.org/0000-0003-0085-317X
Leandro Rodrigues da Silva Souza https://orcid.org/0000-0003-2477-6893
Adriano Alves Pereira https://orcid.org/0000-0002-1522-9989
http://www.scielo.br/scielo.php?script=sci_arttext&pid=S1516-89132025000100602&lng=en&tlng=en
machine learning
medical diagnosis
Parkinson’s Disease
PyCaret
voice signal
title Application of Open-Source, Low-Code Machine-Learning Library in Python to Diagnose Parkinson's Disease Using Voice Signal Features
topic machine learning
medical diagnosis
Parkinson’s Disease
PyCaret
voice signal
url http://www.scielo.br/scielo.php?script=sci_arttext&pid=S1516-89132025000100602&lng=en&tlng=en
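
The abstract in this record lists SMOTE among the techniques used to address class imbalance. As a rough illustration of the idea only, the Python sketch below generates synthetic minority-class points by interpolating between a sample and one of its nearest minority-class neighbours. It is not the authors' code and does not use PyCaret; the function name smote_oversample, the toy feature values, and the choice of the healthy group as the minority class are all assumptions made for this example.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority neighbours.

    Hypothetical helper written for illustration; not a library API.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than samples
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority-class neighbours of the base point (itself excluded)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Made-up toy data: two invented voice features per recording.
healthy = [[0.20, 0.10], [0.25, 0.12], [0.22, 0.15]]
parkinson = [[0.60, 0.40], [0.65, 0.45], [0.70, 0.42], [0.62, 0.48],
             [0.68, 0.50], [0.66, 0.38], [0.71, 0.47]]

# Add enough synthetic healthy samples to match the larger class.
extra = smote_oversample(healthy, n_new=len(parkinson) - len(healthy), k=2)
balanced_healthy = healthy + extra
print(len(balanced_healthy), len(parkinson))  # prints "7 7"
```

With three minority samples and seven majority samples, the sketch adds four synthetic points, so both classes end with seven rows. Each synthetic point lies on a line segment between two real minority samples, which is why the technique only makes sense on numeric features. For real work, a maintained implementation is preferable to hand-rolled code like this.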