Enhanced early detection of dysarthric speech disabilities using stacking ensemble deep learning model

Communication disorders, particularly dysarthria, significantly impact individuals by impairing their speech clarity, social interactions, and overall well-being. Early and accurate detection is crucial to enable timely intervention and improve speech therapy outcomes. This study introduces Adaptive...

Bibliographic Details
Main Authors: Jagat Chaitanya Prabhala, Ravi Ragoju, Venkatanareshbabu Kuppili, Christophe Chesneau
Format: Article
Language: English
Published: Elsevier 2025-09-01
Series: Machine Learning with Applications
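Subjects: Dysarthric speech detection; Deep learning; Speech disorders; Automatic speech recognition (ASR); Feature extraction; Speech enhancement techniques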
Online Access: http://www.sciencedirect.com/science/article/pii/S2666827025001045
collection DOAJ
description Communication disorders, particularly dysarthria, significantly impact individuals by impairing their speech clarity, social interactions, and overall well-being. Early and accurate detection is crucial to enable timely intervention and improve speech therapy outcomes. This study introduces Adaptive Dysarthric Speech Disability Detection using Stacked Ensemble Deep Learning (ADSDD-SEDL), an ensemble-based deep learning framework for dysarthria detection. The proposed model integrates three deep learning architectures, namely Multi-Head Attention-based Long Short-Term Memory (MHALSTM), Deep Belief Network (DBN), and Time-Delay Neural Network (TDNN), within a stacked ensemble. Unlike conventional stacking methods that use fixed meta-classifiers, this study employs a Genetic Algorithm (GA)-based optimization strategy to determine the weight contributions of the base models dynamically, enhancing classification robustness and adaptability. The preprocessing pipeline converts speech signals from the time domain to the frequency domain using a Short-Time Fourier Transform (STFT), and Mel-Frequency Cepstral Coefficients (MFCCs) are extracted to capture the key spectral characteristics. Each base model is trained independently, and the GA optimizes the ensemble by evolving an adaptive weight distribution instead of relying on predefined fusion methods. Extensive simulations and hyperparameter tuning confirmed that the GA-optimized ADSDD-SEDL technique significantly improved detection efficiency over traditional ensemble approaches. These findings underscore the advantages of evolutionary optimization in refining speech disorder classification models. This scalable and adaptive model offers a valuable tool for healthcare professionals, enabling precise and automated early diagnosis of dysarthria. Future research could explore alternative evolutionary algorithms, reinforcement learning techniques, and hybrid deep learning approaches to enhance speech disorder classification.
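The GA-based fusion step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the record gives no details of the GA, so the operators used here (truncation selection with elitism, blend crossover, Gaussian mutation), the accuracy-based fitness, all function names, and the synthetic stand-in scores for the three base models are assumptions made purely for illustration.

```python
import numpy as np


def fuse(weights, probs):
    """Weighted average of base-model probabilities.

    probs has shape (n_models, n_samples); weights has shape (n_models,).
    """
    return weights @ probs


def fitness(weights, probs, labels):
    """Accuracy of the fused prediction at a 0.5 threshold (assumed fitness)."""
    preds = (fuse(weights, probs) >= 0.5).astype(int)
    return float(np.mean(preds == labels))


def normalize(w):
    """Project weight vectors back to non-negative values that sum to 1."""
    w = np.clip(w, 1e-9, None)
    return w / w.sum(axis=-1, keepdims=True)


def ga_optimize_weights(probs, labels, pop_size=30, generations=40,
                        mutation_scale=0.1, seed=0):
    """Evolve ensemble weights with a simple, hypothetical GA."""
    rng = np.random.default_rng(seed)
    n_models = probs.shape[0]
    # Random initial population on the simplex, seeded with uniform weights
    # so that plain averaging is always one of the candidates.
    pop = rng.dirichlet(np.ones(n_models), size=pop_size)
    pop[0] = np.full(n_models, 1.0 / n_models)
    for _ in range(generations):
        scores = np.array([fitness(w, probs, labels) for w in pop])
        order = np.argsort(scores)[::-1]
        # Truncation selection: the best half survives unchanged (elitism).
        parents = pop[order[: pop_size // 2]]
        n_children = pop_size - len(parents)
        idx_a = rng.integers(0, len(parents), size=n_children)
        idx_b = rng.integers(0, len(parents), size=n_children)
        # Blend crossover followed by Gaussian mutation.
        alpha = rng.random((n_children, 1))
        children = alpha * parents[idx_a] + (1.0 - alpha) * parents[idx_b]
        children += rng.normal(0.0, mutation_scale, size=children.shape)
        pop = np.vstack([parents, normalize(children)])
    scores = np.array([fitness(w, probs, labels) for w in pop])
    best = int(np.argmax(scores))
    return pop[best], float(scores[best])


# Synthetic validation data: three noisy "base models" of differing quality.
# These stand in for MHALSTM, DBN, and TDNN outputs and are not real results.
demo_rng = np.random.default_rng(1)
labels = demo_rng.integers(0, 2, size=200)
noise = np.array([0.2, 0.35, 0.5])[:, None]
probs = np.clip(labels + demo_rng.normal(0.0, 1.0, (3, 200)) * noise, 0.0, 1.0)

best_w, best_acc = ga_optimize_weights(probs, labels)
print("weights:", np.round(best_w, 3), "validation accuracy:", best_acc)
```

Because the initial population includes uniform weights and the best half of each generation is carried over unchanged, the returned weights are never worse than simple averaging on the data used for fitness. In practice the fitness would be computed on a held-out validation split rather than on synthetic scores.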
format Article
id doaj-art-ee81e2e9d58949e9b266b4603b92b670
institution DOAJ
issn 2666-8270
language English
publishDate 2025-09-01
publisher Elsevier
record_format Article
series Machine Learning with Applications
spelling doaj-art-ee81e2e9d58949e9b266b4603b92b670 (timestamp 2025-08-20T03:03:55Z)
Language: eng
Publisher: Elsevier
Journal: Machine Learning with Applications, ISSN 2666-8270
Published: 2025-09-01, volume 21, article 100721
DOI: 10.1016/j.mlwa.2025.100721
Authors and affiliations:
Jagat Chaitanya Prabhala, Applied Sciences, National Institute of Technology, Goa, India
Ravi Ragoju, Applied Sciences, National Institute of Technology, Goa, India
Venkatanareshbabu Kuppili, Computer Science and Engineering, National Institute of Technology, Goa, India
Christophe Chesneau, LMNO, University of Caen-Normandy, Caen, France (corresponding author)
title Enhanced early detection of dysarthric speech disabilities using stacking ensemble deep learning model
topic Dysarthric speech detection
Deep learning
Speech disorders
Automatic speech recognition (ASR)
Feature extraction
Speech enhancement techniques