Synchronous Analysis of Speech Production and Lips Movement to Detect Parkinson’s Disease Using Deep Learning Methods
Background/Objectives: Parkinson’s disease (PD) affects more than 6 million people worldwide. Its accurate diagnosis and monitoring are key factors to reduce its economic burden. Typical approaches consider either speech signals or video recordings of the face to automatically model abnormal pattern...
Main Authors: | Cristian David Ríos-Urrego, Daniel Escobar-Grisales, Juan Rafael Orozco-Arroyave |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2024-12-01 |
Series: | Diagnostics |
Subjects: | Parkinson’s disease; speech analysis; lip movement analysis; fusion methods; attention mechanisms |
Online Access: | https://www.mdpi.com/2075-4418/15/1/73 |
_version_ | 1841549306728808448 |
---|---|
author | Cristian David Ríos-Urrego; Daniel Escobar-Grisales; Juan Rafael Orozco-Arroyave |
author_facet | Cristian David Ríos-Urrego; Daniel Escobar-Grisales; Juan Rafael Orozco-Arroyave |
author_sort | Cristian David Ríos-Urrego |
collection | DOAJ |
description | Background/Objectives: Parkinson’s disease (PD) affects more than 6 million people worldwide. Its accurate diagnosis and monitoring are key factors to reduce its economic burden. Typical approaches consider either speech signals or video recordings of the face to automatically model abnormal patterns in PD patients. Methods: This paper introduces, for the first time, a new methodology that performs the synchronous fusion of information extracted from speech recordings and their corresponding videos of lip movement, namely the bimodal approach. Results: Our results indicate that the introduced method is more accurate and suitable than unimodal approaches or classical asynchronous approaches that combine both sources of information but do not incorporate the underlying temporal information. Conclusions: This study demonstrates that using a synchronous fusion strategy with concatenated projections based on attention mechanisms, i.e., speech-to-lips and lips-to-speech, exceeds previous results reported in the literature. Complementary information between lip movement and speech production is confirmed when advanced fusion strategies are employed. Finally, multimodal approaches, combining visual and speech signals, showed great potential to improve PD classification, generating more confident and robust models for clinical diagnostic support. |
format | Article |
id | doaj-art-d885bed5e6614902b4c91938127c2952 |
institution | Kabale University |
issn | 2075-4418 |
language | English |
publishDate | 2024-12-01 |
publisher | MDPI AG |
record_format | Article |
series | Diagnostics |
spelling | doaj-art-d885bed5e6614902b4c91938127c2952; 2025-01-10T13:16:38Z; eng; MDPI AG; Diagnostics; 2075-4418; 2024-12-01; 15; 1; 73; 10.3390/diagnostics15010073; Cristian David Ríos-Urrego, Daniel Escobar-Grisales, Juan Rafael Orozco-Arroyave (all: GITA Lab., Faculty of Engineering, University of Antioquia, Medellín 050010, Colombia); https://www.mdpi.com/2075-4418/15/1/73; Parkinson’s disease; speech analysis; lip movement analysis; fusion methods; attention mechanisms |
title | Synchronous Analysis of Speech Production and Lips Movement to Detect Parkinson’s Disease Using Deep Learning Methods |
title_sort | synchronous analysis of speech production and lips movement to detect parkinson s disease using deep learning methods |
topic | Parkinson’s disease; speech analysis; lip movement analysis; fusion methods; attention mechanisms |
url | https://www.mdpi.com/2075-4418/15/1/73 |