Synchronous Analysis of Speech Production and Lips Movement to Detect Parkinson’s Disease Using Deep Learning Methods

Background/Objectives: Parkinson’s disease (PD) affects more than 6 million people worldwide. Its accurate diagnosis and monitoring are key factors to reduce its economic burden. Typical approaches consider either speech signals or video recordings of the face to automatically model abnormal pattern...

Bibliographic Details
Main Authors: Cristian David Ríos-Urrego, Daniel Escobar-Grisales, Juan Rafael Orozco-Arroyave
Format: Article
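Language: English
Published: MDPI AG 2024-12-01
Series: Diagnostics
Subjects: Parkinson’s disease; speech analysis; lip movement analysis; fusion methods; attention mechanisms
Online Access: https://www.mdpi.com/2075-4418/15/1/73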
author Cristian David Ríos-Urrego
Daniel Escobar-Grisales
Juan Rafael Orozco-Arroyave
collection DOAJ
description Background/Objectives: Parkinson’s disease (PD) affects more than 6 million people worldwide. Its accurate diagnosis and monitoring are key factors to reduce its economic burden. Typical approaches consider either speech signals or video recordings of the face to automatically model abnormal patterns in PD patients. Methods: This paper introduces, for the first time, a new methodology that performs the synchronous fusion of information extracted from speech recordings and their corresponding videos of lip movement, namely the bimodal approach. Results: Our results indicate that the introduced method is more accurate and suitable than unimodal approaches or classical asynchronous approaches that combine both sources of information but do not incorporate the underlying temporal information. Conclusions: This study demonstrates that using a synchronous fusion strategy with concatenated projections based on attention mechanisms, i.e., speech-to-lips and lips-to-speech, exceeds previous results reported in the literature. Complementary information between lip movement and speech production is confirmed when advanced fusion strategies are employed. Finally, multimodal approaches, combining visual and speech signals, showed great potential to improve PD classification, generating more confident and robust models for clinical diagnostic support.
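The abstract names a synchronous fusion strategy built from two attention projections, speech-to-lips and lips-to-speech, whose outputs are concatenated. The record gives no architectural details, so the following is only a minimal sketch of that general idea and not the authors' model. It assumes both modalities have already been framed on a common time axis, uses plain scaled dot-product cross-attention, and mean-pools each attended sequence before concatenation. All function names, dimensions, and the random weights are hypothetical.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(query_seq, context_seq, w_q, w_k, w_v):
    """Scaled dot-product attention: frames of query_seq attend to context_seq."""
    q = matmul(query_seq, w_q)    # (T_query, d_model)
    k = matmul(context_seq, w_k)  # (T_context, d_model)
    v = matmul(context_seq, w_v)  # (T_context, d_model)
    scale = math.sqrt(len(q[0]))
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)       # (T_query, T_context)
    weights = [softmax([s / scale for s in row]) for row in scores]
    return matmul(weights, v), weights


def mean_pool(seq):
    """Average a sequence of frame vectors over time."""
    return [sum(col) / len(seq) for col in zip(*seq)]


def random_matrix(rows, cols, rng):
    """Hypothetical stand-in for learned projection weights."""
    return [[rng.gauss(0.0, 1.0) / math.sqrt(rows) for _ in range(cols)]
            for _ in range(rows)]


def init_params(d_speech, d_lips, d_model, rng):
    """One (W_q, W_k, W_v) triple per attention direction."""
    return {
        "speech_to_lips": (random_matrix(d_speech, d_model, rng),
                           random_matrix(d_lips, d_model, rng),
                           random_matrix(d_lips, d_model, rng)),
        "lips_to_speech": (random_matrix(d_lips, d_model, rng),
                           random_matrix(d_speech, d_model, rng),
                           random_matrix(d_speech, d_model, rng)),
    }


def bimodal_fusion(speech, lips, params):
    """Concatenate the pooled speech-to-lips and lips-to-speech projections."""
    s2l, _ = cross_attention(speech, lips, *params["speech_to_lips"])
    l2s, _ = cross_attention(lips, speech, *params["lips_to_speech"])
    return mean_pool(s2l) + mean_pool(l2s)


# Toy usage: 10 time-aligned frames, 6 speech features and 4 lip features per frame.
rng = random.Random(0)
speech = [[rng.gauss(0.0, 1.0) for _ in range(6)] for _ in range(10)]
lips = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(10)]
params = init_params(d_speech=6, d_lips=4, d_model=8, rng=rng)
fused = bimodal_fusion(speech, lips, params)  # vector of length 2 * d_model
```

In a trained system the fused vector would feed a classifier that separates PD patients from controls. Here the weights are random, so the output only illustrates the data flow and shapes.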
format Article
id doaj-art-d885bed5e6614902b4c91938127c2952
institution Kabale University
issn 2075-4418
language English
publishDate 2024-12-01
publisher MDPI AG
record_format Article
series Diagnostics
spelling doaj-art-d885bed5e6614902b4c91938127c2952
2025-01-10T13:16:38Z
eng
MDPI AG
Diagnostics
2075-4418
2024-12-01
Volume 15, Issue 1, Article 73
DOI: 10.3390/diagnostics15010073
Synchronous Analysis of Speech Production and Lips Movement to Detect Parkinson’s Disease Using Deep Learning Methods
Cristian David Ríos-Urrego (GITA Lab., Faculty of Engineering, University of Antioquia, Medellín 050010, Colombia)
Daniel Escobar-Grisales (GITA Lab., Faculty of Engineering, University of Antioquia, Medellín 050010, Colombia)
Juan Rafael Orozco-Arroyave (GITA Lab., Faculty of Engineering, University of Antioquia, Medellín 050010, Colombia)
https://www.mdpi.com/2075-4418/15/1/73
Parkinson’s disease; speech analysis; lip movement analysis; fusion methods; attention mechanisms
title Synchronous Analysis of Speech Production and Lips Movement to Detect Parkinson’s Disease Using Deep Learning Methods
topic Parkinson’s disease
speech analysis
lip movement analysis
fusion methods
attention mechanisms
url https://www.mdpi.com/2075-4418/15/1/73