Differentiability of voice disorders through explainable AI

Abstract The voice can be affected by various types of pathology. Phoniatric examination relies on acoustic analysis, which evaluates characteristic parameters extracted from the vocal signal. Computer-assisted decision-making systems can help specialists detect vocal pathologies usi...

Bibliographic Details
Main Author: Fatma Özcan
Format: Article
Language: English
Published: Nature Portfolio 2025-05-01
Series: Scientific Reports
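Subjects: Disordered voices; Explainable artificial intelligence (XAI); Mel spectrogram; OpenL3; Computer-aided diagnosis system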
Online Access: https://doi.org/10.1038/s41598-025-03444-3
author Fatma Özcan
collection DOAJ
description Abstract The voice can be affected by various types of pathology. Phoniatric examination relies on acoustic analysis, which evaluates characteristic parameters extracted from the vocal signal. Computer-assisted decision-making systems can help specialists detect vocal pathologies using only the patient’s voice. In this study, transfer learning techniques are used to perform the acoustic analysis. A fine-tuned OpenL3 model then predicts whether or not the signals contain a pathology by classifying them into 8 classes. A publicly available dataset is used with the categories hyperkinetic dysphonia, hypokinetic dysphonia, reflux laryngitis, vocal fold nodules, prolapse, glottic insufficiency and vocal fold paralysis, in addition to the healthy class. The accuracy obtained with OpenL3, using transfer learning, was 99.44%. In addition, explainable decision support systems (XDSS) provide an in-depth understanding of the decision-making process. Averaging all the Occlusion Sensitivity maps into a single image makes it possible to understand the spatio-temporal characteristics of the disordered voices used for classification. Thanks to explainability methods, a new term, differentiability, can be discussed to explain the black-box operation of deep networks. For rapid diagnosis and prevention, this work could provide more detail on disordered voices by enabling a promising explainable diagnosis.
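The averaging of Occlusion Sensitivity maps mentioned in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: the study uses a fine-tuned OpenL3 network on Mel spectrograms, whereas here a toy scoring function (`toy_score`, hypothetical) stands in for the class score, and the function names, patch size, stride and fill value are all assumptions chosen for illustration. The idea shown is only the general one: mask one patch of the input at a time, record how much the score drops, and average the resulting maps over several inputs.

```python
import numpy as np


def occlusion_sensitivity(image, score_fn, patch=4, stride=4, fill=0.0):
    """Map of score drops observed when each patch of `image` is masked."""
    h, w = image.shape
    base = score_fn(image)                      # score on the unmasked input
    heat = np.zeros((h, w), dtype=float)
    counts = np.zeros((h, w), dtype=float)
    for top in range(0, h - patch + 1, stride):
        for left in range(0, w - patch + 1, stride):
            occluded = image.copy()
            occluded[top:top + patch, left:left + patch] = fill
            drop = base - score_fn(occluded)    # large drop = important region
            heat[top:top + patch, left:left + patch] += drop
            counts[top:top + patch, left:left + patch] += 1.0
    # Average overlapping contributions; uncovered pixels stay at zero.
    return heat / np.maximum(counts, 1.0)


def mean_occlusion_map(images, score_fn, **kwargs):
    """Average the occlusion maps of several inputs into one image."""
    maps = [occlusion_sensitivity(img, score_fn, **kwargs) for img in images]
    return np.mean(maps, axis=0)


def toy_score(spec):
    """Hypothetical stand-in for a class score: mean energy in rows 8-11."""
    return float(spec[8:12, :].mean())


# Random arrays play the role of (frequency x time) spectrograms.
rng = np.random.default_rng(0)
specs = [rng.random((16, 16)) for _ in range(5)]
avg_map = mean_occlusion_map(specs, toy_score)
```

With this toy score, only the band of rows the score depends on receives non-zero sensitivity in `avg_map`. With a real classifier, `score_fn` would return the predicted probability of the class of interest.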
format Article
id doaj-art-2b1602c587f24922bc57b53cff54bc52
institution DOAJ
issn 2045-2322
language English
publishDate 2025-05-01
publisher Nature Portfolio
record_format Article
series Scientific Reports
spelling doaj-art-2b1602c587f24922bc57b53cff54bc52; 2025-08-20T03:22:09Z; eng; Nature Portfolio; Scientific Reports; 2045-2322; 2025-05-01; 15; 1; 1; 11; 10.1038/s41598-025-03444-3; Differentiability of voice disorders through explainable AI; Fatma Özcan; Biophysics Department in Faculty of Medicine, Kahramanmaras Sutcu Imam University; https://doi.org/10.1038/s41598-025-03444-3; Disordered voices; Explainable artificial intelligence (XAI); Mel spectrogram; OpenL3; Computer-aided diagnosis system
title Differentiability of voice disorders through explainable AI
topic Disordered voices
Explainable artificial intelligence (XAI)
Mel spectrogram
OpenL3
Computer-aided diagnosis system
url https://doi.org/10.1038/s41598-025-03444-3