A novel fusion architecture for detecting Parkinson’s Disease using semi-supervised speech embeddings
| Main Authors: | , , , , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Nature Portfolio, 2025-06-01 |
| Series: | npj Parkinson's Disease |
| Online Access: | https://doi.org/10.1038/s41531-025-00956-7 |
| Summary: | We introduce a framework for screening Parkinson’s disease (PD) using English pangram utterances. Our dataset includes 1306 participants (392 with PD) from both home and clinical settings, covering diverse demographics (53.2% female). We used deep learning embeddings from Wav2Vec 2.0, WavLM, and ImageBind to capture speech dynamics indicative of PD. Our novel fusion model for PD classification aligns different speech embeddings into a cohesive feature space, outperforming baseline alternatives. In a stratified randomized split, the model achieved an AUROC of 88.9% and an accuracy of 85.7%. Statistical bias analysis showed equitable performance across sex, ethnicity, and age subgroups, with robustness across various disease durations and PD stages. Detailed error analysis revealed higher misclassification rates in specific age ranges for males and females, aligning with clinical insights. External testing yielded AUROCs of 82.1% and 78.4% on two clinical datasets, and an AUROC of 77.4% on an unseen general spontaneous English speech dataset, demonstrating versatility in natural speech analysis and potential for global accessibility and health equity. |
| ISSN: | 2373-8057 |