From Narratives to Diagnosis: A Machine Learning Framework for Classifying Sleep Disorders in Aging Populations: The <i>sleepCare</i> Platform
<b>Background/Objectives</b>: Sleep disorders are prevalent among aging populations and are often linked to cognitive decline, chronic conditions, and reduced quality of life. Traditional diagnostic methods, such as polysomnography, are resource-intensive and limited in accessibility. Me...
| Main Author: | Christos A. Frantzidis |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-06-01 |
| Series: | Brain Sciences |
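| Subjects: | aging; BERT; machine learning; narrative analysis; natural language processing (NLP); sleep disorders |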
| Online Access: | https://www.mdpi.com/2076-3425/15/7/667 |
| _version_ | 1849733093852708864 |
|---|---|
| author | Christos A. Frantzidis |
| collection | DOAJ |
| description | <b>Background/Objectives</b>: Sleep disorders are prevalent among aging populations and are often linked to cognitive decline, chronic conditions, and reduced quality of life. Traditional diagnostic methods, such as polysomnography, are resource-intensive and limited in accessibility. Meanwhile, individuals frequently describe their sleep experiences through unstructured narratives in clinical notes, online forums, and telehealth platforms. This study proposes a machine learning pipeline (<i><b>sleepCare</b></i>) that classifies sleep-related narratives into clinically meaningful categories, including stress-related, neurodegenerative, and breathing-related disorders. The proposed framework employs natural language processing (NLP) and machine learning techniques to support remote applications and real-time patient monitoring, offering a scalable solution for the early identification of sleep disturbances. <b>Methods</b>: The <i><b>sleepCare</b></i> framework consists of a three-tiered classification pipeline to analyze narrative sleep reports. First, a baseline model used a Multinomial Naïve Bayes classifier with n-gram features from a Bag-of-Words representation. Next, a Support Vector Machine (SVM) was trained on GloVe-based word embeddings to capture semantic context. Finally, a transformer-based model (BERT) was fine-tuned to extract contextual embeddings, using the [CLS] token as input for SVM classification. Each model was evaluated using stratified train-test splits and 10-fold cross-validation. Hyperparameter tuning via GridSearchCV optimized performance. The dataset contained 475 labeled sleep narratives, classified into five etiological categories relevant for clinical interpretation. <b>Results</b>: The transformer-based model utilizing BERT embeddings and an optimized Support Vector Machine classifier achieved an overall accuracy of <b>81%</b> on the test set. Class-wise F1-scores ranged from <b>0.72 to 0.91</b>, with the highest performance observed in classifying <b>normal or improved sleep</b> (F1 = 0.91). The <b>macro average F1-score</b> was <b>0.78</b>, indicating balanced performance across all categories. GridSearchCV identified the optimal SVM parameters (C = 4, kernel = ‘rbf’, gamma = 0.01, degree = 2, class_weight = ‘balanced’). The confusion matrix revealed robust classification with limited misclassifications, particularly between overlapping symptom categories such as stress-related and neurodegenerative sleep disturbances. <b>Conclusions</b>: Unlike generic large language model applications, our approach emphasizes the <b>personalized identification of sleep symptomatology</b> through targeted classification of the narrative input. By integrating structured learning with contextual embeddings, the framework offers a <b>clinically meaningful</b>, scalable solution for early detection and differentiation of sleep disorders in diverse, real-world, and remote settings. |
| format | Article |
| id | doaj-art-6bd066c8b05649aea051591cf4efdb3d |
| institution | DOAJ |
| issn | 2076-3425 |
| language | English |
| publishDate | 2025-06-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Brain Sciences |
| spelling | doaj-art-6bd066c8b05649aea051591cf4efdb3d; 2025-08-20T03:08:07Z; eng; MDPI AG; Brain Sciences; ISSN 2076-3425; 2025-06-01; vol. 15, no. 7, art. 667; DOI 10.3390/brainsci15070667; Christos A. Frantzidis, School of Engineering and Physical Sciences, University of Lincoln, Lincoln LN6 7TS, UK; https://www.mdpi.com/2076-3425/15/7/667; keywords: aging, BERT, machine learning, narrative analysis, natural language processing (NLP), sleep disorders |
| title | From Narratives to Diagnosis: A Machine Learning Framework for Classifying Sleep Disorders in Aging Populations: The <i>sleepCare</i> Platform |
| title_sort | from narratives to diagnosis a machine learning framework for classifying sleep disorders in aging populations the i sleepcare i platform |
| topic | aging; BERT; machine learning; narrative analysis; natural language processing (NLP); sleep disorders |
| url | https://www.mdpi.com/2076-3425/15/7/667 |
| work_keys_str_mv | AT christosafrantzidis fromnarrativestodiagnosisamachinelearningframeworkforclassifyingsleepdisordersinagingpopulationstheisleepcareiplatform |
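
The first tier described in the abstract, a Multinomial Naïve Bayes baseline over n-gram Bag-of-Words features, can be sketched as follows. This is a minimal standard-library illustration and not the author's implementation: the whitespace tokenizer, the add-one smoothing, the three class labels, and the six toy narratives are all assumptions made for the example. The abstract's mention of GridSearchCV suggests the study itself relied on scikit-learn, which is not used here.

```python
import math
from collections import Counter, defaultdict


def ngrams(text, n_max=2):
    """Return all 1..n_max-grams of a lowercased, whitespace-split text."""
    words = text.lower().split()
    return [" ".join(words[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(words) - n + 1)]


class MultinomialNB:
    """Multinomial Naive Bayes with add-alpha (Laplace) smoothing."""

    def __init__(self, alpha=1.0, n_max=2):
        self.alpha = alpha      # smoothing strength (assumed value)
        self.n_max = n_max      # longest n-gram used as a feature

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.feature_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.feature_counts[label].update(ngrams(text, self.n_max))
        self.vocab = {f for c in self.feature_counts.values() for f in c}
        return self

    def predict(self, text):
        # Features never seen in training are ignored.
        feats = [f for f in ngrams(text, self.n_max) if f in self.vocab]
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.feature_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            # log prior + sum of smoothed log likelihoods
            score = math.log(n_docs / total_docs)
            score += sum(math.log((counts[f] + self.alpha) / denom)
                         for f in feats)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy narratives; the study's 475-narrative dataset is not shown here.
train = [
    ("work deadlines keep my mind racing at night", "stress"),
    ("i lie awake worrying about money and work", "stress"),
    ("my partner says i stop breathing and snore loudly", "breathing"),
    ("i wake up gasping for air and snore", "breathing"),
    ("i slept through the night and feel rested", "normal"),
    ("my sleep improved and i feel rested in the morning", "normal"),
]
texts, labels = zip(*train)
model = MultinomialNB().fit(texts, labels)
print(model.predict("i snore and wake up gasping"))
```

With real data, the same idea would normally be paired with a stratified train-test split and cross-validated hyperparameter search, as the abstract describes for the actual study.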