Attention-Based Transfer Learning for Efficient Obstructive Sleep Apnea (OSA) Classification on Snore Sound

Bibliographic Details
Main Authors: Apichada Sillaparaya, Yuttapong Jiraraksopakun, Kosin Chamnongthai, Apichai Bhatranand
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/11018873/
Description
Summary: Polysomnography (PSG) is currently the gold-standard technique for classifying sleep apnea disorders. Yet it is costly and requires an expert to score severity, making it impractical for self-screening and home use. Snore sound classification with Deep Learning (DL) is a promising approach and has gained increasing interest because snore sounds are related to abnormal breathing conditions in both the time and frequency domains. This study proposes an attention-based transfer learning model for non-invasive detection of obstructive sleep apnea (OSA) using audio signals. Mel-spectrogram and MFCC features were input into MobileNetV3-Large to extract deep features. A modified SENet was implemented to provide suitable channel attention for the extracted features. The attention-based features are then classified into normal and abnormal snore events. The study evaluates model performance using 10-fold cross-validation on sound data from 70 adult OSA patients in the open-source PSG-Audio dataset. Results show that using both Mel-spectrograms and MFCCs as input features significantly enhances classification performance compared to single-feature models. In addition, MobileNetV3-Large with the modified SENet significantly outperforms other combinations of pre-trained models and attention mechanisms. Specifically, the proposed model achieves an accuracy of 92.576 ± 0.910%, a sensitivity of 92.906 ± 1.928%, a specificity of 92.269 ± 2.740%, a precision of 92.173 ± 3.024%, and an F1-score of 92.486 ± 1.326% on the binary classification task. Its performance also shows a statistically significant improvement when benchmarked against existing OSA classification models. The proposed model demonstrates potential for portable device-based sleep apnea monitoring applications.
ISSN: 2169-3536
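
Illustrative note: the summary reports accuracy, sensitivity, specificity, precision, and F1-score as mean ± spread over 10 cross-validation folds. The sketch below shows how these five quantities are conventionally derived from a binary confusion matrix and then aggregated across folds. It is not the authors' code. The names `binary_metrics`, `BinaryMetrics`, and `summarize_folds` are hypothetical, the abnormal snore class is assumed to be the positive label, and the use of the sample standard deviation is an assumption, since the summary does not say which estimator was used.

```python
from dataclasses import dataclass
from statistics import mean, stdev


@dataclass
class BinaryMetrics:
    accuracy: float
    sensitivity: float
    specificity: float
    precision: float
    f1: float


def _ratio(numerator, denominator):
    # Return 0.0 instead of raising when a class is absent from a fold.
    return numerator / denominator if denominator else 0.0


def binary_metrics(y_true, y_pred, positive=1):
    """Derive the five reported metrics from paired true/predicted labels.

    `positive` is the label treated as the abnormal snore class (assumption).
    """
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1

    sensitivity = _ratio(tp, tp + fn)   # recall on abnormal events
    specificity = _ratio(tn, tn + fp)   # recall on normal events
    precision = _ratio(tp, tp + fp)
    return BinaryMetrics(
        accuracy=_ratio(tp + tn, tp + fp + tn + fn),
        sensitivity=sensitivity,
        specificity=specificity,
        precision=precision,
        f1=_ratio(2 * precision * sensitivity, precision + sensitivity),
    )


def summarize_folds(values):
    """Return (mean, sample standard deviation) of one metric across folds."""
    return mean(values), stdev(values)
```

For example, calling `binary_metrics` once per fold and passing the ten resulting accuracies to `summarize_folds` would produce a figure of the same form as the accuracy quoted in the summary.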