Sleep Staging Using Compressed Vision Transformer With Novel Two-Step Attention Weighted Sum

Bibliographic Details
Main Authors: Hyounggyu Kim, Moogyeong Kim, Wonzoo Chung
Format: Article
Language:English
Published: IEEE 2025-01-01
Series:IEEE Access
Subjects:
Online Access:https://ieeexplore.ieee.org/document/10966867/
Description
Summary:Automatic sleep staging is crucial for diagnosing sleep disorders; however, existing inter-epoch feature extraction schemes such as RNN-based networks or transformers often struggle with long sleep sequences due to overfitting. This study presents a novel automatic sleep staging method utilizing a pre-trained vision transformer with compression as a sequence encoder and a two-step attention mechanism to enhance sleep-stage classification performance. In contrast to existing transformer-based methods, the pre-trained transformer with compression can handle long sequences covering a sleep cycle, leveraging robust feature extraction capabilities with substantially fewer parameters. Furthermore, an epoch encoder based on a bidirectional temporal convolutional network with a multi-head two-step attention mechanism is proposed to improve the efficiency of epoch-level feature extraction. The performance of the proposed method is evaluated using three publicly available datasets: SleepEDF-20, SleepEDF-78, and SHHS. Numerical experiments show notable performance enhancement of the proposed scheme in comparison with state-of-the-art algorithms, particularly for small training datasets, which validates the resilience of the proposed method against overfitting. These results suggest that, with appropriate regularization, transformer-based models can effectively capture long-term contextual information across a complete sleep cycle.
ISSN:2169-3536
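
Note: The summary describes a multi-head two-step attention weighted sum used to pool epoch-level features before sequence encoding. As a rough illustration only, the sketch below shows one way such a two-step attention pooling could look in PyTorch; the module name, tensor shapes, and the exact formulation are assumptions for illustration, not details taken from the paper.

# Minimal sketch of a two-step attention weighted sum (illustrative assumptions:
# shapes, names, and formulation are not taken from the paper).
import torch
import torch.nn as nn
import torch.nn.functional as F

class TwoStepAttentionPooling(nn.Module):
    """Pools per-time-step epoch features into a single epoch vector.

    Step 1: each head scores the T time steps and forms a weighted sum.
    Step 2: a second attention weights the head summaries before combining.
    """

    def __init__(self, feat_dim: int, num_heads: int = 4):
        super().__init__()
        self.head_score = nn.Linear(feat_dim, num_heads)  # step-1 scores over time
        self.head_gate = nn.Linear(feat_dim, 1)           # step-2 scores over heads

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, T, feat_dim) features from an epoch encoder (e.g. a TCN)
        weights = F.softmax(self.head_score(x), dim=1)        # (B, T, H), attention over time
        head_sums = torch.einsum('bth,btd->bhd', weights, x)  # step 1: per-head weighted sums
        gate = F.softmax(self.head_gate(head_sums), dim=1)    # (B, H, 1), attention over heads
        return (gate * head_sums).sum(dim=1)                  # step 2: weighted sum of heads -> (B, D)

if __name__ == "__main__":
    pooled = TwoStepAttentionPooling(feat_dim=128, num_heads=4)(torch.randn(2, 30, 128))
    print(pooled.shape)  # torch.Size([2, 128])

In this reading, the first softmax distributes attention across time steps within an epoch and the second redistributes it across head summaries, so a single epoch vector is produced for the downstream sequence encoder.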