MB-MSTFNet: A Multi-Band Spatio-Temporal Attention Network for EEG Sensor-Based Emotion Recognition

Bibliographic Details
Main Authors: Cheng Fang, Sitong Liu, Bing Gao
Format: Article
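Language: English
Published: MDPI AG 2025-08-01
Series: Sensors
Subjects: electroencephalograph (EEG); convolutional neural network (CNN); bidirectional gated recurrent unit (BiGRU); multi-head attention (MHA); emotion signal recognition; Inception module
Online Access: https://www.mdpi.com/1424-8220/25/15/4819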
collection DOAJ
description Emotion analysis based on electroencephalogram (EEG) sensors is pivotal for human–machine interaction, yet it faces key challenges in spatio-temporal feature fusion and in integrating information across frequency bands and brain regions from multi-channel sensor-derived signals. This paper proposes MB-MSTFNet, a novel framework for EEG emotion recognition. The model constructs a 3D tensor to encode band–space–time correlations of sensor data, explicitly modeling frequency-domain dynamics and the spatial distribution of EEG sensors across brain regions. A multi-scale CNN-Inception module extracts hierarchical spatial features via diverse convolutional kernels and pooling operations, capturing localized sensor activations and global brain network interactions. Bi-directional GRUs (BiGRUs) model temporal dependencies in the sensor time series and capture long-range dynamic patterns. Multi-head self-attention highlights critical time windows and brain regions by assigning adaptive weights to relevant sensor channels, suppressing noise from non-contributory electrodes. Experiments on the DEAP dataset, which contains multi-channel EEG sensor recordings, show that MB-MSTFNet achieves 96.80 ± 0.92% valence accuracy and 98.02 ± 0.76% arousal accuracy on the binary classification tasks, and 92.85 ± 1.45% accuracy on four-class classification. Ablation studies validate that feature fusion, bidirectional temporal modeling, and multi-scale mechanisms significantly enhance performance by improving feature complementarity. This sensor-driven framework advances affective computing by integrating spatio-temporal dynamics and multi-band interactions of EEG sensor signals, enabling efficient real-time emotion recognition.
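The attention step named in the description can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes identity query/key/value projections (a trained model would use learned linear projections), splits the feature dimension evenly across heads, and uses made-up input values. The function names `attention_head` and `multi_head_self_attention` are hypothetical.

```python
import math
from typing import List

Matrix = List[List[float]]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_head(x: Matrix) -> tuple:
    """Scaled dot-product self-attention with identity Q/K/V (assumption).

    x has one row per time window. Returns (output, weights), where
    weights[i][j] is how strongly window i attends to window j.
    """
    scale = math.sqrt(len(x[0]))
    weights = [
        softmax([sum(a * b for a, b in zip(q, k)) / scale for k in x])
        for q in x
    ]
    # Each output row is a weighted average of all input rows.
    output = [
        [sum(w * v[d] for w, v in zip(row, x)) for d in range(len(x[0]))]
        for row in weights
    ]
    return output, weights


def multi_head_self_attention(x: Matrix, num_heads: int) -> Matrix:
    """Slice features into num_heads groups, attend per group, concatenate."""
    dim = len(x[0])
    if dim % num_heads != 0:
        raise ValueError("feature dimension must be divisible by num_heads")
    head_dim = dim // num_heads
    merged: Matrix = [[] for _ in x]
    for h in range(num_heads):
        head_in = [row[h * head_dim:(h + 1) * head_dim] for row in x]
        head_out, _ = attention_head(head_in)
        for t, row in enumerate(head_out):
            merged[t].extend(row)
    return merged


# Toy input: 3 time windows with 4 features each (values are made up).
windows = [
    [1.0, 0.0, 0.5, 0.2],
    [0.9, 0.1, 0.4, 0.3],
    [0.0, 1.0, 0.1, 0.9],
]
attended = multi_head_self_attention(windows, num_heads=2)
_, weights = attention_head(windows)
```

Each row of `weights` sums to 1, so a window that resembles the query receives a larger share of the weight while dissimilar windows are down-weighted; this is the sense in which attention can suppress less relevant inputs.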
id doaj-art-a614d5071a3449d3b8f04dce1ce0f439
institution Kabale University
issn 1424-8220
spelling doaj-art-a614d5071a3449d3b8f04dce1ce0f439; 2025-08-20T03:36:30Z; eng; MDPI AG; Sensors; ISSN 1424-8220; 2025-08-01; vol. 25, no. 15, art. 4819
doi 10.3390/s25154819
affiliation Cheng Fang: Key Laboratory of Civil Aviation Thermal Hazards Prevention and Emergency Response, Civil Aviation University of China, Tianjin 300300, China
affiliation Sitong Liu: College of Electronic Information and Automation, Civil Aviation University of China, Tianjin 300300, China
affiliation Bing Gao: Engineering Techniques Training Center, Civil Aviation University of China, Tianjin 300300, China
title MB-MSTFNet: A Multi-Band Spatio-Temporal Attention Network for EEG Sensor-Based Emotion Recognition
topic electroencephalograph (EEG)
convolutional neural network (CNN)
bidirectional gated recurrent unit (BiGRU)
multi-head attention (MHA)
emotion signal recognition
Inception module