Dual-stage gated segmented multimodal emotion recognition method

Multimodal emotion recognition has broad applications in mental health detection and affective computing. However, most existing methods rely on either global or local features, neglecting the joint modeling of both, which limits emotion recognition performance. To address this, a Transformer-based...

Full description

Bibliographic Details
Main Authors: MA Fei, LI Shuzhi, YANG Feixia, XU Guangxian
Format: Article
Language: zho
Published: POSTS&TELECOM PRESS Co., LTD 2025-06-01
Series:智能科学与技术学报
Subjects:
Online Access:http://www.cjist.com.cn/zh/article/doi/10.11959/j.issn.2096-6652.202514/
_version_ 1849409202413371392
author MA Fei
LI Shuzhi
YANG Feixia
XU Guangxian
author_facet MA Fei
LI Shuzhi
YANG Feixia
XU Guangxian
author_sort MA Fei
collection DOAJ
description Multimodal emotion recognition has broad applications in mental health detection and affective computing. However, most existing methods rely on either global or local features, neglecting the joint modeling of both, which limits emotion recognition performance. To address this, a Transformer-based dual-stage gated segmented multimodal emotion recognition method (DGM) was proposed. DGM adopts a segmented fusion architecture consisting of an interaction stage and a dual-stage gating stage. In the interaction stage, the OAGL fusion strategy was employed to model global-local cross-modal interactions, improving the efficiency of feature fusion. The dual-stage gating stage was designed to integrate local and global features, fully utilizing emotional information. Additionally, to resolve the misalignment of local temporal features across modalities, a scaled dot-product-based sequence alignment method was developed to enhance fusion accuracy. Experiments were conducted on three benchmark datasets (CMU-MOSI, CMU-MOSEI, and CH-SIMS), and the results demonstrate that DGM outperforms representative algorithms on multiple datasets, validating its ability to capture emotional details and its strong generalization capability.
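The abstract mentions two mechanisms that can be illustrated concretely: scaled dot-product attention used to align temporally mismatched modality sequences, and gated blending of local and global features. The following is a minimal NumPy sketch of both ideas under stated assumptions; it is not the authors' implementation, the OAGL strategy and the exact two-stage gate design are not specified here, and all function and variable names are illustrative.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax along the given axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_align(query_seq, key_seq):
    """Re-sample key_seq onto query_seq's time steps via scaled dot-product
    attention, so two modalities with different lengths become comparable.
    query_seq: (Tq, d), key_seq: (Tk, d); returns (Tq, d)."""
    d = query_seq.shape[-1]
    scores = query_seq @ key_seq.T / np.sqrt(d)  # (Tq, Tk) similarity scores
    weights = softmax(scores, axis=-1)           # each row sums to 1
    return weights @ key_seq                     # attention-weighted resampling

def gated_fuse(local_feat, global_feat, W, b):
    """One sigmoid gate blending local and global feature vectors
    (a single illustrative gate; the paper describes two gating stages).
    local_feat, global_feat: (d,); W: (2d, d); b: (d,)."""
    z = np.concatenate([local_feat, global_feat]) @ W + b
    gate = 1.0 / (1.0 + np.exp(-z))              # element-wise gate in (0, 1)
    return gate * local_feat + (1.0 - gate) * global_feat

# toy usage: align a 6-step "audio" stream to an 8-step "text" stream,
# then gate pooled local features against pooled global features
rng = np.random.default_rng(0)
text = rng.standard_normal((8, 16))
audio = rng.standard_normal((6, 16))
aligned_audio = scaled_dot_product_align(text, audio)   # shape (8, 16)

W = rng.standard_normal((32, 16)) * 0.1
b = np.zeros(16)
fused = gated_fuse(aligned_audio.mean(axis=0), text.mean(axis=0), W, b)  # shape (16,)
```

The alignment step mirrors standard attention: each query time step takes a convex combination of key time steps, which is what makes sequences of different lengths directly fusable.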
format Article
id doaj-art-57bc078e25b14db284efc62363e4864b
institution Kabale University
issn 2096-6652
language zho
publishDate 2025-06-01
publisher POSTS&TELECOM PRESS Co., LTD
record_format Article
series 智能科学与技术学报
spelling doaj-art-57bc078e25b14db284efc62363e4864b2025-08-20T03:35:34ZzhoPOSTS&TELECOM PRESS Co., LTD智能科学与技术学报2096-66522025-06-017257267117464470Dual-stage gated segmented multimodal emotion recognition methodMA FeiLI ShuzhiYANG FeixiaXU GuangxianMultimodal emotion recognition has broad applications in mental health detection and affective computing. However, most existing methods rely on either global or local features, neglecting the joint modeling of both, which limits emotion recognition performance. To address this, a Transformer-based dual-stage gated segmented multimodal emotion recognition method (DGM) was proposed. DGM adopts a segmented fusion architecture consisting of an interaction stage and a dual-stage gating stage. In the interaction stage, the OAGL fusion strategy was employed to model global-local cross-modal interactions, improving the efficiency of feature fusion. The dual-stage gating stage was designed to integrate local and global features, fully utilizing emotional information. Additionally, to resolve the misalignment of local temporal features across modalities, a scaled dot-product-based sequence alignment method was developed to enhance fusion accuracy. Experiments were conducted on three benchmark datasets (CMU-MOSI, CMU-MOSEI, and CH-SIMS), and the results demonstrate that DGM outperforms representative algorithms on multiple datasets, validating its ability to capture emotional details and its strong generalization capability.http://www.cjist.com.cn/zh/article/doi/10.11959/j.issn.2096-6652.202514/Multimodal Emotion RecognitionScaled Dot-Product AttentionDual-Stage Gated FusionTransformer
spellingShingle MA Fei
LI Shuzhi
YANG Feixia
XU Guangxian
Dual-stage gated segmented multimodal emotion recognition method
智能科学与技术学报
Multimodal Emotion Recognition
Scaled Dot-Product Attention
Dual-Stage Gated Fusion
Transformer
title Dual-stage gated segmented multimodal emotion recognition method
title_full Dual-stage gated segmented multimodal emotion recognition method
title_fullStr Dual-stage gated segmented multimodal emotion recognition method
title_full_unstemmed Dual-stage gated segmented multimodal emotion recognition method
title_short Dual-stage gated segmented multimodal emotion recognition method
title_sort dual stage gated segmented multimodal emotion recognition method
topic Multimodal Emotion Recognition
Scaled Dot-Product Attention
Dual-Stage Gated Fusion
Transformer
url http://www.cjist.com.cn/zh/article/doi/10.11959/j.issn.2096-6652.202514/
work_keys_str_mv AT mafei dualstagegatedsegmentedmultimodalemotionrecognitionmethod
AT lishuzhi dualstagegatedsegmentedmultimodalemotionrecognitionmethod
AT yangfeixia dualstagegatedsegmentedmultimodalemotionrecognitionmethod
AT xuguangxian dualstagegatedsegmentedmultimodalemotionrecognitionmethod