Dual-stage gated segmented multimodal emotion recognition method
Multimodal emotion recognition has broad applications in mental health detection and affective computing. However, most existing methods rely on either global or local features, neglecting the joint modeling of both, which limits emotion recognition performance. To address this, a Transformer-based dual-stage gated segmented multimodal emotion recognition method (DGM) was proposed. DGM adopts a segmented fusion architecture consisting of an interaction stage and a dual-stage gating stage. In the interaction stage, the OAGL fusion strategy was employed to model global-local cross-modal interactions, improving the efficiency of feature fusion. The dual-stage gating stage was designed to integrate local and global features, fully utilizing the emotional information. Additionally, to resolve the misalignment of local temporal features across modalities, a scaled dot-product-based sequence alignment method was developed to enhance fusion accuracy. Experiments were conducted on three benchmark datasets (CMU-MOSI, CMU-MOSEI, and CH-SIMS), and the results demonstrate that DGM outperforms representative algorithms across multiple datasets, validating its ability to capture emotional details and its strong generalization capability.
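The record gives no implementation details for the scaled dot-product-based sequence alignment it mentions, but the general technique is standard: attention weights re-sample one modality's frames onto another modality's timesteps. Below is a minimal PyTorch sketch of that idea; the function name, tensor shapes, and use of text as the alignment target are illustrative assumptions, not DGM's actual code.

```python
# A minimal sketch of scaled dot-product sequence alignment across modalities.
# Names, dimensions, and the text-as-anchor choice are assumptions for
# illustration; the paper's actual alignment module may differ.
import torch
import torch.nn.functional as F

def align_to_text(text, other, d_k=64):
    """Align an `other` modality (B, T_o, d_k) to text timesteps (B, T_t, d_k).

    A = softmax(Q K^T / sqrt(d_k)) gives, for each text timestep, a weighting
    over the other modality's frames; A @ other re-samples that modality so
    its sequence length matches the text sequence.
    """
    scores = text @ other.transpose(1, 2) / (d_k ** 0.5)  # (B, T_t, T_o)
    weights = F.softmax(scores, dim=-1)                   # rows sum to 1
    return weights @ other                                # (B, T_t, d_k)

# Toy usage: 8 audio frames re-sampled onto 5 text tokens.
text = torch.randn(2, 5, 64)
audio = torch.randn(2, 8, 64)
aligned_audio = align_to_text(text, audio)
print(aligned_audio.shape)  # torch.Size([2, 5, 64])
```

Once the local temporal features of each modality share a common length, element-wise fusion operations such as gating become well defined, which is presumably why alignment precedes fusion in the pipeline.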
| Main Authors: | MA Fei, LI Shuzhi, YANG Feixia, XU Guangxian |
|---|---|
| Format: | Article |
| Language: | zho |
| Published: | POSTS&TELECOM PRESS Co., LTD, 2025-06-01 |
| Series: | 智能科学与技术学报 |
| Subjects: | Multimodal Emotion Recognition; Scaled Dot-Product Attention; Dual-Stage Gated Fusion; Transformer |
| Online Access: | http://www.cjist.com.cn/zh/article/doi/10.11959/j.issn.2096-6652.202514/ |
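The abstract also describes a dual-stage gating stage that integrates local and global features. Purely to illustrate the general idea (the record does not give DGM's gating equations), a two-stage sigmoid-gated blend on aligned features might look like the following sketch; the module name, the use of sigmoid gates, and the second-stage refinement against the global view are all assumptions.

```python
# A speculative sketch of two-stage gated fusion of local and global features.
# Sigmoid gates and the residual-style second stage are illustrative choices,
# not the published DGM architecture.
import torch
import torch.nn as nn

class DualStageGate(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.gate1 = nn.Linear(2 * dim, dim)  # stage 1: weigh local vs. global
        self.gate2 = nn.Linear(2 * dim, dim)  # stage 2: refine fused vs. global

    def forward(self, local_feat, global_feat):
        # Stage 1: learned gate blends local detail with global context.
        g1 = torch.sigmoid(self.gate1(torch.cat([local_feat, global_feat], -1)))
        fused = g1 * local_feat + (1 - g1) * global_feat
        # Stage 2: a second gate re-weighs the blend against the global view,
        # letting the model recover global information suppressed in stage 1.
        g2 = torch.sigmoid(self.gate2(torch.cat([fused, global_feat], -1)))
        return g2 * fused + (1 - g2) * global_feat

# Toy usage on pooled utterance-level features.
gate = DualStageGate(dim=64)
out = gate(torch.randn(2, 64), torch.randn(2, 64))
print(out.shape)  # torch.Size([2, 64])
```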
| author | MA Fei; LI Shuzhi; YANG Feixia; XU Guangxian |
| collection | DOAJ |
| description | Multimodal emotion recognition has broad applications in mental health detection and affective computing. However, most existing methods rely on either global or local features, neglecting the joint modeling of both, which limits emotion recognition performance. To address this, a Transformer-based dual-stage gated segmented multimodal emotion recognition method (DGM) was proposed. DGM adopts a segmented fusion architecture consisting of an interaction stage and a dual-stage gating stage. In the interaction stage, the OAGL fusion strategy was employed to model global-local cross-modal interactions, improving the efficiency of feature fusion. The dual-stage gating stage was designed to integrate local and global features, fully utilizing the emotional information. Additionally, to resolve the misalignment of local temporal features across modalities, a scaled dot-product-based sequence alignment method was developed to enhance fusion accuracy. Experiments were conducted on three benchmark datasets (CMU-MOSI, CMU-MOSEI, and CH-SIMS), and the results demonstrate that DGM outperforms representative algorithms across multiple datasets, validating its ability to capture emotional details and its strong generalization capability. |
| format | Article |
| id | doaj-art-57bc078e25b14db284efc62363e4864b |
| institution | Kabale University |
| issn | 2096-6652 |
| language | zho |
| publishDate | 2025-06-01 |
| publisher | POSTS&TELECOM PRESS Co., LTD |
| record_format | Article |
| series | 智能科学与技术学报 |
| title | Dual-stage gated segmented multimodal emotion recognition method |
| topic | Multimodal Emotion Recognition; Scaled Dot-Product Attention; Dual-Stage Gated Fusion; Transformer |
| url | http://www.cjist.com.cn/zh/article/doi/10.11959/j.issn.2096-6652.202514/ |