Multimodal emotion recognition method in complex dynamic scenes
Multimodal emotion recognition technology leverages the power of deep learning to address advanced visual and emotional tasks. While generic deep networks can handle simple emotion recognition tasks, their generalization capability in complex and noisy environments, such as multi-scene outdoor settings, remains limited. To overcome these challenges, this paper proposes a novel multimodal emotion recognition framework. First, we develop a robust network architecture based on the T5-small model, designed for dynamic-static fusion in complex scenarios, effectively mitigating the impact of noise. Second, we introduce a dynamic-static cross fusion network (D-SCFN) to enhance the integration and extraction of dynamic and static information, embedding it seamlessly within the T5 framework. Finally, we design and evaluate three distinct multi-task analysis frameworks to explore dependencies among tasks. The experimental results demonstrate that our model significantly outperforms other existing models, showcasing exceptional stability and remarkable adaptability to complex and dynamic scenarios.
| Main Authors: | Long Liu, Qingquan Luo, Wenbo Zhang, Mengxuan Zhang, Bowen Zhai |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | KeAi Communications Co., Ltd., 2025-05-01 |
| Series: | Journal of Information and Intelligence |
| Subjects: | Multimodal sentiment recognition; Attention mechanisms; Contrastive learning; Multitask analysis |
| Online Access: | http://www.sciencedirect.com/science/article/pii/S2949715925000046 |
| ISSN: | 2949-7159 |
|---|---|
| DOI: | 10.1016/j.jiixd.2025.02.004 |
| Volume/Issue/Pages: | Vol. 3, No. 3, pp. 257–274 |
| Author affiliations: | Long Liu (corresponding author), Qingquan Luo, Wenbo Zhang, Bowen Zhai: School of Electronic Engineering, Xidian University, Xi'an 710071, China; Mengxuan Zhang: School of Artificial Intelligence, Xidian University, Xi'an 710071, China |