HGTFM: Hierarchical Gating-Driven Transformer Fusion Model for Robust Multimodal Sentiment Analysis
In multimodal sentiment analysis, a significant challenge lies in quantifying the contribution of each modality and achieving effective modality fusion. This paper presents a Hierarchical Gating-Driven Transformer Fusion Model (HGTFM), which effectively achieves multimodal data fusion through an adv...
| Main Authors: | Chengcheng Yang, Zhiyao Liang, Dashun Yan, Zeng Hu, Ting Wu |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE, 2025-01-01 |
| Series: | IEEE Access |
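| Subjects: | Multimodal sentiment analysis; hierarchical gated fusion module; modality interaction network; unimodal label generation module |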
| Online Access: | https://ieeexplore.ieee.org/document/10965686/ |
| _version_ | 1850191497329115136 |
|---|---|
| author | Chengcheng Yang; Zhiyao Liang; Dashun Yan; Zeng Hu; Ting Wu |
| author_sort | Chengcheng Yang |
| collection | DOAJ |
| description | In multimodal sentiment analysis, a significant challenge lies in quantifying the contribution of each modality and achieving effective modality fusion. This paper presents a Hierarchical Gating-Driven Transformer Fusion Model (HGTFM), which fuses multimodal data through a transformer architecture and a Hierarchical Gated Fusion Module (HGFM). The model first encodes audio and video features using LSTM and CNN, respectively, and employs BERT to encode text features, which enhances the model's ability to manage long-range temporal dependencies. Furthermore, the model introduces HGFM, which dynamically integrates multimodal features across different levels after they are processed by the modality interaction network (MIN), thereby adjusting the contribution weight of each modality. In addition, the model incorporates a Unimodal Label Generation Module (ULGM), which accounts for the differences between the modalities. This enables HGFM to identify modality contributions more accurately for samples with significant modal variations, making fuller use of the module's potential. Experimental results on several established benchmark datasets demonstrate that HGTFM achieves comparable performance improvements in multimodal sentiment analysis tasks. |
| format | Article |
| id | doaj-art-36704224ddfa48ceb7b6ae64c35febab |
| institution | OA Journals |
| issn | 2169-3536 |
| language | English |
| publishDate | 2025-01-01 |
| publisher | IEEE |
| record_format | Article |
| series | IEEE Access |
| spelling | doaj-art-36704224ddfa48ceb7b6ae64c35febab; 2025-08-20T02:14:54Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2025-01-01; vol. 13, pp. 74430-74445; DOI 10.1109/ACCESS.2025.3560641; document 10965686. HGTFM: Hierarchical Gating-Driven Transformer Fusion Model for Robust Multimodal Sentiment Analysis. Authors: Chengcheng Yang (https://orcid.org/0000-0003-4709-3024); Zhiyao Liang (https://orcid.org/0000-0001-7192-8541); Dashun Yan (https://orcid.org/0009-0002-7979-6592); Zeng Hu (https://orcid.org/0000-0002-9376-7408); Ting Wu (https://orcid.org/0000-0002-0417-5057). Affiliations (in source order): School of Computer Science and Engineering, Macau University of Science and Technology, Taipa, Macau (listed twice); School of Information Science and Technology, Zhongkai University of Agriculture and Engineering, Guangzhou, China (listed three times). https://ieeexplore.ieee.org/document/10965686/. Keywords: Multimodal sentiment analysis; hierarchical gated fusion module; modality interaction network; unimodal label generation module |
| title | HGTFM: Hierarchical Gating-Driven Transformer Fusion Model for Robust Multimodal Sentiment Analysis |
| topic | Multimodal sentiment analysis; hierarchical gated fusion module; modality interaction network; unimodal label generation module |
| url | https://ieeexplore.ieee.org/document/10965686/ |
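
The description above says that HGFM dynamically adjusts the contribution weight of each modality. The record does not give the module's equations, so the following is only a generic, single-level illustration of gated fusion and not the authors' HGFM: each modality vector receives a scalar score, a softmax turns the scores into contribution weights, and the fused vector is the weighted sum. The function names, the `gate_weights` parameter, and the toy values are all assumptions made for this sketch.

```python
import math
from typing import Dict, List, Tuple


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gated_fusion(
    features: Dict[str, List[float]],
    gate_weights: Dict[str, List[float]],
) -> Tuple[List[float], Dict[str, float]]:
    """Fuse equal-length modality vectors with softmax gate weights.

    Hypothetical illustration: each modality's score is the dot product of
    its feature vector with a gate vector (learned in a real model, fixed
    here). The softmax of the scores gives per-modality contributions.
    """
    names = sorted(features)
    scores = [
        sum(w * x for w, x in zip(gate_weights[name], features[name]))
        for name in names
    ]
    gates = softmax(scores)
    dim = len(features[names[0]])
    # Weighted sum of the modality vectors, one output dimension at a time.
    fused = [
        sum(g * features[name][i] for g, name in zip(gates, names))
        for i in range(dim)
    ]
    return fused, dict(zip(names, gates))


if __name__ == "__main__":
    # Toy 2-dimensional features standing in for encoder outputs.
    feats = {"text": [1.0, 0.0], "audio": [0.0, 1.0], "video": [1.0, 1.0]}
    gate_w = {"text": [2.0, 0.0], "audio": [0.0, 0.5], "video": [0.1, 0.1]}
    fused, contributions = gated_fusion(feats, gate_w)
    print("contributions:", contributions)
    print("fused:", fused)
```

A hierarchical variant would presumably apply such gates at more than one level of the network, but how HGTFM does this is not stated in this record.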