HGTFM: Hierarchical Gating-Driven Transformer Fusion Model for Robust Multimodal Sentiment Analysis

Bibliographic Details
Main Authors: Chengcheng Yang, Zhiyao Liang, Dashun Yan, Zeng Hu, Ting Wu
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
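Subjects: Multimodal sentiment analysis, hierarchical gated fusion module, modality interaction network, unimodal label generation module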
Online Access: https://ieeexplore.ieee.org/document/10965686/
_version_ 1850191497329115136
author Chengcheng Yang
Zhiyao Liang
Dashun Yan
Zeng Hu
Ting Wu
author_facet Chengcheng Yang
Zhiyao Liang
Dashun Yan
Zeng Hu
Ting Wu
author_sort Chengcheng Yang
collection DOAJ
description In multimodal sentiment analysis, a significant challenge lies in quantifying the contribution of each modality and achieving effective modality fusion. This paper presents a Hierarchical Gating-Driven Transformer Fusion Model (HGTFM), which achieves multimodal data fusion through an advanced transformer architecture and a Hierarchical Gated Fusion Module (HGFM). The model first encodes audio and video features using LSTM and CNN, respectively, while employing BERT to encode text features. This approach enhances the model's ability to manage long-range temporal dependencies. Furthermore, the model introduces HGFM, which dynamically integrates multimodal features across different levels after processing by the modality interaction network (MIN), thereby adjusting the contribution weight of each modality. In particular, the model incorporates a Unimodal Label Generation Module (ULGM), which accounts for the differences between the modalities. This enables HGFM to identify more accurately the modality contributions in samples with significant modal variations, making fuller use of the module's potential. Experimental results on several established benchmark datasets demonstrate that HGTFM achieves competitive performance in multimodal sentiment analysis tasks.
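The gating idea in the description, weighting each modality's contribution before fusion, can be illustrated with a minimal sketch. This is not the authors' HGFM: it is a single-level simplification that assumes each modality has already been encoded into a vector of the same size. The function names `softmax` and `gated_fusion` and the per-modality `gate_weights` parameter vectors are hypothetical and introduced only for illustration.

```python
import math
from typing import Dict, List, Tuple


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gated_fusion(
    features: Dict[str, List[float]],
    gate_weights: Dict[str, List[float]],
) -> Tuple[List[float], Dict[str, float]]:
    """Fuse same-sized modality vectors using softmax gate weights.

    Assumption (not from the paper): each modality gets one scalar gate
    score, the dot product of its feature vector with a gate parameter
    vector of the same length.
    """
    names = sorted(features)
    # One scalar gate score per modality.
    scores = [
        sum(f * w for f, w in zip(features[n], gate_weights[n]))
        for n in names
    ]
    # Normalize the scores so the contribution weights sum to 1.
    weights = softmax(scores)
    dim = len(features[names[0]])
    # Weighted sum of the modality vectors, dimension by dimension.
    fused = [
        sum(weights[i] * features[n][d] for i, n in enumerate(names))
        for d in range(dim)
    ]
    return fused, dict(zip(names, weights))
```

With all gate parameters set to zero, every modality receives the same weight and the fused vector is the plain average of the inputs. In a trained model the gate parameters would be learned, so the weights would differ from sample to sample.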
format Article
id doaj-art-36704224ddfa48ceb7b6ae64c35febab
institution OA Journals
issn 2169-3536
language English
publishDate 2025-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj-art-36704224ddfa48ceb7b6ae64c35febab
2025-08-20T02:14:54Z
eng
IEEE
IEEE Access
2169-3536
2025-01-01
13
74430
74445
10.1109/ACCESS.2025.3560641
10965686
Chengcheng Yang 0 https://orcid.org/0000-0003-4709-3024
Zhiyao Liang 1 https://orcid.org/0000-0001-7192-8541
Dashun Yan 2 https://orcid.org/0009-0002-7979-6592
Zeng Hu 3 https://orcid.org/0000-0002-9376-7408
Ting Wu 4 https://orcid.org/0000-0002-0417-5057
School of Computer Science and Engineering, Macau University of Science and Technology, Taipa, Macau
School of Computer Science and Engineering, Macau University of Science and Technology, Taipa, Macau
School of Information Science and Technology, Zhongkai University of Agriculture and Engineering, Guangzhou, China
School of Information Science and Technology, Zhongkai University of Agriculture and Engineering, Guangzhou, China
School of Information Science and Technology, Zhongkai University of Agriculture and Engineering, Guangzhou, China
spellingShingle Chengcheng Yang
Zhiyao Liang
Dashun Yan
Zeng Hu
Ting Wu
HGTFM: Hierarchical Gating-Driven Transformer Fusion Model for Robust Multimodal Sentiment Analysis
IEEE Access
Multimodal sentiment analysis
hierarchical gated fusion module
modality interaction network
unimodal label generation module
title HGTFM: Hierarchical Gating-Driven Transformer Fusion Model for Robust Multimodal Sentiment Analysis
title_sort hgtfm hierarchical gating driven transformer fusion model for robust multimodal sentiment analysis
topic Multimodal sentiment analysis
hierarchical gated fusion module
modality interaction network
unimodal label generation module
url https://ieeexplore.ieee.org/document/10965686/