An Image-Text Sentiment Analysis Method Using Multi-Channel Multi-Modal Joint Learning

Bibliographic Details
Main Authors: Lianting Gong, Xingzhou He, Jianzhong Yang
Format: Article
Language: English
Published: Taylor & Francis Group 2024-12-01
Series: Applied Artificial Intelligence
Online Access: https://www.tandfonline.com/doi/10.1080/08839514.2024.2371712
Description
Summary: Multimodal sentiment analysis integrates multiple modalities to analyze sentiment tendencies or emotional states. Existing methods face two challenges: redundancy within independent modal features and a lack of correlation analysis between modalities, which leads to insufficient fusion and degraded accuracy. To address these issues, this study proposes a multi-channel multimodal joint learning method for image-text sentiment analysis. First, a multi-channel feature extraction module comprehensively captures image and text features. Second, modality-wise interaction modules achieve effective interaction between multimodal features, eliminating redundant features through cross-modal cross-attention. Finally, to exploit the complementary role of contextual information in sentiment analysis, an adaptive multi-task fusion method merges single-modal context features with multimodal features, enhancing the reliability of sentiment predictions. Experimental results show that the proposed method achieves accuracies of 76.98% and 75.32% on the MVSA-Single and MVSA-Multiple datasets, with F1 scores of 76.23% and 75.29%, respectively, outperforming other state-of-the-art methods. This research offers new insights into multimodal feature fusion and improves the accuracy and practicality of sentiment analysis.
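The modality-wise interaction described above relies on cross-modal cross-attention, where features of one modality act as queries over the features of the other. The paper's actual architecture, projection layers, and dimensions are not given here; the following is only a minimal NumPy sketch of the general cross-attention mechanism, with all shapes and the absence of learned projections being illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_attention(query_feats, key_value_feats):
    """Scaled dot-product cross-attention: queries come from one
    modality, keys/values from the other, so each query position is
    re-expressed as a weighted mix of the other modality's features."""
    d_k = query_feats.shape[-1]
    scores = query_feats @ key_value_feats.T / np.sqrt(d_k)
    weights = softmax(scores, axis=-1)       # rows sum to 1
    return weights @ key_value_feats

# Toy example (hypothetical sizes): 4 text tokens, 6 image regions,
# both already embedded into a shared 8-dimensional space.
rng = np.random.default_rng(0)
text = rng.standard_normal((4, 8))
image = rng.standard_normal((6, 8))

text_attended = cross_modal_attention(text, image)   # text queries image
image_attended = cross_modal_attention(image, text)  # image queries text
print(text_attended.shape, image_attended.shape)     # (4, 8) (6, 8)
```

In the full method these attended features would pass through learned query/key/value projections and feed the adaptive multi-task fusion stage; this sketch only shows why cross-attention can suppress redundancy, since each output row keeps only the other modality's features that score highly against the query.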
ISSN: 0883-9514
1087-6545