User Comment-Guided Cross-Modal Attention for Interpretable Multimodal Fake News Detection
In order to address the pressing challenge posed by the proliferation of fake news in the digital age, we emphasize its profound and harmful impact on societal structures, including the misguidance of public opinion, the erosion of social trust, and the exacerbation of social polarization. Current f...
| Main Authors: | Zepu Yi, Chenxu Tang, Songfeng Lu |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-07-01 |
| Series: | Applied Sciences |
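| Subjects: | interpretable multimodal fusion; fake news detection; cross-modal attention; user comments; social media |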
| Online Access: | https://www.mdpi.com/2076-3417/15/14/7904 |
| _version_ | 1849714509134954496 |
|---|---|
| author | Zepu Yi; Chenxu Tang; Songfeng Lu |
| author_facet | Zepu Yi; Chenxu Tang; Songfeng Lu |
| author_sort | Zepu Yi |
| collection | DOAJ |
| description | In order to address the pressing challenge posed by the proliferation of fake news in the digital age, we emphasize its profound and harmful impact on societal structures, including the misguidance of public opinion, the erosion of social trust, and the exacerbation of social polarization. Current fake news detection methods are largely limited to superficial text analysis or basic text–image integration, which face significant limitations in accurately identifying deceptive information. To bridge this gap, we propose the UC-CMAF framework, which comprehensively integrates news text, images, and user comments through an adaptive co-attention fusion mechanism. The UC-CMAF workflow consists of four key subprocesses: multimodal feature extraction, cross-modal adaptive collaborative attention fusion of news text and images, cross-modal attention fusion of user comments with news text and images, and finally, input of fusion features into a fake news detector. Specifically, we introduce multi-head cross-modal attention heatmaps and comment importance visualizations to provide interpretability support for the model’s predictions, revealing key semantic areas and user perspectives that influence judgments. Through the cross-modal adaptive collaborative attention mechanism, UC-CMAF achieves deep semantic alignment between news text and images and uses social signals from user comments to build an enhanced credibility evaluation path, offering a new paradigm for interpretable fake information detection. Experimental results demonstrate that UC-CMAF consistently outperforms 15 baseline models across two benchmark datasets, achieving F1 Scores of 0.894 and 0.909. These results validate the effectiveness of its adaptive cross-modal attention mechanism and the incorporation of user comments in enhancing both detection accuracy and interpretability. |
| format | Article |
| id | doaj-art-dcb1c7d025784dd3a04067096a6d041b |
| institution | DOAJ |
| issn | 2076-3417 |
| language | English |
| publishDate | 2025-07-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Applied Sciences |
| spelling | doaj-art-dcb1c7d025784dd3a04067096a6d041b; 2025-08-20T03:13:41Z; eng; MDPI AG; Applied Sciences; 2076-3417; 2025-07-01; 15(14), 7904; 10.3390/app15147904; User Comment-Guided Cross-Modal Attention for Interpretable Multimodal Fake News Detection; Zepu Yi, Chenxu Tang, Songfeng Lu (each: School of Cyber Science and Engineering, Huazhong University of Science and Technology, Wuhan 430074, China); https://www.mdpi.com/2076-3417/15/14/7904; interpretable multimodal fusion; fake news detection; cross-modal attention; user comments; social media |
| title | User Comment-Guided Cross-Modal Attention for Interpretable Multimodal Fake News Detection |
| topic | interpretable multimodal fusion; fake news detection; cross-modal attention; user comments; social media |
| url | https://www.mdpi.com/2076-3417/15/14/7904 |
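
The description above outlines a four-step workflow: feature extraction, text-image attention fusion, comment fusion, and classification. The sketch below illustrates only the two fusion steps with plain single-head scaled dot-product attention. It is not the authors' UC-CMAF implementation: the random feature vectors, the dimension `DIM`, the helper names, the mean pooling, and the choice to let the pooled news representation query the comments are all assumptions made for illustration. The adaptive co-attention weighting, the multiple heads, and the learned projections mentioned in the abstract are omitted.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Single-head scaled dot-product attention of queries over keys.

    Returns (outputs, weights). weights[i][j] is how strongly query i
    attends to key j, the quantity an attention heatmap would display.
    """
    scale = math.sqrt(len(keys[0]))
    dim = len(values[0])
    outputs, weights = [], []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        w = softmax(scores)
        weights.append(w)
        outputs.append(
            [sum(w[j] * values[j][d] for j in range(len(values))) for d in range(dim)]
        )
    return outputs, weights


def mean_pool(vectors):
    """Average a list of equal-length vectors into one vector."""
    n = len(vectors)
    return [sum(v[d] for v in vectors) / n for d in range(len(vectors[0]))]


def random_features(rng, count, dim):
    """Stand-in for real encoder outputs (hypothetical placeholder data)."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(count)]


DIM = 8  # assumed embedding size, chosen only for the example
rng = random.Random(0)
text = random_features(rng, 5, DIM)      # 5 text-token embeddings
image = random_features(rng, 4, DIM)     # 4 image-region embeddings
comments = random_features(rng, 3, DIM)  # 3 user-comment embeddings

# Step 2 of the workflow: text and image attend to each other.
text_ctx, text_to_image = cross_attention(text, image, image)
image_ctx, image_to_text = cross_attention(image, text, text)
news_vec = mean_pool(text_ctx + image_ctx)

# Step 3: the pooled news vector queries the comments (assumed direction).
comment_ctx, comment_importance = cross_attention([news_vec], comments, comments)

# Concatenated vector that a downstream classifier would consume (step 4).
fused = news_vec + comment_ctx[0]
```

Here `text_to_image` is a 5 x 4 weight matrix and `comment_importance[0]` holds one weight per comment. These correspond loosely to the attention heatmaps and comment importance visualizations the abstract describes, under the assumptions stated above.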