News Image Captioning via Separate Attention on Entity Categories
| Main Authors: | , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE, 2025-01-01 |
| Series: | IEEE Access |
| Subjects: | |
| Online Access: | https://ieeexplore.ieee.org/document/11048780/ |
| Summary: | News image captioning generates descriptive, informative captions for news images by drawing on the context of the accompanying news article. The task aims to capture detailed information, including multiple types of named entities such as persons, organizations, locations, and events. However, identifying named entities from an image is challenging. To address this, we propose categorizing the key named entities into person entities and geoOrg entities and attending to each category separately, which ensures focused extraction of relevant information from the image. Even so, a single news image often falls short of providing comprehensive background details, which leaves useful context missing. The context missing from one image may, however, be present in another image of the same article. We therefore propose incorporating the nearby image as an additional input, since structural proximity implies contextual relevance. This multimodal cue facilitates the accumulation of contextualized features that capture contextually rich information from the article. Experimental results demonstrate the effectiveness of the proposed approach on the GoodNews, NYTimes800k, and DM800K datasets for news image captioning, with an improvement of 0.4 in BLEU-4 score over the state of the art on the DM800K dataset. |
| ISSN: | 2169-3536 |