BeliN: A novel corpus for Bengali religious news headline generation using contextual feature fusion

Bibliographic Details
Main Authors:	Md Osama, Ashim Dey, Kawsar Ahmed, Muhammad Ashad Kabir
Format:	Article
Language:	English
Published:	Elsevier 2025-06-01
Series:	Natural Language Processing Journal
Subjects:	Bengali; Headline generation; Religious; News article; Feature fusion; Aspect
Online Access:	http://www.sciencedirect.com/science/article/pii/S2949719125000147

Description
Summary:	Automatic text summarization, particularly headline generation, remains a critical yet under-explored area for Bengali religious news. Existing approaches to headline generation typically rely solely on the article content, overlooking contextual features such as sentiment, category, and aspect, which limits their performance. This study addresses the limitation by introducing BeliN (Bengali Religious News), a novel corpus of religious news articles from prominent Bangladeshi online newspapers, and MultiGen, a contextual multi-input feature fusion approach to headline generation. Leveraging transformer-based pre-trained language models such as BanglaT5, mBART, mT5, and mT0, MultiGen integrates additional contextual features (category, aspect, and sentiment) with the news content. This fusion enables the model to capture contextual information often overlooked by content-only methods. In experiments, MultiGen outperforms the baseline that uses only news content, achieving a BLEU score of 18.61 and a ROUGE-L score of 24.19, compared with baseline scores of 16.08 and 23.08, respectively. These findings underscore the importance of incorporating contextual features in headline generation for low-resource languages. By bridging linguistic and cultural gaps, this research advances natural language processing for Bengali and other under-represented languages. To promote reproducibility and further exploration, the dataset and implementation code are publicly accessible at https://github.com/akabircs/BeliN.
ISSN:	2949-7191
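
Illustrative note: the summary describes combining category, aspect, and sentiment with the news content before it reaches a sequence-to-sequence model. The record does not say how MultiGen performs this fusion, so the following is only a minimal sketch of one common way to do it, namely concatenating labelled fields into a single input string. The `Article` class, the `fuse_inputs` and `content_only` helpers, the separator, and the example field values are all hypothetical and are not taken from the paper or its repository.

```python
from dataclasses import dataclass


@dataclass
class Article:
    """One news item with its contextual labels (hypothetical schema)."""
    content: str
    category: str
    aspect: str
    sentiment: str


def fuse_inputs(article: Article, sep: str = " | ") -> str:
    """Build one model input by prefixing the body with labelled features.

    Assumption: simple text-level concatenation; the published method
    may fuse features differently.
    """
    fields = [
        ("category", article.category),
        ("aspect", article.aspect),
        ("sentiment", article.sentiment),
        ("content", article.content),
    ]
    return sep.join(f"{name}: {value.strip()}" for name, value in fields)


def content_only(article: Article) -> str:
    """Baseline-style input that uses the article body alone."""
    return f"content: {article.content.strip()}"


# Placeholder values for illustration only.
example = Article(
    content="Thousands gathered for prayers on the morning of the festival.",
    category="religion",
    aspect="festival",
    sentiment="positive",
)
print(fuse_inputs(example))
print(content_only(example))
```

Under this assumption, either string would then be tokenized and passed to a pre-trained model such as those named in the summary; comparing headlines generated from the two inputs mirrors the fused-versus-baseline comparison the summary reports.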
