Amharic Language Image Captions Generation Using Hybridized Attention-Based Deep Neural Networks

Bibliographic Details
Main Authors: Rodas Solomon, Mesfin Abebe
Format: Article
Language: English
Published: Wiley 2023-01-01
Series: Applied Computational Intelligence and Soft Computing
Online Access: http://dx.doi.org/10.1155/2023/9397325
author Rodas Solomon
Mesfin Abebe
collection DOAJ
description This study aims to develop a hybridized deep learning model for generating semantically meaningful image captions in the Amharic language. Image captioning is a task that combines the computer vision and natural language processing (NLP) domains. However, existing studies in English focus primarily on visual features when generating captions, leaving a gap between visual and textual features and an inadequate semantic representation. To address this challenge, this study proposes a hybridized attention-based deep neural network (DNN) model. The model consists of an Inception-v3 convolutional neural network (CNN) encoder to extract image features, a visual attention mechanism to capture the most significant features, and a bidirectional gated recurrent unit (Bi-GRU) decoder with attention to generate the image captions. The model was trained on the Flickr8k and BNATURE datasets, whose English captions were translated into Amharic with the help of Google Translate and Amharic language experts. Evaluation showed improved performance, with a 1G-BLEU score of 60.6, a 2G-BLEU score of 50.1, a 3G-BLEU score of 43.7, and a 4G-BLEU score of 38.8. Overall, this study highlights the effectiveness of the hybrid approach in generating Amharic image captions with better semantic meaning.
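The 1G- through 4G-BLEU scores reported in the abstract measure n-gram overlap between a generated caption and a reference caption. As an illustration only (not the authors' evaluation code), a minimal single-reference BLEU with clipped n-gram precision and a brevity penalty can be sketched in Python:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def modified_precision(candidate, reference, n):
    """Clipped n-gram precision of a candidate against one reference."""
    cand_counts = Counter(ngrams(candidate, n))
    ref_counts = Counter(ngrams(reference, n))
    clipped = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
    total = sum(cand_counts.values())
    return clipped / total if total else 0.0

def bleu(candidate, reference, max_n=4):
    """Cumulative BLEU-N: geometric mean of clipped n-gram
    precisions times a brevity penalty (single reference)."""
    precisions = [modified_precision(candidate, reference, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / max_n)
    # Penalize candidates shorter than the reference.
    brevity = min(1.0, math.exp(1 - len(reference) / len(candidate)))
    return brevity * geo_mean
```

In practice a library implementation such as NLTK's `sentence_bleu` (with smoothing and multiple references) would normally be used; this sketch only makes the n-gram computation behind the reported scores explicit.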
format Article
id doaj-art-ca6b903e0d644cde8cb73025f4cb5a44
institution OA Journals
issn 1687-9732
language English
publishDate 2023-01-01
publisher Wiley
record_format Article
series Applied Computational Intelligence and Soft Computing
spelling Rodas Solomon (Department of Computer Science and Engineering); Mesfin Abebe (Department of Computer Science and Engineering). Applied Computational Intelligence and Soft Computing, Wiley, 2023-01-01, ISSN 1687-9732, doi:10.1155/2023/9397325.
title Amharic Language Image Captions Generation Using Hybridized Attention-Based Deep Neural Networks
url http://dx.doi.org/10.1155/2023/9397325