Detailed Image Captioning and Hashtag Generation
This article presents CapFlow, an integrated approach to detailed image captioning and hashtag generation. Based on a thorough performance evaluation, the image captioning model utilizes a fine-tuned vision-language model with Low-Rank Adaptation (LoRA), while the hashtag generation employs the keyw...
| Main Authors: | Nikshep Shetty, Yongmin Li |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2024-11-01 |
| Series: | Future Internet |
| Subjects: | image captioning; hashtag generation; vision-language models; AI-assisted content analysis |
| Online Access: | https://www.mdpi.com/1999-5903/16/12/444 |
| _version_ | 1850050340705009664 |
|---|---|
| author | Nikshep Shetty; Yongmin Li |
| author_facet | Nikshep Shetty; Yongmin Li |
| author_sort | Nikshep Shetty |
| collection | DOAJ |
| description | This article presents CapFlow, an integrated approach to detailed image captioning and hashtag generation. Based on a thorough performance evaluation, the image captioning model is a vision-language model fine-tuned with Low-Rank Adaptation (LoRA), while hashtag generation uses keyword extraction. We evaluated state-of-the-art image captioning models using both traditional metrics (BLEU, METEOR, ROUGE-L, and CIDEr) and the specialized CAPTURE metric for detailed captions. The hashtag generation models were assessed using precision, recall, and F1-score. The proposed method achieves competitive results against larger models while remaining efficient enough for real-time applications. The image captioning model outperforms the base Florence-2 model and compares favorably with larger models. The KeyBERT implementation for hashtag generation surpasses the other keyword extraction methods in both accuracy and speed. This work contributes to the field of AI-assisted content analysis and generation, offering insights into the practical implementation of advanced vision-language models for detailed image understanding and relevant tag generation. |
| format | Article |
| id | doaj-art-5086accdf6444d72b200aa61c7ab0055 |
| institution | DOAJ |
| issn | 1999-5903 |
| language | English |
| publishDate | 2024-11-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Future Internet |
| spelling | doaj-art-5086accdf6444d72b200aa61c7ab0055; 2025-08-20T02:53:30Z; eng; MDPI AG; Future Internet; 1999-5903; 2024-11-01; 16; 12; 444; 10.3390/fi16120444; Detailed Image Captioning and Hashtag Generation; Nikshep Shetty, Department of Computer Science, Brunel University London, Uxbridge UB8 3PH, UK; Yongmin Li, Department of Computer Science, Brunel University London, Uxbridge UB8 3PH, UK; https://www.mdpi.com/1999-5903/16/12/444; image captioning; hashtag generation; vision-language models; AI-assisted content analysis |
| title | Detailed Image Captioning and Hashtag Generation |
| topic | image captioning; hashtag generation; vision-language models; AI-assisted content analysis |
| url | https://www.mdpi.com/1999-5903/16/12/444 |
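
The description above states that the hashtag generation models were assessed using precision, recall, and F1-score. This record does not say how the article matched predicted tags against reference tags, so the following is only a minimal sketch under stated assumptions: tags are compared as sets per image, and case and a leading `#` are ignored. The function name `hashtag_scores` and the sample tags are hypothetical and are not taken from the article.

```python
def hashtag_scores(predicted, reference):
    """Set-based precision, recall and F1 for one image's hashtags."""
    # Assumption: "#Sunset" and "sunset" count as the same tag.
    pred = {tag.lower().lstrip("#") for tag in predicted}
    ref = {tag.lower().lstrip("#") for tag in reference}

    # Tags that appear in both the prediction and the reference.
    true_positives = len(pred & ref)

    # Guard against empty sets to avoid division by zero.
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(ref) if ref else 0.0

    # F1 is the harmonic mean of precision and recall.
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical example data, for illustration only.
predicted = ["#beach", "#sunset", "#travel", "#food"]
reference = ["#Sunset", "#beach", "#ocean"]

precision, recall, f1 = hashtag_scores(predicted, reference)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
# prints: precision=0.50 recall=0.67 f1=0.57
```

Here two of the four predicted tags are correct (precision 0.50) and two of the three reference tags are recovered (recall 0.67). Averaging such per-image scores over a test set is one common way to report a single figure, though the article may aggregate differently.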