Focal Correlation and Event-Based Focal Visual Content Text Attention for Past Event Search
Every minute, vast amounts of video and image data are uploaded worldwide to the internet and social media platforms, creating a rich visual archive of human experiences—from weddings and family gatherings to significant historical events such as war crimes and humanitarian crises. When properly analyzed, this multimodal data holds immense potential for reconstructing important events and verifying information. However, challenges arise when images and videos lack complete annotations, making manual examination inefficient and time-consuming. To address this, we propose a novel event-based focal visual content text attention (EFVCTA) framework for automated past event retrieval using visual question answering (VQA) techniques. Our approach integrates a Long Short-Term Memory (LSTM) model with convolutional non-linearity and an adaptive attention mechanism to efficiently identify and retrieve relevant visual evidence alongside precise answers. The model is designed with robust weight initialization, regularization, and optimization strategies and is evaluated on the Common Objects in Context (COCO) dataset. The results demonstrate that EFVCTA achieves the highest performance across all metrics (88.7% accuracy, 86.5% F1-score, 84.9% mAP), outperforming state-of-the-art baselines. The EFVCTA framework demonstrates promising results for retrieving information about past events captured in images and videos and can be effectively applied to scenarios such as documenting training programs, workshops, conferences, and social gatherings in academic institutions.
| Main Authors: | Pranita P. Deshmukh, S. Poonkuntran |
|---|---|
| Affiliation: | School of Computing Science & Engineering, VIT Bhopal University, Bhopal-Indore Highway Kothrikalan, Sehore 466114, Madhya Pradesh, India (both authors) |
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-06-01 |
| Series: | Computers (Vol. 14, Iss. 7, Art. 255) |
| ISSN: | 2073-431X |
| DOI: | 10.3390/computers14070255 |
| Subjects: | convolutional layer non-linearity; event-based focal visual content text attention; focal correlation; long short-term memory; past event search; visual content text attention |
| Online Access: | https://www.mdpi.com/2073-431X/14/7/255 |