Focal Correlation and Event-Based Focal Visual Content Text Attention for Past Event Search

Bibliographic Details
Main Authors: Pranita P. Deshmukh, S. Poonkuntran
Author Affiliation: School of Computing Science & Engineering, VIT Bhopal University, Bhopal-Indore Highway Kothrikalan, Sehore 466114, Madhya Pradesh, India
Format: Article
Language: English
Published: MDPI AG, 2025-06-01
Series: Computers, Vol. 14, Issue 7, Article 255
ISSN: 2073-431X
DOI: 10.3390/computers14070255
Collection: DOAJ
Record ID: doaj-art-1b5cc95e26744fbfaadd968dce5aea68
Subjects: convolutional layer non-linearity; event-based focal visual content text attention; focal correlation; long short-term memory; past event search; visual content text attention
Online Access: https://www.mdpi.com/2073-431X/14/7/255
Description: Every minute, vast amounts of video and image data are uploaded worldwide to the internet and social media platforms, creating a rich visual archive of human experiences—from weddings and family gatherings to significant historical events such as war crimes and humanitarian crises. When properly analyzed, this multimodal data holds immense potential for reconstructing important events and verifying information. However, challenges arise when images and videos lack complete annotations, making manual examination inefficient and time-consuming. To address this, we propose a novel event-based focal visual content text attention (EFVCTA) framework for automated past event retrieval using visual question answering (VQA) techniques. Our approach integrates a Long Short-Term Memory (LSTM) model with convolutional non-linearity and an adaptive attention mechanism to efficiently identify and retrieve relevant visual evidence alongside precise answers. The model is designed with robust weight initialization, regularization, and optimization strategies and is evaluated on the Common Objects in Context (COCO) dataset. The results demonstrate that EFVCTA achieves the highest performance across all metrics (88.7% accuracy, 86.5% F1-score, 84.9% mAP), outperforming state-of-the-art baselines. The EFVCTA framework demonstrates promising results for retrieving information about past events captured in images and videos and can be effectively applied to scenarios such as documenting training programs, workshops, conferences, and social gatherings in academic institutions.
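Note: The description above outlines, at a high level, an LSTM question encoder with a convolutional non-linearity and an adaptive attention mechanism over visual content. The sketch below (PyTorch) is only a minimal illustration of that general VQA-style architecture, not the authors' EFVCTA implementation; the class name, layer sizes, attention form, and all hyperparameters are assumptions introduced here for illustration.

# Illustrative sketch only: NOT the authors' EFVCTA model. It assumes a generic
# VQA layout -- an LSTM question encoder preceded by a 1-D convolution with a
# tanh non-linearity (a stand-in for the abstract's "convolutional
# non-linearity"), plus question-guided soft attention over CNN image-region
# features. All names and dimensions are hypothetical.
import torch
import torch.nn as nn
import torch.nn.functional as F


class FocalVisualTextAttention(nn.Module):
    """Toy question-guided attention over image-region features (hypothetical)."""

    def __init__(self, vocab_size=10000, embed_dim=300, hidden_dim=512,
                 region_dim=2048, num_answers=1000):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # 1-D convolution over the question embeddings (assumed form of the
        # "convolutional non-linearity" mentioned in the abstract).
        self.conv = nn.Conv1d(embed_dim, embed_dim, kernel_size=3, padding=1)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.region_proj = nn.Linear(region_dim, hidden_dim)
        self.attn = nn.Linear(hidden_dim, 1)
        self.classifier = nn.Linear(hidden_dim * 2, num_answers)

    def forward(self, question_tokens, region_feats):
        # question_tokens: (B, T) token ids; region_feats: (B, R, region_dim)
        q = self.embed(question_tokens)                        # (B, T, E)
        q = torch.tanh(self.conv(q.transpose(1, 2))).transpose(1, 2)
        _, (h, _) = self.lstm(q)                               # h: (1, B, H)
        q_vec = h.squeeze(0)                                   # (B, H)
        v = torch.tanh(self.region_proj(region_feats))         # (B, R, H)
        scores = self.attn(v * q_vec.unsqueeze(1))             # (B, R, 1)
        weights = F.softmax(scores, dim=1)                     # attention over regions
        v_attended = (weights * v).sum(dim=1)                  # (B, H)
        return self.classifier(torch.cat([q_vec, v_attended], dim=-1))


# Usage with random tensors standing in for a COCO-style batch:
# 2 questions of 12 tokens, 36 image regions of dimension 2048.
model = FocalVisualTextAttention()
logits = model(torch.randint(0, 10000, (2, 12)), torch.randn(2, 36, 2048))
print(logits.shape)  # torch.Size([2, 1000])

The final lines simply run the sketch on random inputs to show the expected tensor shapes; any real use would require a tokenized question vocabulary, pre-extracted region features, and an answer vocabulary, none of which are specified in this record.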