RecompGPT: Generative Pre-Trained Transformers-Assisted Interactive Human Gaze Pattern Learning and Distribution Modeling for Scene Recomposition

We introduce a cutting-edge, GPT-assisted approach to optimizing visual scene retargeting by learning human gaze behaviors. RecompGPT uses a Generative Pre-trained Transformer (GPT) to model human gaze dynamics, providing an advanced mechanism for intelligent and interactive image reconstruction. The framewor...



Bibliographic Details
Main Authors:	Wang Shang, Nassiriah Binti Shaari, Nur Sauri Bin Yahaya, Liu Hao
Format:	Article
Language:	English
Published:	IEEE 2025-01-01
Series:	IEEE Access
Subjects:	RecompGPT; human gaze behavior; GSP; recomposition; observer-like
Online Access:	https://ieeexplore.ieee.org/document/11078273/

collection	DOAJ
description	We introduce a cutting-edge, GPT-assisted approach to optimizing visual scene retargeting by learning human gaze behaviors. RecompGPT uses a Generative Pre-trained Transformer (GPT) to model human gaze dynamics, providing an advanced mechanism for intelligent and interactive image reconstruction. The framework incorporates a hierarchical structure, utilizing the BING objectness metric to identify key visual elements in exhibits. Through the integration of multimodal data, RecompGPT employs a novel Locality-Preserved and Observer-Like Active Learning (LOAL) strategy to incrementally generate Gaze Shift Paths (GSPs). LOAL is an active-learning algorithm that selects multiple representative patches from each scene image. It simultaneously preserves the local distribution of image patches and chooses representative (or visually salient) regions in a way that mimics human gaze allocation. Then, we deploy GPT to learn the distribution of initial human gaze fixations across different scenes. Afterward, these GSPs are refined via a multi-layer aggregation algorithm that encodes deep feature representations into a Gaussian Mixture Model (GMM) to model the distribution of human gaze patterns. The learned GMM guides the scene recomposition process by maximizing the posterior probability. Empirical evaluations, including interactive user studies, demonstrate RecompGPT’s superiority over existing methods, achieving a 4.39% improvement in precision and reducing testing time by 50%. RecompGPT harmonizes cutting-edge algorithmic efficiency with human-centered aesthetics, pushing the boundaries of AI-driven scene analysis and visual recomposition. It enhances the interactivity and immersion of virtual reality, offering a next-generation approach to interactive scene recomposition that aligns with human gaze patterns and cognitive preferences.
id	doaj-art-af262376e2cc47469b399c3c3ff6d573
institution	DOAJ
issn	2169-3536
spelling	doaj-art-af262376e2cc47469b399c3c3ff6d573; 2025-08-20T03:12:09Z; eng; IEEE; IEEE Access; 2169-3536; 2025-01-01; vol. 13; pp. 121362-121377; doi:10.1109/ACCESS.2025.3588221; IEEE document 11078273
author_affiliation	Wang Shang: School of Multimedia Technologies and Communication, University Utara Malaysia, Sintok, Malaysia
author_affiliation	Nassiriah Binti Shaari: School of Multimedia Technologies and Communication, University Utara Malaysia, Sintok, Malaysia
author_affiliation	Nur Sauri Bin Yahaya: School of Multimedia Technologies and Communication, University Utara Malaysia, Sintok, Malaysia
author_affiliation	Liu Hao: Department of Medical Genetics, College of Basic Medicine and Forensic Medicine, Henan University of Science and Technology, Luoyang, China
orcid	https://orcid.org/0009-0005-5003-9039
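
The description above says a learned Gaussian Mixture Model (GMM) guides recomposition by maximizing the posterior. A minimal sketch of that selection step follows; it is not the authors' implementation. It assumes a diagonal-covariance GMM and a uniform prior over candidates (so the maximum-posterior choice reduces to the highest GMM likelihood). The feature vectors, the GMM parameters, and the function names are all hypothetical and chosen only for illustration.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density at x of a Gaussian with diagonal covariance."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_log_likelihood(x, weights, means, variances):
    """log sum_k w_k N(x | mean_k, var_k), computed with log-sum-exp."""
    terms = [
        math.log(w) + log_gaussian_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))


def pick_recomposition(candidates, weights, means, variances):
    """Return (index of the best-scoring candidate, all scores).

    Assumption: uniform prior over candidates, so the posterior is
    maximized by the candidate with the highest GMM log-likelihood.
    """
    scores = [gmm_log_likelihood(c, weights, means, variances) for c in candidates]
    best = max(range(len(candidates)), key=scores.__getitem__)
    return best, scores


# Hypothetical two-component GMM over 2-D gaze-path features (hand-set, not learned).
WEIGHTS = [0.6, 0.4]
MEANS = [[0.0, 0.0], [3.0, 3.0]]
VARIANCES = [[1.0, 1.0], [0.5, 0.5]]

# Hypothetical feature vectors for three candidate recompositions.
CANDIDATES = [[5.0, -4.0], [0.2, -0.1], [1.5, 1.5]]

best, scores = pick_recomposition(CANDIDATES, WEIGHTS, MEANS, VARIANCES)
print("best candidate index:", best)
```

In practice the GMM parameters would be fitted to features of the refined Gaze Shift Paths rather than set by hand; the record does not specify the feature dimensionality or the number of mixture components.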


Similar Items