RecompGPT: Generative Pre-Trained Transformers-Assisted Interactive Human Gaze Pattern Learning and Distribution Modeling for Scene Recomposition
We introduce a GPT-assisted approach to visual scene retargeting that learns human gaze behavior. RecompGPT uses a Generative Pre-trained Transformer (GPT) to model human gaze dynamics for intelligent, interactive image reconstruction. The framewor...
| Main Authors: | Wang Shang, Nassiriah Binti Shaari, Nur Sauri Bin Yahaya, Liu Hao |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE, 2025-01-01 |
| Series: | IEEE Access |
| Subjects: | RecompGPT; human gaze behavior; GSP; recomposition; observer-like |
| Online Access: | https://ieeexplore.ieee.org/document/11078273/ |
| _version_ | 1849719453488513024 |
|---|---|
| author | Wang Shang; Nassiriah Binti Shaari; Nur Sauri Bin Yahaya; Liu Hao |
| author_facet | Wang Shang; Nassiriah Binti Shaari; Nur Sauri Bin Yahaya; Liu Hao |
| author_sort | Wang Shang |
| collection | DOAJ |
| description | We introduce a GPT-assisted approach to visual scene retargeting that learns human gaze behavior. RecompGPT uses a Generative Pre-trained Transformer (GPT) to model human gaze dynamics for intelligent, interactive image reconstruction. The framework is hierarchical and uses the BING objectness metric to identify key visual elements in exhibits. Through the integration of multimodal data, RecompGPT employs a novel Locality-Preserved and Observer-Like Active Learning (LOAL) strategy to incrementally generate Gaze Shift Paths (GSPs). LOAL is an active-learning algorithm that selects multiple representative patches from each scene image. It simultaneously preserves the local distribution of image patches and chooses representative (or visually salient) regions in a way that mimics human gaze allocation. We then deploy GPT to learn the distribution of initial human gaze fixations across different scenes. Afterward, the GSPs are refined via a multi-layer aggregation algorithm that encodes deep feature representations into a Gaussian Mixture Model (GMM) of human gaze patterns. The learned GMM guides scene recomposition by maximizing the posterior estimate. Empirical evaluations, including interactive user studies, show that RecompGPT outperforms existing methods, with a 4.39% improvement in precision and a 50% reduction in testing time. RecompGPT combines algorithmic efficiency with human-centered aesthetics. It enhances the interactivity and immersion of virtual reality by aligning interactive scene recomposition with human gaze patterns and cognitive preferences. |
| format | Article |
| id | doaj-art-af262376e2cc47469b399c3c3ff6d573 |
| institution | DOAJ |
| issn | 2169-3536 |
| language | English |
| publishDate | 2025-01-01 |
| publisher | IEEE |
| record_format | Article |
| series | IEEE Access |
| spelling | doaj-art-af262376e2cc47469b399c3c3ff6d573; 2025-08-20T03:12:09Z; eng; IEEE; IEEE Access; 2169-3536; 2025-01-01; 13; 121362; 121377; 10.1109/ACCESS.2025.3588221; 11078273; RecompGPT: Generative Pre-Trained Transformers-Assisted Interactive Human Gaze Pattern Learning and Distribution Modeling for Scene Recomposition; Wang Shang [0]; Nassiriah Binti Shaari [1]; Nur Sauri Bin Yahaya [2]; Liu Hao [3]; https://orcid.org/0009-0005-5003-9039; School of Multimedia Technologies and Communication, University Utara Malaysia, Sintok, Malaysia; School of Multimedia Technologies and Communication, University Utara Malaysia, Sintok, Malaysia; School of Multimedia Technologies and Communication, University Utara Malaysia, Sintok, Malaysia; Department of Medical Genetics, College of Basic Medicine and Forensic Medicine, Henan University of Science and Technology, Luoyang, China; https://ieeexplore.ieee.org/document/11078273/; RecompGPT; human gaze behavior; GSP; recomposition; observer-like |
| spellingShingle | Wang Shang; Nassiriah Binti Shaari; Nur Sauri Bin Yahaya; Liu Hao; RecompGPT: Generative Pre-Trained Transformers-Assisted Interactive Human Gaze Pattern Learning and Distribution Modeling for Scene Recomposition; IEEE Access; RecompGPT; human gaze behavior; GSP; recomposition; observer-like |
| title | RecompGPT: Generative Pre-Trained Transformers-Assisted Interactive Human Gaze Pattern Learning and Distribution Modeling for Scene Recomposition |
| title_sort | recompgpt generative pre trained transformers assisted interactive human gaze pattern learning and distribution modeling for scene recomposition |
| topic | RecompGPT; human gaze behavior; GSP; recomposition; observer-like |
| url | https://ieeexplore.ieee.org/document/11078273/ |
| work_keys_str_mv | AT wangshang recompgptgenerativepretrainedtransformersassistedinteractivehumangazepatternlearninganddistributionmodelingforscenerecomposition AT nassiriahbintishaari recompgptgenerativepretrainedtransformersassistedinteractivehumangazepatternlearninganddistributionmodelingforscenerecomposition AT nursauribinyahaya recompgptgenerativepretrainedtransformersassistedinteractivehumangazepatternlearninganddistributionmodelingforscenerecomposition AT liuhao recompgptgenerativepretrainedtransformersassistedinteractivehumangazepatternlearninganddistributionmodelingforscenerecomposition |
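
The description states that a learned Gaussian Mixture Model (GMM) guides scene recomposition by maximizing the posterior. The record gives no formulas or code, so the following is only a minimal sketch of that general idea, not the authors' implementation. It assumes a diagonal-covariance GMM, a uniform prior over candidate recompositions (so the maximum a posteriori choice is the candidate with the highest GMM likelihood), and made-up two-dimensional feature vectors. Every function name, parameter, and number below is hypothetical.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a Gaussian with diagonal covariance, evaluated at x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_log_likelihood(x, weights, means, variances):
    """log p(x) = log sum_k w_k N(x | mu_k, diag(var_k)), using log-sum-exp."""
    terms = [
        math.log(w) + log_gaussian_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))


def best_recomposition(candidates, weights, means, variances):
    """Return (index, scores) for the candidate with the highest log-likelihood.

    Assumption: the prior over candidates is uniform, so picking the most
    likely candidate is the same as picking the maximum a posteriori one.
    """
    scores = [gmm_log_likelihood(c, weights, means, variances) for c in candidates]
    best = max(range(len(candidates)), key=scores.__getitem__)
    return best, scores


# Hypothetical 2-component GMM over 2-D gaze features (illustrative values only).
WEIGHTS = [0.6, 0.4]
MEANS = [[0.0, 0.0], [3.0, 3.0]]
VARIANCES = [[1.0, 1.0], [0.5, 0.5]]

# Hypothetical feature vectors for three candidate recomposed scenes.
CANDIDATES = [[5.0, -4.0], [0.1, -0.2], [1.5, 1.5]]

best_index, scores = best_recomposition(CANDIDATES, WEIGHTS, MEANS, VARIANCES)
print("selected candidate:", best_index)
```

In this toy setup the second candidate lies closest to a mixture component and is selected. A real system would first fit the mixture to features derived from the gaze shift paths and would search a much larger candidate space.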