Quantifying dwell time with location-based augmented reality: Dynamic AOI analysis on mobile eye tracking data with vision transformer

Mobile eye tracking captures egocentric vision and is well-suited for naturalistic studies. However, its data are noisy, especially when acquired outdoors with multiple participants over several sessions. Area-of-interest analysis on moving targets is difficult because A) the camera and objects move nonlinearly and may disappear from and reappear in the scene; and B) off-the-shelf analysis tools are limited to linearly moving objects. As a result, researchers resort to time-consuming manual annotation, which limits the use of mobile eye tracking in naturalistic studies. We introduce a method based on a fine-tuned Vision Transformer (ViT) model for classifying frames with overlaid gaze markers. After fine-tuning a model for three epochs on a manually labelled training set comprising 1.98% (7845 frames) of our entire data, our model reached 99.34% accuracy as evaluated on hold-out data. We used the method to quantify participants’ dwell time on a tablet during the outdoor user test of a mobile augmented reality application for biodiversity education. We discuss the benefits and limitations of our approach and its potential to be applied to other contexts.

Saved in:
Bibliographic Details
Main Authors: Julien Mercier, Olivier Ertz, Erwan Bocher
Format: Article
Language:English
Published: MDPI AG 2024-04-01
Series:Journal of Eye Movement Research
Subjects: Mobile Eye Tracking Methodology; Dynamic Area of Interest; Dwell Time; Frame-by-frame analysis; Vision Transformer; Location-based Augmented Reality
Online Access:https://bop.unibe.ch/JEMR/article/view/10934
collection DOAJ
description Mobile eye tracking captures egocentric vision and is well-suited for naturalistic studies. However, its data are noisy, especially when acquired outdoors with multiple participants over several sessions. Area-of-interest analysis on moving targets is difficult because A) the camera and objects move nonlinearly and may disappear from and reappear in the scene; and B) off-the-shelf analysis tools are limited to linearly moving objects. As a result, researchers resort to time-consuming manual annotation, which limits the use of mobile eye tracking in naturalistic studies. We introduce a method based on a fine-tuned Vision Transformer (ViT) model for classifying frames with overlaid gaze markers. After fine-tuning a model for three epochs on a manually labelled training set comprising 1.98% (7845 frames) of our entire data, our model reached 99.34% accuracy as evaluated on hold-out data. We used the method to quantify participants’ dwell time on a tablet during the outdoor user test of a mobile augmented reality application for biodiversity education. We discuss the benefits and limitations of our approach and its potential to be applied to other contexts.
id doaj-art-9fb91dc10e0847e899a2dc8be6921189
institution OA Journals
issn 1995-8692
doi 10.16910/jemr.17.3.3
citation Journal of Eye Movement Research, vol. 17, no. 3 (2024-04-01)
affiliation Julien Mercier: Media Engineering Institute (MEI), School of Engineering and Management Vaud, HES-SO, Yverdon-les-Bains; Lab-STICC, UMR 6285, CNRS, Université Bretagne Sud, F-56000 Vannes, France
affiliation Olivier Ertz: Media Engineering Institute (MEI), School of Engineering and Management Vaud, HES-SO University of Applied Sciences and Arts Western Switzerland, 1400 Yverdon-les-Bains, Switzerland
affiliation Erwan Bocher: Lab-STICC, UMR 6285, CNRS, Université Bretagne Sud, F-56000 Vannes, France
topic Mobile Eye Tracking Methodology
Dynamic Area of Interest
Dwell Time
Frame-by-frame analysis
Vision Transformer
Location-based Augmented Reality
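The dwell-time measure described in the abstract — a binary classifier labels each scene-camera frame as gaze-on-tablet or not, and the labels are aggregated over time — can be sketched as follows. This is a hypothetical illustration, not the authors' code: the function names and the 25 fps default frame rate are assumptions, and the labels would in practice come from the fine-tuned ViT classifier.

```python
# Hypothetical sketch: turning per-frame gaze classifications into dwell time.
# frame_labels is a sequence of 0/1 values, one per video frame, where
# 1 means the gaze marker fell on the AOI (here, the tablet screen).

def dwell_time_seconds(frame_labels, fps=25.0):
    """Total dwell time on the AOI: count of AOI frames divided by frame rate."""
    return sum(frame_labels) / fps

def dwell_episodes(frame_labels):
    """Contiguous runs of AOI frames, as (start_frame, length) tuples."""
    episodes, start = [], None
    for i, label in enumerate(frame_labels):
        if label and start is None:
            start = i                          # a new dwell episode begins
        elif not label and start is not None:
            episodes.append((start, i - start))  # the episode just ended
            start = None
    if start is not None:                        # episode runs to the last frame
        episodes.append((start, len(frame_labels) - start))
    return episodes

labels = [0, 1, 1, 1, 0, 0, 1, 1, 0]
print(dwell_time_seconds(labels))  # 5 AOI frames / 25 fps = 0.2 s
print(dwell_episodes(labels))      # [(1, 3), (6, 2)]
```

Working on episodes rather than raw frame counts also makes it easy to filter out single-frame classifier glitches before summing, which matters given the residual error rate of any frame classifier.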