Eye contact based engagement prediction for efficient human–robot interaction

Bibliographic Details
Main Authors: Magnus Jung, Ahmed Abdelrahman, Thorsten Hempel, Basheer Al-Tawil, Qiaoyue Yang, Sven Wachsmuth, Ayoub Al-Hamadi
Format: Article
Language: English
Published: Springer 2025-05-01
Series: Complex & Intelligent Systems
Subjects:
Online Access: https://doi.org/10.1007/s40747-025-01902-z
_version_ 1850111516487974912
author Magnus Jung
Ahmed Abdelrahman
Thorsten Hempel
Basheer Al-Tawil
Qiaoyue Yang
Sven Wachsmuth
Ayoub Al-Hamadi
author_facet Magnus Jung
Ahmed Abdelrahman
Thorsten Hempel
Basheer Al-Tawil
Qiaoyue Yang
Sven Wachsmuth
Ayoub Al-Hamadi
author_sort Magnus Jung
collection DOAJ
description Abstract This paper introduces a new approach to predicting human engagement in human–robot interaction (HRI) from eye contact and distance information. Recognizing engagement, particularly its decline, is essential for successful, natural interactions, and requires early, real-time detection of user behavior. Previous HRI engagement classifiers either combine many audiovisual features or are trained end to end; the former risks error accumulation, while the latter suffers from small datasets. The proposed class-sensitive model captures engagement in HRI from eye contact detection. By analyzing eye contact intensity over time, the model provides a robust and reliable measure of engagement, capturing both temporal dynamics and subtle behavioral changes. Direct eye contact detection, a crucial social signal in human interactions that has not yet been explored as a standalone indicator in HRI, is more robust than gaze detection and incorporates additional facial features into the assessment. This approach reduces the feature count from over 100 to just two, enabling real-time processing and surpassing state-of-the-art results with 80.73% accuracy and an 80.68% F1-score on the UE-HRI dataset, the primary resource in current engagement detection research. Additionally, cross-dataset testing on a newly recorded dataset with the TIAGo robot from PAL Robotics achieved 86.8% accuracy and an 87.9% F1-score. The model uses a sliding-window approach and consists of just three fully connected layers for feature fusion and classification, a minimalistic yet effective architecture. The study shows that engagement, traditionally inferred from extensive feature sets, can be inferred reliably from temporal eye contact dynamics. The results include a detailed analysis of established engagement levels on the UE-HRI dataset using the proposed model. Additionally, models for more nuanced engagement classification are introduced, showcasing the effectiveness of this minimal feature set. These models provide a robust foundation for future research, advancing robotic systems and deepening understanding of HRI, for example by improving real-time social cue detection and enabling adaptive engagement strategies.
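The architecture the description sketches — a sliding window over two per-frame features (eye contact intensity, distance) fed through three fully connected layers — can be illustrated with a minimal, untrained NumPy sketch. The window length (30 frames), hidden size (32), and binary engaged/disengaged output are assumptions for illustration; the paper does not specify them here.

```python
import numpy as np

WINDOW = 30    # assumed number of frames per sliding window
FEATURES = 2   # eye contact intensity, user distance

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

class TinyEngagementMLP:
    """Three fully connected layers for feature fusion and classification."""
    def __init__(self, hidden=32, classes=2):
        d_in = WINDOW * FEATURES
        self.W1 = rng.normal(0, 0.1, (d_in, hidden));  self.b1 = np.zeros(hidden)
        self.W2 = rng.normal(0, 0.1, (hidden, hidden)); self.b2 = np.zeros(hidden)
        self.W3 = rng.normal(0, 0.1, (hidden, classes)); self.b3 = np.zeros(classes)

    def forward(self, window):
        x = window.reshape(-1)              # flatten (WINDOW, FEATURES) to a vector
        h = relu(x @ self.W1 + self.b1)
        h = relu(h @ self.W2 + self.b2)
        return h @ self.W3 + self.b3        # class logits

def sliding_windows(frames, step=1):
    """Yield overlapping (WINDOW, FEATURES) windows from a frame stream."""
    for start in range(0, len(frames) - WINDOW + 1, step):
        yield frames[start:start + WINDOW]

# Demo on synthetic per-frame features; weights are random, so predictions
# are meaningless — the point is the data flow, not a trained model.
frames = rng.random((90, FEATURES))
model = TinyEngagementMLP()
preds = [int(np.argmax(model.forward(w))) for w in sliding_windows(frames, step=10)]
print(preds)
```

Because each window overlaps its neighbors, the classifier emits a fresh engagement estimate every few frames, which is what makes early detection of declining engagement possible with such a small feature set.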
format Article
id doaj-art-4b7b26b2d8504bb2a593666e579bb5d1
institution OA Journals
issn 2199-4536
2198-6053
language English
publishDate 2025-05-01
publisher Springer
record_format Article
series Complex & Intelligent Systems
spelling doaj-art-4b7b26b2d8504bb2a593666e579bb5d1 2025-08-20T02:37:36Z eng Springer Complex & Intelligent Systems 2199-4536 2198-6053 2025-05-01 117120 10.1007/s40747-025-01902-z
Eye contact based engagement prediction for efficient human–robot interaction
Magnus Jung (Neuro-Informationstechnik, Otto-von-Guericke University)
Ahmed Abdelrahman (Neuro-Informationstechnik, Otto-von-Guericke University)
Thorsten Hempel (Neuro-Informationstechnik, Otto-von-Guericke University)
Basheer Al-Tawil (Neuro-Informationstechnik, Otto-von-Guericke University)
Qiaoyue Yang (Technische Fakultät, University Bielefeld)
Sven Wachsmuth (Technische Fakultät, University Bielefeld)
Ayoub Al-Hamadi (Neuro-Informationstechnik, Otto-von-Guericke University)
https://doi.org/10.1007/s40747-025-01902-z
Human–robot interaction; Human engagement intention; Eye contact detection; Social robotics
spellingShingle Magnus Jung
Ahmed Abdelrahman
Thorsten Hempel
Basheer Al-Tawil
Qiaoyue Yang
Sven Wachsmuth
Ayoub Al-Hamadi
Eye contact based engagement prediction for efficient human–robot interaction
Complex & Intelligent Systems
Human–robot interaction
Human engagement intention
Eye contact detection
Social robotics
title Eye contact based engagement prediction for efficient human–robot interaction
title_full Eye contact based engagement prediction for efficient human–robot interaction
title_fullStr Eye contact based engagement prediction for efficient human–robot interaction
title_full_unstemmed Eye contact based engagement prediction for efficient human–robot interaction
title_short Eye contact based engagement prediction for efficient human–robot interaction
title_sort eye contact based engagement prediction for efficient human robot interaction
topic Human–robot interaction
Human engagement intention
Eye contact detection
Social robotics
url https://doi.org/10.1007/s40747-025-01902-z
work_keys_str_mv AT magnusjung eyecontactbasedengagementpredictionforefficienthumanrobotinteraction
AT ahmedabdelrahman eyecontactbasedengagementpredictionforefficienthumanrobotinteraction
AT thorstenhempel eyecontactbasedengagementpredictionforefficienthumanrobotinteraction
AT basheeraltawil eyecontactbasedengagementpredictionforefficienthumanrobotinteraction
AT qiaoyueyang eyecontactbasedengagementpredictionforefficienthumanrobotinteraction
AT svenwachsmuth eyecontactbasedengagementpredictionforefficienthumanrobotinteraction
AT ayoubalhamadi eyecontactbasedengagementpredictionforefficienthumanrobotinteraction