Eye contact based engagement prediction for efficient human–robot interaction
Abstract This paper introduces a new approach to predict human engagement in human–robot interactions (HRI), focusing on eye contact and distance information. Recognising engagement, particularly its decline, is essential for successful and natural interactions. This requires early, real-time user behavior detection.
| Main Authors: | Magnus Jung, Ahmed Abdelrahman, Thorsten Hempel, Basheer Al-Tawil, Qiaoyue Yang, Sven Wachsmuth, Ayoub Al-Hamadi |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Springer, 2025-05-01 |
| Series: | Complex & Intelligent Systems |
| Subjects: | Human–robot interaction; Human engagement intention; Eye contact detection; Social robotics |
| Online Access: | https://doi.org/10.1007/s40747-025-01902-z |
| _version_ | 1850111516487974912 |
|---|---|
| author | Magnus Jung Ahmed Abdelrahman Thorsten Hempel Basheer Al-Tawil Qiaoyue Yang Sven Wachsmuth Ayoub Al-Hamadi |
| author_facet | Magnus Jung Ahmed Abdelrahman Thorsten Hempel Basheer Al-Tawil Qiaoyue Yang Sven Wachsmuth Ayoub Al-Hamadi |
| author_sort | Magnus Jung |
| collection | DOAJ |
| description | Abstract This paper introduces a new approach to predict human engagement in human–robot interactions (HRI), focusing on eye contact and distance information. Recognising engagement, particularly its decline, is essential for successful and natural interactions. This requires early, real-time user behavior detection. Previous HRI engagement classification approaches use various audiovisual features or adopt end-to-end methods. However, both approaches face challenges: the former risks error accumulation, while the latter suffers from small datasets. The proposed class-sensitive model for capturing engagement in HRI is based on eye contact detection. By analyzing eye contact intensity over time, the model provides a more robust and reliable measure of engagement levels, effectively capturing both temporal dynamics and subtle behavioral changes. Direct eye contact detection, a crucial social signal in human interactions that has not yet been explored as a standalone indicator in HRI, offers a significant advantage in robustness over gaze detection and incorporates additional facial features into the assessment. This approach reduces the number of features from over 100 to just two, enabling real-time processing and surpassing state-of-the-art results with 80.73% accuracy and 80.68% F1-score on the UE-HRI dataset, the primary resource in current engagement detection research. Additionally, cross-dataset testing on a newly recorded dataset with the Tiago robot from Pal Robotics achieved an accuracy of 86.8% and an F1-score of 87.9%. The model employs a sliding window approach and consists of just three fully connected layers for feature fusion and classification, offering a minimalistic yet effective architecture. The study reveals that engagement, traditionally assessed with extensive feature sets, can be inferred reliably from temporal eye contact dynamics.
The results include a detailed analysis of established engagement levels on the UE-HRI dataset using the proposed model. Additionally, models for more nuanced engagement classification are introduced, showcasing the effectiveness of this minimalistic feature set. These models provide a robust foundation for future research, advancing robotic systems and deepening understanding of HRI, for example by improving real-time social cue detection and creating adaptive engagement strategies. |
| format | Article |
| id | doaj-art-4b7b26b2d8504bb2a593666e579bb5d1 |
| institution | OA Journals |
| issn | 2199-4536 2198-6053 |
| language | English |
| publishDate | 2025-05-01 |
| publisher | Springer |
| record_format | Article |
| series | Complex & Intelligent Systems |
| spelling | doaj-art-4b7b26b2d8504bb2a593666e579bb5d1 2025-08-20T02:37:36Z eng Springer Complex & Intelligent Systems 2199-4536 2198-6053 2025-05-01 10.1007/s40747-025-01902-z Eye contact based engagement prediction for efficient human–robot interaction Magnus Jung, Ahmed Abdelrahman, Thorsten Hempel, Basheer Al-Tawil, Ayoub Al-Hamadi (Neuro-Informationstechnik, Otto-von-Guericke University); Qiaoyue Yang, Sven Wachsmuth (Technische Fakultät, University Bielefeld) https://doi.org/10.1007/s40747-025-01902-z Human–robot interaction; Human engagement intention; Eye contact detection; Social robotics |
| spellingShingle | Magnus Jung Ahmed Abdelrahman Thorsten Hempel Basheer Al-Tawil Qiaoyue Yang Sven Wachsmuth Ayoub Al-Hamadi Eye contact based engagement prediction for efficient human–robot interaction Complex & Intelligent Systems Human–robot interaction Human engagement intention Eye contact detection Social robotics |
| title | Eye contact based engagement prediction for efficient human–robot interaction |
| title_full | Eye contact based engagement prediction for efficient human–robot interaction |
| title_fullStr | Eye contact based engagement prediction for efficient human–robot interaction |
| title_full_unstemmed | Eye contact based engagement prediction for efficient human–robot interaction |
| title_short | Eye contact based engagement prediction for efficient human–robot interaction |
| title_sort | eye contact based engagement prediction for efficient human robot interaction |
| topic | Human–robot interaction Human engagement intention Eye contact detection Social robotics |
| url | https://doi.org/10.1007/s40747-025-01902-z |
| work_keys_str_mv | AT magnusjung eyecontactbasedengagementpredictionforefficienthumanrobotinteraction AT ahmedabdelrahman eyecontactbasedengagementpredictionforefficienthumanrobotinteraction AT thorstenhempel eyecontactbasedengagementpredictionforefficienthumanrobotinteraction AT basheeraltawil eyecontactbasedengagementpredictionforefficienthumanrobotinteraction AT qiaoyueyang eyecontactbasedengagementpredictionforefficienthumanrobotinteraction AT svenwachsmuth eyecontactbasedengagementpredictionforefficienthumanrobotinteraction AT ayoubalhamadi eyecontactbasedengagementpredictionforefficienthumanrobotinteraction |
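The abstract describes the classifier only at a high level: a sliding window over two per-frame features (eye-contact intensity and user distance) passed through three fully connected layers for fusion and classification. As a rough illustration of that shape, here is a minimal sketch in plain Python; the window length, hidden width, activations, and weights are all assumptions for demonstration, not the authors' published configuration.

```python
import math
import random

# Sketch of a sliding-window engagement classifier with two per-frame
# features and three fully connected layers. All sizes and weights below
# are illustrative assumptions, not parameters from the paper.
WINDOW = 10    # frames per window (assumed)
FEATURES = 2   # eye-contact intensity, user distance
HIDDEN = 16    # hidden layer width (assumed)
CLASSES = 2    # engaged / disengaged

rng = random.Random(0)

def make_weights(rows, cols):
    """Random weight matrix standing in for trained parameters."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

W1 = make_weights(WINDOW * FEATURES, HIDDEN)
W2 = make_weights(HIDDEN, HIDDEN)
W3 = make_weights(HIDDEN, CLASSES)

def fc(x, w):
    """Fully connected layer: input vector x times weight matrix w."""
    return [sum(xi * row[j] for xi, row in zip(x, w)) for j in range(len(w[0]))]

def relu(x):
    return [max(0.0, v) for v in x]

def softmax(x):
    e = [math.exp(v) for v in x]
    s = sum(e)
    return [v / s for v in e]

def predict(window):
    """Classify one window of (eye_contact, distance) frames."""
    x = [v for frame in window for v in frame]   # flatten to WINDOW * FEATURES
    h = relu(fc(x, W1))                          # FC layer 1 (feature fusion)
    h = relu(fc(h, W2))                          # FC layer 2
    return softmax(fc(h, W3))                    # FC layer 3 (classification)

# Slide the window over a synthetic stream of per-frame features.
stream = [(rng.random(), rng.random()) for _ in range(30)]
probs = [predict(stream[t:t + WINDOW]) for t in range(len(stream) - WINDOW + 1)]
```

In use, the per-window class probabilities would be thresholded (or smoothed over consecutive windows) to flag a decline in engagement early enough for the robot to react.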