A Gaze Estimation Method Based on Spatial and Channel Reconstructed ResNet Combined with Multi-Clue Fusion


Bibliographic Details
Main Authors: Zhaoyu Shou, Yanjun Lin, Jianwen Mo, Ziyong Wu
Format: Article
Language: English
Published: MDPI AG, 2025-03-01
Series: Journal of Imaging
Subjects: gaze estimation; SCConv; ResNet
Online Access: https://www.mdpi.com/2313-433X/11/4/99
collection DOAJ
description The complexity of the many factors influencing online learning makes learning concentration difficult to characterize, and accurately estimating students’ gaze points while they watch instructional videos is a critical scientific challenge in assessing and enhancing the attentiveness of online learners. However, current appearance-based gaze estimation models do not focus on extracting essential features and fail to effectively model the spatio-temporal relationships among the head, face, and eye regions, which limits their ability to achieve low angular errors. This paper proposes an appearance-based gaze estimation model, RSP-MCGaze. The model constructs a feature-extraction backbone for gaze estimation, ResNetSC, by integrating ResNet with SCConv; this integration strengthens the extraction of important features while reducing spatial and channel redundancy. Building on the ResNetSC backbone, video gaze estimation is further optimized by jointly localizing the head, eyes, and face. Experimental results show that the model significantly outperforms existing baselines on public datasets, confirming the superiority of the method for the gaze estimation task. The model achieves an angular error of 9.86° on the Gaze360 dataset and 7.11° on the detectable-face subset of Gaze360.
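The abstract gives no implementation details, but the spatial-reconstruction idea behind SCConv (weight channels by their group-normalization scale factors, split the feature map into informative and redundant streams, then cross-reconstruct the two streams) can be loosely sketched in NumPy. The function name `sru_sketch` and the exact gating scheme below are illustrative assumptions, not the authors' code:

```python
import numpy as np

def sru_sketch(x, gamma):
    """Loose sketch of an SCConv-style Spatial Reconstruction Unit.

    x     : feature map, shape (C, H, W), C assumed even
    gamma : per-channel scale factors from group normalization, shape (C,)
    """
    c = x.shape[0]
    w = gamma / gamma.sum()                       # normalized channel importance
    gate = 1.0 / (1.0 + np.exp(-(w - w.mean())))  # soft sigmoid gate around the mean
    w1 = np.where(w >= w.mean(), gate, 0.0)       # weights for informative channels
    w2 = np.where(w < w.mean(), gate, 0.0)        # weights for redundant channels
    x1 = w1[:, None, None] * x                    # informative stream
    x2 = w2[:, None, None] * x                    # redundant stream
    # cross-reconstruction: swap channel halves so the streams exchange information
    h = c // 2
    top = x1[:h] + x2[h:]
    bot = x1[h:] + x2[:h]
    return np.concatenate([top, bot], axis=0)     # same shape as the input
```

In the paper, units like this are presumably interleaved with ResNet residual blocks to form the ResNetSC backbone; here the sketch only shows the channel-weighting and cross-reconstruction step in isolation.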
format Article
id doaj-art-0744363965af4f3bbaa203db07474196
institution OA Journals
issn 2313-433X
language English
publishDate 2025-03-01
publisher MDPI AG
record_format Article
series Journal of Imaging
spelling doaj-art-0744363965af4f3bbaa203db07474196 (indexed 2025-08-20T02:28:15Z)
English. MDPI AG. Journal of Imaging, ISSN 2313-433X. 2025-03-01, vol. 11, iss. 4, art. 99. DOI: 10.3390/jimaging11040099
A Gaze Estimation Method Based on Spatial and Channel Reconstructed ResNet Combined with Multi-Clue Fusion
Zhaoyu Shou (School of Information and Communication, Guilin University of Electronic Technology, Guilin 541004, China)
Yanjun Lin (School of Information and Communication, Guilin University of Electronic Technology, Guilin 541004, China)
Jianwen Mo (School of Information and Communication, Guilin University of Electronic Technology, Guilin 541004, China)
Ziyong Wu (Guangxi Key Laboratory of Trusted Software, Guilin University of Electronic Technology, Guilin 541004, China)
https://www.mdpi.com/2313-433X/11/4/99
Keywords: gaze estimation; SCConv; ResNet
title A Gaze Estimation Method Based on Spatial and Channel Reconstructed ResNet Combined with Multi-Clue Fusion
topic gaze estimation
SCConv
ResNet
url https://www.mdpi.com/2313-433X/11/4/99