A Gaze Estimation Method Based on Spatial and Channel Reconstructed ResNet Combined with Multi-Clue Fusion
The complexity of various factors influencing online learning makes it difficult to characterize learning concentration, while accurately estimating students’ gaze points during learning video sessions represents a critical scientific challenge in assessing and enhancing the attentiveness of online learners.
| Main Authors: | Zhaoyu Shou, Yanjun Lin, Jianwen Mo, Ziyong Wu |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-03-01 |
| Series: | Journal of Imaging |
| Subjects: | gaze estimation, SCConv, ResNet |
| Online Access: | https://www.mdpi.com/2313-433X/11/4/99 |
| Field | Value |
|---|---|
| author | Zhaoyu Shou Yanjun Lin Jianwen Mo Ziyong Wu |
| collection | DOAJ |
| description | The complexity of various factors influencing online learning makes it difficult to characterize learning concentration, while accurately estimating students’ gaze points during learning video sessions represents a critical scientific challenge in assessing and enhancing the attentiveness of online learners. However, current appearance-based gaze estimation models lack a focus on extracting essential features and fail to effectively model the spatio-temporal relationships among the head, face, and eye regions, which limits their ability to achieve lower angular errors. This paper proposes an appearance-based gaze estimation model (RSP-MCGaze). The model constructs a feature extraction backbone network for gaze estimation (ResNetSC) by integrating ResNet and SCConv; this integration enhances the model’s ability to extract important features while reducing spatial and channel redundancy. Based on the ResNetSC backbone, the method for video gaze estimation was further optimized by jointly locating the head, eyes, and face. The experimental results demonstrate that our model achieves significantly higher performance compared to existing baseline models on public datasets, thereby fully confirming the superiority of our method in the gaze estimation task. The model achieves a detection error of 9.86 on the Gaze360 dataset and a detection error of 7.11 on the detectable face subset of Gaze360. |
| format | Article |
| id | doaj-art-0744363965af4f3bbaa203db07474196 |
| institution | OA Journals |
| issn | 2313-433X |
| language | English |
| publishDate | 2025-03-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Journal of Imaging |
| doi | 10.3390/jimaging11040099 |
| citation | Journal of Imaging, vol. 11, no. 4, art. 99, 2025-03-01 |
| affiliations | Zhaoyu Shou, Yanjun Lin, Jianwen Mo: School of Information and Communication, Guilin University of Electronic Technology, Guilin 541004, China; Ziyong Wu: Guangxi Key Laboratory of Trusted Software, Guilin University of Electronic Technology, Guilin 541004, China |
| title | A Gaze Estimation Method Based on Spatial and Channel Reconstructed ResNet Combined with Multi-Clue Fusion |
| topic | gaze estimation SCConv ResNet |
| url | https://www.mdpi.com/2313-433X/11/4/99 |