Detection of Student Engagement via Transformer-Enhanced Feature Pyramid Networks on Channel-Spatial Attention

Bibliographic Details
Main Authors: A. Naveen, I. Jeena Jacob, Ajay Kumar Mandava
Format: Article
Language: English
Published: Russian Academy of Sciences, St. Petersburg Federal Research Center, 2025-04-01
Series: Информатика и автоматизация (Informatics and Automation)
Online Access: https://ia.spcras.ru/index.php/sp/article/view/16715
Description
Summary: Student engagement detection, which determines how involved, attentive, and active students are in class activities, is one of the most important aspects of contemporary educational systems. It gives educators insight into students' learning experiences, enabling tailored interventions and instructional improvements. Traditional techniques for evaluating student engagement are often time-consuming and subjective. This study proposes a novel real-time detection framework, referred to as BiusFPN_CSA, that combines Transformer-enhanced Feature Pyramid Networks (FPN) with Channel-Spatial Attention (CSA). The approach automatically analyses engagement cues such as body posture, eye contact, and head position from visual data streams using deep learning and computer vision techniques. By combining the attention mechanism of CSA with the hierarchical feature representation of FPN, the model captures contextual and spatial information in the input data and can detect student engagement levels accurately. Incorporating the Transformer architecture further improves overall performance by capturing long-range dependencies and semantic relationships within the input sequences. Evaluation on the WACV dataset shows that the proposed model outperforms baseline techniques in accuracy. Specifically, the FPN_CSA_Trans_EH variant of the proposed model outperforms FPN_CSA by 3.28% and 4.98%, respectively. These findings underscore the efficacy of the BiusFPN_CSA framework for real-time student engagement detection, offering educators a tool for enhancing instructional quality, fostering active learning environments, and ultimately improving student outcomes.
ISSN: 2713-3192
2713-3206
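
Illustrative sketch: the summary describes Channel-Spatial Attention as re-weighting feature maps so that the model focuses on informative channels and image regions. The block below is a minimal NumPy sketch of that general idea, assuming a CBAM-style formulation (a channel gate followed by a spatial gate). The record does not state that this is the authors' design. All function names, weights, and shapes are hypothetical, and a simple weighted mix of two pooled maps stands in for the convolution a trained model would normally use.

```python
import numpy as np


def sigmoid(x):
    """Elementwise logistic function, used here to produce gates in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(feat, w1, w2):
    """Scale each channel of feat (C, H, W) by a learned-style gate.

    Assumption: a shared two-layer MLP (w1, w2) is applied to the
    average-pooled and max-pooled channel descriptors, CBAM-style.
    """
    avg = feat.mean(axis=(1, 2))  # (C,) average descriptor per channel
    mx = feat.max(axis=(1, 2))    # (C,) max descriptor per channel

    def mlp(v):
        # Bottleneck MLP with ReLU: C -> C/r -> C
        return w2 @ np.maximum(w1 @ v, 0.0)

    gate = sigmoid(mlp(avg) + mlp(mx))  # (C,) one gate per channel
    return feat * gate[:, None, None]


def spatial_attention(feat, w_avg=1.0, w_max=1.0, bias=0.0):
    """Scale each spatial location of feat (C, H, W) by a gate.

    Assumption: a weighted sum of the channel-wise mean and max maps
    replaces the k x k convolution used in typical implementations.
    """
    avg_map = feat.mean(axis=0)  # (H, W)
    max_map = feat.max(axis=0)   # (H, W)
    gate = sigmoid(w_avg * avg_map + w_max * max_map + bias)
    return feat * gate[None, :, :]


def channel_spatial_attention(feat, w1, w2):
    """Apply the channel gate first, then the spatial gate."""
    return spatial_attention(channel_attention(feat, w1, w2))


# Hypothetical toy input: one feature map with 8 channels on a 4 x 4 grid.
rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 4, 2
feat = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C)) * 0.1  # random stand-in weights
w2 = rng.standard_normal((C, C // r)) * 0.1
out = channel_spatial_attention(feat, w1, w2)
```

In a pyramid network, a module of this kind would typically be applied to each pyramid level independently, since the gates preserve the input shape and only rescale activations.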