A novel approach of drowsiness levels detection using Vis-Net combined with facial emotion
| Main Authors: | , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Elsevier, 2025-12-01 |
| Series: | Systems and Soft Computing |
| Subjects: | |
| Online Access: | http://www.sciencedirect.com/science/article/pii/S2772941925001061 |
| Summary: | Drowsiness while driving is a critical global issue, endangering not only drivers but also other road users. Beyond causing property damage, it leads to severe injuries and fatalities. Extensive research has been conducted to collect data on driver drowsiness and to develop warning systems that mitigate fatigue-related accidents. However, the accuracy of drowsiness prediction can be compromised when studies rely solely on facial features such as eye and mouth movements while overlooking other physiological indicators of fatigue. To address these limitations, we propose a highly effective deep learning-based method for driver drowsiness detection. Unlike traditional binary classification methods that simply distinguish between “drowsy” and “not drowsy,” the proposed Vis-Net model is a hybrid deep neural network that combines a Vision Transformer with convolutional neural networks, including MobileNet-V2, Inception-V3, ResNet152-V2, NASNetLarge, and DenseNet. The model harnesses the efficiency of convolutional networks for feature extraction while leveraging the self-attention mechanism of the Vision Transformer to learn both global and local information from an image, detecting and classifying levels of drowsiness on Katajima’s scale. By identifying progressive stages of drowsiness, the system can issue timely warnings, enabling adaptive interventions that help prevent accidents before the driver reaches a critical state of impairment. Furthermore, to reduce response latency and computational cost, our model integrates Mostafa’s emotion detection framework. This integration optimizes the processing pipeline by filtering out non-relevant frames, prioritizing fatigue-related facial expressions, and enhancing real-time performance. By leveraging this framework, we significantly reduce processing delays when handling video streams, ensuring real-time driver monitoring with minimal latency. Experimental results demonstrate that our method achieves 98.94% accuracy across varied real-world conditions, including scenarios where drivers wear masks, wear glasses, or drive in low-light as well as normal lighting environments. |
| ISSN: | 2772-9419 |
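The summary describes an emotion-based pre-filter that discards non-relevant video frames before the expensive drowsiness classifier runs, cutting latency. The paper's actual pipeline is not public here, so the sketch below is a minimal, hypothetical illustration of that gating idea: the emotion labels, thresholds, and the placeholder classifier are all assumptions, not taken from the article.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Assumed label set from a lightweight emotion detector; the real framework
# (Mostafa's, per the abstract) may use different categories.
FATIGUE_RELATED = {"neutral", "tired", "yawning"}

@dataclass
class Frame:
    index: int
    emotion: str  # output of the cheap per-frame emotion detector (assumed)

def filter_frames(frames: List[Frame]) -> List[Frame]:
    """Keep only frames whose detected emotion suggests possible fatigue,
    so the heavy drowsiness model is skipped for the rest."""
    return [f for f in frames if f.emotion in FATIGUE_RELATED]

def classify_drowsiness(frame: Frame) -> int:
    """Placeholder for the expensive Vis-Net classifier; returns a
    drowsiness level (illustrative 1-5 scale, not the paper's scale)."""
    return 3 if frame.emotion == "yawning" else 1

def pipeline(frames: List[Frame]) -> List[Tuple[int, int]]:
    """Gate frames through the filter, then classify only the survivors."""
    return [(f.index, classify_drowsiness(f)) for f in filter_frames(frames)]

# A frame showing "happy" never reaches the classifier:
result = pipeline([Frame(0, "happy"), Frame(1, "yawning"), Frame(2, "neutral")])
```

The design point is that the per-frame cost of the emotion detector is far lower than the hybrid CNN+Transformer, so dropping irrelevant frames early is where the latency savings come from.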
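The abstract also describes treating CNN-extracted features as input to a Vision Transformer's self-attention so the model captures global as well as local structure. As a rough intuition for that step only (the real Vis-Net uses learned projections and multiple heads, none of which are reproduced here), the sketch below flattens a CNN-style feature map into tokens and applies a single head of scaled dot-product attention with identity projections:

```python
import numpy as np

def self_attention(tokens: np.ndarray) -> np.ndarray:
    """Single-head scaled dot-product self-attention with identity Q/K/V
    projections (no learned weights) -- illustration only."""
    d = tokens.shape[-1]
    scores = tokens @ tokens.T / np.sqrt(d)       # (n, n) pairwise similarity
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ tokens                       # (n, d) globally mixed tokens

# Treat a hypothetical (7, 7, 64) CNN feature map as 49 tokens of dimension 64.
feature_map = np.random.default_rng(0).normal(size=(7, 7, 64))
tokens = feature_map.reshape(-1, 64)
attended = self_attention(tokens)  # same shape, each token now a weighted
                                   # mixture of all spatial positions
```

Because every output token is a softmax-weighted mixture of all 49 positions, attention gives each location a global receptive field in one step, complementing the local receptive fields of the convolutional backbones.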