A novel approach of drowsiness levels detection using Vis-Net combined with facial emotion

Drowsiness while driving is a critical global issue, endangering not only drivers but also other road users. Beyond causing property damage, it leads to severe injuries and fatalities. Extensive research has been conducted to collect data on driver drowsiness and develop warning systems to mitigate...

Bibliographic Details
Main Authors: Thuong-Cang Phan, Anh-Cang Phan, Ngoc-Hoang-Quyen Nguyen
Format: Article
Language: English
Published: Elsevier 2025-12-01
Series: Systems and Soft Computing
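Subjects: Drowsiness detection; Vision transformer; Vis-Net; Levels of drowsiness; Deep learning; You only look once version 8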
Online Access: http://www.sciencedirect.com/science/article/pii/S2772941925001061
author Thuong-Cang Phan
Anh-Cang Phan
Ngoc-Hoang-Quyen Nguyen
collection DOAJ
description Drowsiness while driving is a critical global issue, endangering not only drivers but also other road users. Beyond causing property damage, it leads to severe injuries and fatalities. Extensive research has been conducted to collect data on driver drowsiness and develop warning systems to mitigate fatigue-related accidents. However, the accuracy of drowsiness prediction may be compromised if studies rely solely on facial features such as eye and mouth movements while overlooking other physiological indicators of fatigue. To address these limitations, we propose a deep learning-based method for driver drowsiness detection. Unlike traditional binary classification methods that simply distinguish between “drowsy” and “not drowsy,” the proposed Vis-Net model detects and classifies multiple levels of drowsiness based on Katajima’s scale. Vis-Net is a hybrid deep neural network that combines the Vision Transformer with convolutional neural networks, including MobileNet-V2, Inception-V3, ResNet152-V2, NASNetLarge, and DenseNet. It harnesses the efficiency of the convolutional networks for feature extraction while leveraging the self-attention mechanisms of Vision Transformers to learn both global and local information from an image. By identifying progressive stages of drowsiness, the system can issue timely warnings, allowing for adaptive interventions that help prevent accidents before the driver reaches a critical state of impairment. Furthermore, to reduce response latency and computational cost, our model integrates Mostafa’s emotion detection framework. This integration streamlines the processing pipeline by filtering non-relevant frames and prioritizing critical fatigue-related facial expressions, which reduces processing delays on video streams and supports real-time driver monitoring. Experimental results show that our method achieves 98.94% accuracy across various real-world conditions, including drivers wearing masks, drivers wearing glasses, low-light environments, and normal conditions.
format Article
id doaj-art-27791f6c3f094d87a69bcb9c8cef6f58
institution Kabale University
issn 2772-9419
language English
publishDate 2025-12-01
publisher Elsevier
record_format Article
series Systems and Soft Computing
spelling doaj-art-27791f6c3f094d87a69bcb9c8cef6f58
2025-08-20T03:55:22Z
eng
Elsevier
Systems and Soft Computing
2772-9419
2025-12-01
7
200288
10.1016/j.sasc.2025.200288
A novel approach of drowsiness levels detection using Vis-Net combined with facial emotion
Thuong-Cang Phan (College of Information and Communication Technology, Can Tho University, Can Tho, Viet Nam)
Anh-Cang Phan (Vinh Long University of Technology Education, Vinh Long, Viet Nam; corresponding author)
Ngoc-Hoang-Quyen Nguyen (Vinh Long University of Technology Education, Vinh Long, Viet Nam)
http://www.sciencedirect.com/science/article/pii/S2772941925001061
Drowsiness detection
Vision transformer
Vis-Net
Levels of drowsiness
Deep learning
You only look once version 8
title A novel approach of drowsiness levels detection using Vis-Net combined with facial emotion
topic Drowsiness detection
Vision transformer
Vis-Net
Levels of drowsiness
Deep learning
You only look once version 8
url http://www.sciencedirect.com/science/article/pii/S2772941925001061