A novel approach of drowsiness levels detection using Vis-Net combined with facial emotion
Drowsiness while driving is a critical global issue, endangering not only drivers but also other road users. Beyond causing property damage, it leads to severe injuries and fatalities. Extensive research has been conducted to collect data on driver drowsiness and develop warning systems to mitigate...
| Main Authors: | Thuong-Cang Phan, Anh-Cang Phan, Ngoc-Hoang-Quyen Nguyen |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Elsevier, 2025-12-01 |
| Series: | Systems and Soft Computing |
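| Subjects: | Drowsiness detection; Vision transformer; Vis-Net; Levels of drowsiness; Deep learning; You only look once version 8 |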
| Online Access: | http://www.sciencedirect.com/science/article/pii/S2772941925001061 |
| _version_ | 1849387127427563520 |
|---|---|
| author | Thuong-Cang Phan; Anh-Cang Phan; Ngoc-Hoang-Quyen Nguyen |
| author_facet | Thuong-Cang Phan; Anh-Cang Phan; Ngoc-Hoang-Quyen Nguyen |
| author_sort | Thuong-Cang Phan |
| collection | DOAJ |
| description | Drowsiness while driving is a critical global issue, endangering not only drivers but also other road users. Beyond causing property damage, it leads to severe injuries and fatalities. Extensive research has been conducted to collect data on driver drowsiness and develop warning systems to mitigate fatigue-related accidents. However, the accuracy of drowsiness prediction may be compromised if studies rely solely on facial features such as eye and mouth movements while overlooking other physiological indicators of fatigue. To address these limitations, we propose a deep learning-based method for driver drowsiness detection. Unlike traditional binary classification methods that simply distinguish between “drowsy” and “not drowsy,” the proposed Vis-Net model detects and classifies levels of drowsiness based on Katajima’s scale. Vis-Net is a hybrid deep neural network that combines the Vision Transformer with convolutional neural networks, including MobileNet-V2, Inception-V3, ResNet152-V2, NASNetLarge, and DenseNet. The model harnesses the efficiency of these networks for feature extraction while leveraging the self-attention mechanism of the Vision Transformer to learn both global and local information from an image. By identifying progressive stages of drowsiness, the system can issue timely warnings, allowing for adaptive interventions that help prevent accidents before the driver reaches a critical state of impairment. Furthermore, to reduce response latency and computational cost, our model integrates Mostafa’s emotion detection framework. This integration optimizes the processing pipeline by filtering out non-relevant frames and prioritizing critical fatigue-related facial expressions, which reduces processing delays in handling video streams and supports real-time driver monitoring. Experimental results demonstrate that our method achieves 98.94% accuracy across various real-world conditions, including scenarios where drivers wear masks, wear glasses, or drive in low-light environments, as well as normal conditions. |
| format | Article |
| id | doaj-art-27791f6c3f094d87a69bcb9c8cef6f58 |
| institution | Kabale University |
| issn | 2772-9419 |
| language | English |
| publishDate | 2025-12-01 |
| publisher | Elsevier |
| record_format | Article |
| series | Systems and Soft Computing |
| spelling | doaj-art-27791f6c3f094d87a69bcb9c8cef6f58; 2025-08-20T03:55:22Z; eng; Elsevier; Systems and Soft Computing; 2772-9419; 2025-12-01; 7; 200288; 10.1016/j.sasc.2025.200288; A novel approach of drowsiness levels detection using Vis-Net combined with facial emotion; Thuong-Cang Phan (College of Information and Communication Technology, Can Tho University, Can Tho, Viet Nam); Anh-Cang Phan (Vinh Long University of Technology Education, Vinh Long, Viet Nam; corresponding author); Ngoc-Hoang-Quyen Nguyen (Vinh Long University of Technology Education, Vinh Long, Viet Nam); http://www.sciencedirect.com/science/article/pii/S2772941925001061; Drowsiness detection; Vision transformer; Vis-Net; Levels of drowsiness; Deep learning; You only look once version 8 |
| title | A novel approach of drowsiness levels detection using Vis-Net combined with facial emotion |
| topic | Drowsiness detection; Vision transformer; Vis-Net; Levels of drowsiness; Deep learning; You only look once version 8 |
| url | http://www.sciencedirect.com/science/article/pii/S2772941925001061 |
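The description says that an emotion detection stage filters out non-relevant frames before drowsiness levels are classified, which reduces the work done per video stream. The sketch below illustrates that gating idea only; it is not the authors' implementation. The names `Frame`, `gated_drowsiness_levels`, `emotion_of`, and `level_of`, the label set `FATIGUE_RELATED`, and the integer level values are all hypothetical, since the record does not specify which expressions count as fatigue-related or how levels are encoded.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, List, Set, Tuple

# Assumption: which emotion labels are treated as fatigue-related.
# The catalog record does not list them; these are placeholders.
FATIGUE_RELATED: Set[str] = {"tired", "neutral"}


@dataclass(frozen=True)
class Frame:
    """A video frame; `pixels` is a placeholder for real image data."""
    index: int
    pixels: object = None


def gated_drowsiness_levels(
    frames: Iterable[Frame],
    emotion_of: Callable[[Frame], str],
    level_of: Callable[[Frame], int],
    relevant: Set[str] = FATIGUE_RELATED,
) -> Iterator[Tuple[int, int]]:
    """Yield (frame index, drowsiness level) for relevant frames only.

    `emotion_of` stands in for a cheap emotion detector and `level_of`
    for the expensive multi-level classifier. Frames whose emotion is
    not in `relevant` are skipped, so `level_of` never runs on them.
    """
    for frame in frames:
        if emotion_of(frame) not in relevant:
            continue  # filtered out before the costly classifier
        yield frame.index, level_of(frame)


def demo() -> Tuple[List[Tuple[int, int]], List[int]]:
    """Run the gate on six stub frames and record classifier calls."""
    frames = [Frame(i) for i in range(6)]
    emotions = ["happy", "tired", "neutral", "happy", "tired", "surprised"]
    classifier_calls: List[int] = []

    def emotion_of(frame: Frame) -> str:
        return emotions[frame.index]

    def level_of(frame: Frame) -> int:
        classifier_calls.append(frame.index)
        return 1 + frame.index % 3  # arbitrary stub level, not a real scale

    results = list(gated_drowsiness_levels(frames, emotion_of, level_of))
    return results, classifier_calls


if __name__ == "__main__":
    results, calls = demo()
    print("levels:", results)
    print("classifier ran on frames:", calls)
```

In this stub run the classifier is invoked on three of the six frames. Any real saving depends on how often the gate rejects frames and on how cheap the emotion detector is relative to the classifier, neither of which the record quantifies.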