Detecting Driver Drowsiness Using Hybrid Facial Features and Ensemble Learning
| Main Authors: | |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-04-01 |
| Series: | Information |
| Subjects: | |
| Online Access: | https://www.mdpi.com/2078-2489/16/4/294 |
| Summary: | Drowsiness while driving poses a significant risk to road safety, making effective drowsiness detection systems essential for accident prevention. Facial signal-based methods have proven effective for drowsiness detection, but they face challenges arising from inter-individual differences among drivers. Variations in facial structure call for personalized feature extraction thresholds, yet existing methods apply a uniform threshold, leading to inaccurate feature extraction. Furthermore, many current methods focus on only one or two facial regions, overlooking that drowsiness may manifest in different facial areas depending on the driver. To address these issues, we propose a drowsiness detection method that combines an ensemble model with hybrid facial features. Adaptive threshold correction enables accurate feature extraction from four key facial regions (the eye region, mouth contour, head pose, and gaze direction), ensuring comprehensive coverage. An ensemble model combining Random Forest, XGBoost, and a Multilayer Perceptron under a soft voting criterion then classifies the driver's drowsiness state (a minimal sketch of such an ensemble follows this record). Additionally, we use the SHAP method to provide model explainability and to analyze correlations between features from the different facial regions. Trained and tested on the UTA-RLDD dataset, our method achieves a video accuracy (VA) of 86.52%, outperforming similar techniques introduced in recent years. The interpretability analysis further demonstrates the value of our approach, offering a reference for future research and contributing to road safety. |
| ISSN: | 2078-2489 |
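
The summary names a concrete ensemble: Random Forest, XGBoost, and a Multilayer Perceptron combined by soft voting. Below is a minimal sketch of that arrangement using scikit-learn and the xgboost package, assuming the hybrid facial features have already been extracted into a numeric matrix; the data, variable names, and hyperparameters here are illustrative placeholders, not the authors' configuration.

```python
# Minimal soft-voting ensemble sketch (illustrative; not the paper's exact setup).
import numpy as np
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

# Placeholder data standing in for per-sample hybrid facial features
# (eye region, mouth contour, head pose, gaze direction).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 16))
y = rng.integers(0, 2, size=1000)  # 0 = alert, 1 = drowsy

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# With voting="soft", the ensemble averages the three models' predicted
# class probabilities and selects the class with the higher average.
ensemble = VotingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=200, random_state=0)),
        ("xgb", XGBClassifier(n_estimators=200, eval_metric="logloss")),
        ("mlp", MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500)),
    ],
    voting="soft",
)
ensemble.fit(X_train, y_train)
print("held-out accuracy:", ensemble.score(X_test, y_test))
```

For the SHAP-based explainability the summary mentions, a model-agnostic explainer such as shap.KernelExplainer could be pointed at ensemble.predict_proba with a background sample of training data; the paper's exact explainability pipeline is not specified in this record.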