Machine learning predictive performance in road accident severity: A case study from Thailand

Bibliographic Details
Main Authors: Ittirit Mohamad, Sajjakaj JomnonKwao, Vatanavongs Ratanavaraha
Format: Article
Language: English
Published: Elsevier 2025-06-01
Series: Results in Engineering
Online Access: http://www.sciencedirect.com/science/article/pii/S2590123025009089
Description
Summary: Traffic accidents remain a major cause of fatalities and economic losses worldwide, necessitating the development of accurate predictive models for enhancing road safety and minimizing risks. In Thailand, where road traffic injuries persist as a public health challenge, data-driven approaches can significantly contribute to accident prevention strategies. This study evaluates the predictive performance of multiple supervised machine learning algorithms in classifying accident severity, addressing the gap in prior research that lacks a comparative analysis of multiple models trained on large-scale crash data. Eight algorithms were assessed, including Decision Tree (DT), Support Vector Machine (SVM), Random Forest (RF), K-Nearest Neighbors (kNN), Neural Network (NN), Naïve Bayes (NB), Logistic Regression (LR), and Gradient Boosting (GB). A dataset comprising 112,837 road accidents over a five-year period in Thailand was analyzed, focusing exclusively on incidents where drivers were at fault. The dataset underwent extensive preprocessing, including missing value imputation, data balancing checks, and feature selection to ensure robustness. Among the models tested, Random Forest demonstrated superior performance in the binary classification task, achieving an average class AUC of 0.768, classification accuracy of 0.777, precision of 0.752, and recall of 0.777. Key predictive features include road type (highway), speeding, time of day (daylight), absence of lighting at night, and driver gender. While the model effectively classifies non-fatal accidents, its recall for fatalities remains limited (0.198), highlighting challenges in predicting fatal crashes due to the complex interplay of contributing factors. These findings reinforce the applicability of machine learning in traffic safety research and provide valuable insights for policymakers seeking data-driven interventions. Future work should explore advanced feature engineering and ensemble techniques to enhance fatality prediction accuracy.
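
The summary reports an average recall (0.777) that matches the accuracy (0.777) alongside a much lower recall for the fatal class (0.198). The record does not state how the class averages were computed, but identical recall and accuracy is what support-weighted averaging always produces. The following is a minimal Python sketch of that relationship, not the authors' evaluation code. The confusion-matrix counts and all function names are hypothetical, chosen only to show how a high averaged recall can coexist with a low minority-class recall.

```python
# Illustrative sketch only: the counts below are invented, not taken from the study.

def per_class_recall(confusion):
    """confusion[i][j] = samples with true class i predicted as class j."""
    return [row[i] / sum(row) if sum(row) else 0.0
            for i, row in enumerate(confusion)]

def per_class_precision(confusion):
    n = len(confusion)
    col_sums = [sum(confusion[i][j] for i in range(n)) for j in range(n)]
    return [confusion[j][j] / col_sums[j] if col_sums[j] else 0.0
            for j in range(n)]

def support_weighted_average(values, confusion):
    """Average per-class values, weighting each class by its true sample count."""
    supports = [sum(row) for row in confusion]
    return sum(v * s for v, s in zip(values, supports)) / sum(supports)

def accuracy(confusion):
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total

# Hypothetical binary confusion matrix: rows = true class, columns = predicted.
# Class 0 = non-fatal, class 1 = fatal.
confusion = [
    [760, 40],   # true non-fatal
    [160, 40],   # true fatal
]

recalls = per_class_recall(confusion)
precisions = per_class_precision(confusion)
weighted_recall = support_weighted_average(recalls, confusion)
weighted_precision = support_weighted_average(precisions, confusion)
acc = accuracy(confusion)

print(f"recall (non-fatal): {recalls[0]:.3f}")
print(f"recall (fatal):     {recalls[1]:.3f}")
print(f"weighted recall:    {weighted_recall:.3f}")
print(f"weighted precision: {weighted_precision:.3f}")
print(f"accuracy:           {acc:.3f}")
```

In this toy matrix the fatal-class recall is 0.200 while the support-weighted recall and the accuracy are both 0.800, because the majority class dominates the weighted average. This is why a per-class figure such as the reported fatality recall is worth reading next to the averaged metrics.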
ISSN: 2590-1230