Investigating the contributors to hit-and-run crashes using gradient boosting decision trees.

A classification model based on a nonlinear method, the Gradient Boosting Decision Tree (GBDT), is established to investigate the factors contributing to a perpetrator's escape behavior in hit-and-run crashes. Using the U.S. Crash Report Sampling System (CRSS) dataset, the model is trained...

Bibliographic Details
Main Authors: Baorui Han, Haibo Huang, Gen Li, Chenming Jiang, Zhen Yang, Zhenjun Zhu
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2025-01-01
Series: PLoS ONE
Online Access: https://doi.org/10.1371/journal.pone.0314939
Collection: DOAJ
Description: A classification model based on a nonlinear method, the Gradient Boosting Decision Tree (GBDT), is established to investigate the factors contributing to a perpetrator's escape behavior in hit-and-run crashes. Using the U.S. Crash Report Sampling System (CRSS) dataset, the model is trained and compared with state-of-the-art methods: Classification and Regression Tree (CART), Random Forest, and Logistic Regression. The results show that GBDT outperforms the other methods, achieving the lowest negative log-likelihood (0.282), the lowest misclassification rate (0.096), and the highest AUC (0.803). GBDT also demonstrates superior computational efficiency, with a LIFT value of 4.087, making it a more accurate and efficient model for predicting hit-and-run crashes than CART, Random Forest, and Logistic Regression. In the GBDT results, crash type and relation to trafficway rank 4th and 5th in relative importance, respectively. Neither is mentioned in previous studies, indicating that GBDT can mine hidden information. In addition, interactions between influencing variables can be obtained to investigate the joint effect of multiple variables. The results of this study have practical applications in hit-and-run incident prevention, accident safety analysis, and other engineering applications.
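The description above compares models on four metrics: negative log-likelihood, misclassification rate, AUC, and lift. As a reading aid, the following is a minimal sketch of how such metrics are commonly computed from binary labels and predicted probabilities. It is not the authors' code. The function names, the 0.5 decision threshold, the top-fraction definition of lift, and the toy data are all assumptions made for illustration; the record does not state how the paper defines or computes these quantities.

```python
import math


def negative_log_likelihood(y_true, p_pred, eps=1e-15):
    """Mean negative log-likelihood (log loss) of binary labels under p_pred."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(y_true)


def misclassification_rate(y_true, p_pred, threshold=0.5):
    """Share of cases whose thresholded prediction differs from the label.

    The 0.5 threshold is an assumption, not a value taken from the paper.
    """
    wrong = sum(1 for y, p in zip(y_true, p_pred) if (p >= threshold) != (y == 1))
    return wrong / len(y_true)


def auc(y_true, p_pred):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [p for y, p in zip(y_true, p_pred) if y == 1]
    neg = [p for y, p in zip(y_true, p_pred) if y == 0]
    wins = 0.0
    for p_pos in pos:
        for p_neg in neg:
            if p_pos > p_neg:
                wins += 1.0
            elif p_pos == p_neg:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def lift(y_true, p_pred, top_fraction=0.1):
    """Positive rate among the highest-scored cases divided by the overall positive rate.

    This top-fraction definition is one common convention and is assumed here.
    """
    n_top = max(1, int(round(len(y_true) * top_fraction)))
    ranked = sorted(zip(p_pred, y_true), key=lambda pair: pair[0], reverse=True)
    top_rate = sum(y for _, y in ranked[:n_top]) / n_top
    base_rate = sum(y_true) / len(y_true)
    return top_rate / base_rate


# Toy data (hypothetical): 1 = hit-and-run, 0 = driver stayed at the scene.
labels = [1, 0, 0, 1, 0, 0, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.1, 0.4, 0.3, 0.05, 0.6, 0.15, 0.8, 0.25]

print("NLL :", negative_log_likelihood(labels, scores))
print("MCR :", misclassification_rate(labels, scores))
print("AUC :", auc(labels, scores))
print("Lift:", lift(labels, scores, top_fraction=0.2))
```

Lower is better for the first two metrics and higher is better for AUC and lift, which matches the direction of the comparison reported in the description.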
Institution: Kabale University
ISSN: 1932-6203
Citation: PLoS ONE 20(1): e0314939