Investigating the contributors to hit-and-run crashes using gradient boosting decision trees.
A classification prediction model based on a nonlinear method, the Gradient Boosting Decision Tree (GBDT), is established to investigate the factors contributing to a perpetrator's escape behavior in hit-and-run crashes. Given the U.S. Crash Report Sampling System (CRSS) dataset, the model is trained...
Main Authors: | Baorui Han, Haibo Huang, Gen Li, Chenming Jiang, Zhen Yang, Zhenjun Zhu |
---|---|
Format: | Article |
Language: | English |
Published: | Public Library of Science (PLoS), 2025-01-01 |
Series: | PLoS ONE |
Online Access: | https://doi.org/10.1371/journal.pone.0314939 |
_version_ | 1841555626756407296 |
---|---|
author | Baorui Han Haibo Huang Gen Li Chenming Jiang Zhen Yang Zhenjun Zhu |
author_facet | Baorui Han Haibo Huang Gen Li Chenming Jiang Zhen Yang Zhenjun Zhu |
author_sort | Baorui Han |
collection | DOAJ |
description | A classification prediction model based on a nonlinear method, the Gradient Boosting Decision Tree (GBDT), is established to investigate the factors contributing to a perpetrator's escape behavior in hit-and-run crashes. Using the U.S. Crash Report Sampling System (CRSS) dataset, the model is trained and compared with state-of-the-art methods (Classification and Regression Tree, Random Forest, and Logistic Regression). The results show that the GBDT outperforms the other methods, achieving the lowest negative log-likelihood (0.282) and misclassification rate (0.096) and the highest AUC (0.803). GBDT also demonstrates superior computational efficiency, with a LIFT value of 4.087, making it a more accurate and efficient model for predicting hit-and-run crashes than CART, Random Forest, and Logistic Regression. The GBDT results show that crash type and relation to trafficway rank 4th and 5th in relative importance, respectively. Neither factor is mentioned in previous studies, indicating that GBDT is able to uncover hidden information. In addition, interactions between influencing variables can be obtained to investigate their joint effects. The results of this study have practical applications in hit-and-run incident prevention, accident safety analysis, and other engineering applications. |
format | Article |
id | doaj-art-bd14c09984f044e092af84d753451f3a |
institution | Kabale University |
issn | 1932-6203 |
language | English |
publishDate | 2025-01-01 |
publisher | Public Library of Science (PLoS) |
record_format | Article |
series | PLoS ONE |
title | Investigating the contributors to hit-and-run crashes using gradient boosting decision trees. |
url | https://doi.org/10.1371/journal.pone.0314939 |
work_keys_str_mv | AT baoruihan investigatingthecontributorstohitandruncrashesusinggradientboostingdecisiontrees AT haibohuang investigatingthecontributorstohitandruncrashesusinggradientboostingdecisiontrees AT genli investigatingthecontributorstohitandruncrashesusinggradientboostingdecisiontrees AT chenmingjiang investigatingthecontributorstohitandruncrashesusinggradientboostingdecisiontrees AT zhenyang investigatingthecontributorstohitandruncrashesusinggradientboostingdecisiontrees AT zhenjunzhu investigatingthecontributorstohitandruncrashesusinggradientboostingdecisiontrees |
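
The description compares models on four metrics: negative log-likelihood, misclassification rate, AUC, and lift. The following is a minimal sketch of how such metrics are commonly computed for a binary classifier, not the authors' code. The labels and predicted probabilities are made-up toy values, the 0.5 decision threshold is an assumption, and lift is taken here as the positive rate among the top-scored fraction divided by the overall positive rate, which may differ from the definition used in the article.

```python
import math


def negative_log_likelihood(y_true, p_pred, eps=1e-15):
    """Mean negative log-likelihood (log loss) for binary labels."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


def misclassification_rate(y_true, p_pred, threshold=0.5):
    """Share of cases whose thresholded prediction differs from the label."""
    wrong = sum(1 for y, p in zip(y_true, p_pred) if (p >= threshold) != (y == 1))
    return wrong / len(y_true)


def auc(y_true, p_pred):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [p for y, p in zip(y_true, p_pred) if y == 1]
    neg = [p for y, p in zip(y_true, p_pred) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def lift(y_true, p_pred, top_fraction=0.1):
    """Positive rate in the top-scored fraction divided by the overall positive rate."""
    n_top = max(1, int(round(top_fraction * len(y_true))))
    ranked = sorted(zip(p_pred, y_true), key=lambda t: t[0], reverse=True)
    top_rate = sum(y for _, y in ranked[:n_top]) / n_top
    base_rate = sum(y_true) / len(y_true)
    return top_rate / base_rate


# Hypothetical toy data: 1 = hit-and-run, 0 = driver stayed at the scene.
y_true = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
p_pred = [0.9, 0.2, 0.1, 0.6, 0.4, 0.3, 0.05, 0.35, 0.7, 0.15]

print("negative log-likelihood:", negative_log_likelihood(y_true, p_pred))
print("misclassification rate:", misclassification_rate(y_true, p_pred))
print("AUC:", auc(y_true, p_pred))
print("lift (top 10%):", lift(y_true, p_pred, top_fraction=0.1))
```

Lower values are better for the first two metrics and higher values are better for AUC and lift, which is the sense in which the description reports GBDT as the best of the four models.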