Identifying and Labeling Potentially Risky Driving: A Multistage Process Using Real-World Driving Data

Bibliographic Details
Main Authors: Charles Marks, Arash Jahangiri, Sahar Ghanipoor Machiani
Format: Article
Language: English
Published: Wiley 2021-01-01
Series: Journal of Advanced Transportation
Online Access: http://dx.doi.org/10.1155/2021/8819094
collection DOAJ
description Every year, over 50 million people are injured and 1.35 million die in traffic accidents. Risky driving behaviors are responsible for over half of all fatal vehicle accidents. Identifying risky driving behaviors within real-world driving (RWD) datasets is a promising avenue for reducing the mortality burden associated with these unsafe behaviors, but numerous technical hurdles must be overcome to do so. Herein, we describe the implementation of a multistage process for classifying unlabeled RWD data as potentially risky or not. In the first stage, data are reformatted and reduced in preparation for classification. In the second stage, subsets of the reformatted data are labeled as potentially risky (or not) using the Iterative-DBSCAN method. In the third stage, the labeled subsets are used to fit random forest (RF) classification models; RF models were chosen after they were found to perform better than logistic regression and artificial neural network models. In the final stage, the RF models are used predictively to label the remaining RWD data as potentially risky (or not). The implementation of each stage is described and analyzed for the classification of RWD data from vehicles on public roads in Ann Arbor, Michigan. Overall, we identified 22.7 million observations of potentially risky driving out of 268.2 million observations (about 8.5%). This study provides a novel approach for identifying potentially risky driving behaviors within RWD datasets. As such, it represents an important step in the implementation of protocols designed to address and prevent the harms associated with risky driving.
id doaj-art-d83f31d3ebbf4bec899225044b725af7
institution OA Journals
issn 0197-6729
2042-3195
author_affiliations Charles Marks: Interdisciplinary Research on Substance Use Joint Doctoral Program, San Diego State University and the University of California San Diego, San Diego, CA, USA
Arash Jahangiri: Department of Civil, Construction, and Environmental Engineering, San Diego State University, San Diego, CA, USA
Sahar Ghanipoor Machiani: Department of Civil, Construction, and Environmental Engineering, San Diego State University, San Diego, CA, USA
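
The description outlines four stages: reduce the data, label a subset by clustering, fit a random forest on that subset, and use the forest to label the remaining data. The sketch below illustrates stages two to four only, under loud assumptions. It is not the authors' implementation: a single scikit-learn DBSCAN pass stands in for the Iterative-DBSCAN method, treating DBSCAN noise points as "potentially risky" is an assumption made for illustration, and the function names, the parameter values (eps, min_samples, n_estimators), and the synthetic two-column data are all hypothetical.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.ensemble import RandomForestClassifier


def label_subset(features, eps=0.5, min_samples=5):
    """Stage 2 stand-in: one plain DBSCAN pass (NOT Iterative-DBSCAN).

    Assumption: points DBSCAN marks as noise (cluster label -1) are
    labeled 1 ("potentially risky"); all clustered points are labeled 0.
    """
    clusters = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(features)
    return (clusters == -1).astype(int)


def fit_and_label(subset, remainder, seed=0):
    """Stages 3 and 4: fit a random forest on the labeled subset,
    then use it to label the rows that were never clustered."""
    subset_labels = label_subset(subset)
    model = RandomForestClassifier(n_estimators=100, random_state=seed)
    model.fit(subset, subset_labels)
    return subset_labels, model.predict(remainder)


# Synthetic stand-in for reduced driving observations (hypothetical data):
# a dense group of "typical" rows plus a few scattered "unusual" rows.
rng = np.random.default_rng(0)
typical = rng.normal(0.0, 0.1, size=(300, 2))
unusual = rng.uniform(3.0, 9.0, size=(15, 2))
data = np.vstack([typical, unusual])
rng.shuffle(data)  # shuffle rows in place so both splits mix the two groups

subset, remainder = data[:200], data[200:]
subset_labels, remainder_labels = fit_and_label(subset, remainder)
print("flagged in subset:", int(subset_labels.sum()))
print("flagged in remainder:", int(remainder_labels.sum()))
```

Clustering only a subset and then handing the labels to a classifier is what lets a procedure like this scale: density-based clustering is expensive on hundreds of millions of rows, while a fitted forest can label new rows cheaply.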