Identifying and Labeling Potentially Risky Driving: A Multistage Process Using Real-World Driving Data
Every year, over 50 million people are injured and 1.35 million die in traffic accidents. Risky driving behaviors are responsible for over half of all fatal vehicle accidents. Identifying risky driving behaviors within real-world driving (RWD) datasets is a promising avenue to reduce the mortality b...
| Main Authors: | Charles Marks, Arash Jahangiri, Sahar Ghanipoor Machiani |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Wiley 2021-01-01 |
| Series: | Journal of Advanced Transportation |
| Online Access: | http://dx.doi.org/10.1155/2021/8819094 |
| _version_ | 1850167661125697536 |
|---|---|
| author | Charles Marks Arash Jahangiri Sahar Ghanipoor Machiani |
| author_facet | Charles Marks Arash Jahangiri Sahar Ghanipoor Machiani |
| author_sort | Charles Marks |
| collection | DOAJ |
| description | Every year, over 50 million people are injured and 1.35 million die in traffic accidents. Risky driving behaviors are responsible for over half of all fatal vehicle accidents. Identifying risky driving behaviors within real-world driving (RWD) datasets is a promising avenue for reducing the mortality burden associated with these unsafe behaviors, but numerous technical hurdles must be overcome to do so. Herein, we describe the implementation of a multistage process for classifying unlabeled RWD data as potentially risky or not. In the first stage, data are reformatted and reduced in preparation for classification. In the second stage, subsets of the reformatted data are labeled as potentially risky (or not) using the Iterative-DBSCAN method. In the third stage, the labeled subsets are used to fit random forest (RF) classification models; RF models were chosen because they performed better than logistic regression and artificial neural network models. In the final stage, the RF models are used predictively to label the remaining RWD data as potentially risky (or not). The implementation of each stage is described and analyzed for the classification of RWD data from vehicles on public roads in Ann Arbor, Michigan. Overall, we identified 22.7 million observations of potentially risky driving out of 268.2 million observations. This study provides a novel approach for identifying potentially risky driving behaviors within RWD datasets. As such, this study represents an important step in the implementation of protocols designed to address and prevent the harms associated with risky driving. |
| format | Article |
| id | doaj-art-d83f31d3ebbf4bec899225044b725af7 |
| institution | OA Journals |
| issn | 0197-6729 2042-3195 |
| language | English |
| publishDate | 2021-01-01 |
| publisher | Wiley |
| record_format | Article |
| series | Journal of Advanced Transportation |
| spelling | doaj-art-d83f31d3ebbf4bec899225044b725af7 / 2025-08-20T02:21:10Z / eng / Wiley / Journal of Advanced Transportation / 0197-6729 / 2042-3195 / 2021-01-01 / 2021 / 10.1155/2021/8819094 / 8819094 / Identifying and Labeling Potentially Risky Driving: A Multistage Process Using Real-World Driving Data / Charles Marks (0), Arash Jahangiri (1), Sahar Ghanipoor Machiani (2) / (0) Interdisciplinary Research on Substance Use Joint Doctoral Program, San Diego State University and the University of California San Diego, San Diego, CA, USA / (1) Department of Civil, Construction, and Environmental Engineering, San Diego State University, San Diego, CA, USA / (2) Department of Civil, Construction, and Environmental Engineering, San Diego State University, San Diego, CA, USA / Every year, over 50 million people are injured and 1.35 million die in traffic accidents. Risky driving behaviors are responsible for over half of all fatal vehicle accidents. Identifying risky driving behaviors within real-world driving (RWD) datasets is a promising avenue for reducing the mortality burden associated with these unsafe behaviors, but numerous technical hurdles must be overcome to do so. Herein, we describe the implementation of a multistage process for classifying unlabeled RWD data as potentially risky or not. In the first stage, data are reformatted and reduced in preparation for classification. In the second stage, subsets of the reformatted data are labeled as potentially risky (or not) using the Iterative-DBSCAN method. In the third stage, the labeled subsets are used to fit random forest (RF) classification models; RF models were chosen because they performed better than logistic regression and artificial neural network models. In the final stage, the RF models are used predictively to label the remaining RWD data as potentially risky (or not). The implementation of each stage is described and analyzed for the classification of RWD data from vehicles on public roads in Ann Arbor, Michigan. Overall, we identified 22.7 million observations of potentially risky driving out of 268.2 million observations. This study provides a novel approach for identifying potentially risky driving behaviors within RWD datasets. As such, this study represents an important step in the implementation of protocols designed to address and prevent the harms associated with risky driving. / http://dx.doi.org/10.1155/2021/8819094 |
| spellingShingle | Charles Marks Arash Jahangiri Sahar Ghanipoor Machiani Identifying and Labeling Potentially Risky Driving: A Multistage Process Using Real-World Driving Data Journal of Advanced Transportation |
| title | Identifying and Labeling Potentially Risky Driving: A Multistage Process Using Real-World Driving Data |
| title_full | Identifying and Labeling Potentially Risky Driving: A Multistage Process Using Real-World Driving Data |
| title_fullStr | Identifying and Labeling Potentially Risky Driving: A Multistage Process Using Real-World Driving Data |
| title_full_unstemmed | Identifying and Labeling Potentially Risky Driving: A Multistage Process Using Real-World Driving Data |
| title_short | Identifying and Labeling Potentially Risky Driving: A Multistage Process Using Real-World Driving Data |
| title_sort | identifying and labeling potentially risky driving a multistage process using real world driving data |
| url | http://dx.doi.org/10.1155/2021/8819094 |
| work_keys_str_mv | AT charlesmarks identifyingandlabelingpotentiallyriskydrivingamultistageprocessusingrealworlddrivingdata AT arashjahangiri identifyingandlabelingpotentiallyriskydrivingamultistageprocessusingrealworlddrivingdata AT saharghanipoormachiani identifyingandlabelingpotentiallyriskydrivingamultistageprocessusingrealworlddrivingdata |
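
The label-then-classify flow summarized in the description can be illustrated with a minimal sketch. It is not the authors' implementation: plain `DBSCAN` from scikit-learn stands in for Iterative-DBSCAN, treating DBSCAN noise points as "potentially risky" is an assumption this record does not confirm, and the two-column synthetic features, the helper names `label_subset` and `fit_and_predict`, and the `eps`, `min_samples`, and `n_estimators` values are all hypothetical.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.ensemble import RandomForestClassifier


def label_subset(features, eps=0.15, min_samples=4):
    """Stage 2 stand-in: flag DBSCAN noise points as potentially risky (1).

    Assumption: plain DBSCAN replaces Iterative-DBSCAN, and "noise"
    (cluster label -1) is read as "potentially risky".
    """
    clusters = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(features)
    return (clusters == -1).astype(int)


def fit_and_predict(subset, subset_labels, remaining, seed=0):
    """Stages 3 and 4: fit a random forest on the labeled subset,
    then use it to label the remaining observations."""
    model = RandomForestClassifier(n_estimators=50, random_state=seed)
    model.fit(subset, subset_labels)
    return model.predict(remaining)


# Stage 1 stand-in: a small, already-reduced synthetic feature table.
# Hypothetical columns: two scaled driving measurements per observation.
typical = np.array([[x / 10, y / 10] for x in range(10) for y in range(10)])
extreme = np.array([[5.0, 5.0], [6.0, 7.0], [7.0, 5.5], [5.5, 8.0], [8.0, 6.0]])
subset = np.vstack([typical, extreme])

subset_labels = label_subset(subset)

# Observations that were not part of the labeled subset.
remaining = np.array([[0.45, 0.45], [9.0, 9.0]])
predicted = fit_and_predict(subset, subset_labels, remaining)
```

In this toy setup the densely packed grid forms one cluster and the five isolated points become noise, so the forest learns to separate the two regions. On real RWD data the clustering parameters and features would need to be chosen per the published method rather than the placeholder values used here.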