Dataset on fatal road traffic crash attributes extracted via natural language processing of online media articles in India
Mendeley Data
| Main Authors: | , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Elsevier, 2025-06-01 |
| Series: | Data in Brief |
| Subjects: | |
| Online Access: | http://www.sciencedirect.com/science/article/pii/S2352340925003105 |
| Summary: | Road traffic crashes are among the leading causes of death globally and carry substantial social and economic costs. Online media is a key source of public information on road safety, so understanding how crashes are reported is crucial for detecting potential reporting biases and enhancing safety awareness. To address the lack of high-quality, media-reported fatal crash data, fatal crash reports for 2022–2023 were extracted from The Times of India, a prominent Indian news outlet. The resulting dataset covers 2898 fatal crashes, 6584 fatalities and 7812 injuries, and records 16 detailed crash attributes. It was developed using web scraping and natural language processing (NLP) techniques: automated tools such as Selenium and BeautifulSoup were employed to extract raw data from the news source, and NLP algorithms were then applied to identify key crash attributes, including crash date, location, vehicles involved and number of fatalities. This study provides a replicable framework for constructing robust datasets from media sources, enabling multidisciplinary research on transportation safety, media reporting and public perception of crashes. The dataset is expected to serve as a valuable resource for analysing how the media shapes road safety narratives and for identifying high-fatality crash locations. |
| ISSN: | 2352-3409 |
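
The summary mentions applying NLP to identify attributes such as the number of fatalities from article text. As a rough illustration of that kind of step, the sketch below pulls a fatality count out of a headline with a regular expression. It is not the authors' pipeline: the pattern, the `NUMBER_WORDS` table, the function name `extract_fatalities` and the sample headlines are all hypothetical, and a simple rule like this will miss many real phrasings.

```python
import re

# Hypothetical lookup for spelled-out counts; real reports use many more forms.
NUMBER_WORDS = {
    "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
}

# A count (digits or a number word), an optional noun, an optional auxiliary
# verb, then a word signalling death. This is an assumed heuristic only.
FATALITY_PATTERN = re.compile(
    r"\b(\d+|" + "|".join(NUMBER_WORDS) + r")\s+"
    r"(?:(?:people|persons|men|women|children|passengers)\s+)?"
    r"(?:(?:were|was|are)\s+)?"
    r"(?:killed|dead|died|die)\b",
    re.IGNORECASE,
)


def extract_fatalities(text):
    """Return the first fatality count found in text, or None if none matches."""
    match = FATALITY_PATTERN.search(text)
    if match is None:
        return None
    token = match.group(1).lower()
    return int(token) if token.isdigit() else NUMBER_WORDS[token]


# Made-up headlines, used only to exercise the function.
samples = [
    "Three killed, 5 injured as bus overturns near Pune",
    "4 people died after a truck hit a car in Lucknow",
    "Traffic diverted on highway after minor collision",
]
results = [extract_fatalities(s) for s in samples]
```

A rule-based extractor like this is easy to audit but brittle; counts phrased in other ways would need further patterns or a trained model.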