Hazard Analysis for Massive Civil Aviation Safety Oversight Reports Using Text Classification and Topic Modeling

Bibliographic Details
Main Authors: Yaxi Xu, Zurui Gan, Rengang Guo, Xin Wang, Ke Shi, Pengfei Ma
Format: Article
Language: English
Published: MDPI AG 2024-10-01
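Series: Aerospace
Subjects: aviation safety; safety oversight; pre-trained language model; text classification; topic modeling
Online Access: https://www.mdpi.com/2226-4310/11/10/837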
collection DOAJ
description Massive numbers of civil aviation safety oversight reports are collected each year in China. The narrative texts of these reports are typically short and record the abnormal events detected during the safety oversight process. In building an intelligent civil aviation safety oversight system, the automatic classification of safety oversight texts is a key and fundamental task. However, all safety oversight reports are currently analyzed and classified manually, which is time-consuming and labor-intensive. In recent years, pre-trained language models have been applied to various text mining tasks and have proven effective. The aim of this paper is to apply text classification to the mining of these narrative texts and to show that text classification technology can be a critical element of aviation safety oversight report analysis. We propose a novel method for classifying the narrative texts in safety oversight reports. Extensive experiments validate the effectiveness of all the proposed components, and the results show that our method outperforms existing methods on the self-built civil aviation safety oversight dataset. The study also examines the precision and associated outcomes on the dataset in detail, providing a basis for insights to enhance data quality and optimize information.
id doaj-art-0ca5b687d85c4c4891c3a285bf90780d
institution OA Journals
issn 2226-4310
spelling doaj-art-0ca5b687d85c4c4891c3a285bf90780d, 2025-08-20T02:11:01Z, eng
Citation: Aerospace (MDPI AG), ISSN 2226-4310, 2024-10-01, vol. 11, no. 10, art. 837
DOI: 10.3390/aerospace11100837
Author affiliations:
Yaxi Xu: School of Economics and Management, Civil Aviation Flight University of China, Guanghan 618307, China
Zurui Gan: School of Computer Science, Civil Aviation Flight University of China, Guanghan 618307, China
Rengang Guo: CAAC Academy, Civil Aviation Flight University of China, Guanghan 618307, China
Xin Wang: School of Computer Science, Civil Aviation Flight University of China, Guanghan 618307, China
Ke Shi: CAAC Academy, Civil Aviation Flight University of China, Guanghan 618307, China
Pengfei Ma: School of Computer Science, Civil Aviation Flight University of China, Guanghan 618307, China
topic aviation safety
safety oversight
pre-trained language model
text classification
topic modeling
url https://www.mdpi.com/2226-4310/11/10/837