Structure-Aware and Format-Enhanced Transformer for Accident Report Modeling
Modeling accident investigation reports is crucial for elucidating accident causation mechanisms, analyzing risk evolution processes, and formulating effective accident prevention strategies. However, such reports are typically long, hierarchically structured, and information-dense, posing unique challenges for existing language models.
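The abstract mentions two concrete mechanisms: positional encodings enriched with hierarchical depth and format type, and a dynamic gating unit that fuses heading and paragraph semantics. The record does not specify the actual SAFE-Transformer layers, dimensions, or parameterization, so the following is only a minimal, dependency-free sketch of what such components might look like; all function names, table sizes, and the random-initialized embedding tables are illustrative assumptions, not the authors' implementation.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gated_fusion(heading, paragraph, w_h, w_p, bias):
    """Element-wise dynamic gate (illustrative form):
    g = sigmoid(w_h * h + w_p * p + b); fused = g * h + (1 - g) * p.
    The gate decides, per dimension, how much heading semantics to keep."""
    fused = []
    for h, p, wh, wp, b in zip(heading, paragraph, w_h, w_p, bias):
        g = sigmoid(wh * h + wp * p + b)
        fused.append(g * h + (1.0 - g) * p)
    return fused

def structure_positional_encoding(pos, depth, fmt_id, dim,
                                  n_depths=6, n_formats=4, seed=0):
    """Sinusoidal token-position encoding plus additive embeddings for
    hierarchical depth and format type. Real models would learn these
    tables; here they are fixed random stand-ins for the sketch."""
    rng = random.Random(seed)
    depth_table = [[rng.uniform(-0.1, 0.1) for _ in range(dim)]
                   for _ in range(n_depths)]
    fmt_table = [[rng.uniform(-0.1, 0.1) for _ in range(dim)]
                 for _ in range(n_formats)]
    enc = []
    for i in range(dim):
        angle = pos / (10000 ** (2 * (i // 2) / dim))
        base = math.sin(angle) if i % 2 == 0 else math.cos(angle)
        enc.append(base + depth_table[depth][i] + fmt_table[fmt_id][i])
    return enc
```

With a strongly positive gate bias the fused vector tracks the heading; with a zero gate it averages heading and paragraph, which is the intended adaptive behavior of such a unit.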
Saved in:
| Main Authors: | Wenhua Zeng, Wenhu Tang, Diping Yuan, Hui Zhang, Pinsheng Duan, Shikun Hu |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-07-01 |
| Series: | Applied Sciences |
| Subjects: | accident report modeling; structure-aware encoding; hierarchical sparse attention; multi-label classification; semantic fusion; emergency safety intelligence |
| Online Access: | https://www.mdpi.com/2076-3417/15/14/7928 |
| _version_ | 1849303500668796928 |
|---|---|
| author | Wenhua Zeng; Wenhu Tang; Diping Yuan; Hui Zhang; Pinsheng Duan; Shikun Hu |
| author_facet | Wenhua Zeng; Wenhu Tang; Diping Yuan; Hui Zhang; Pinsheng Duan; Shikun Hu |
| author_sort | Wenhua Zeng |
| collection | DOAJ |
| description | Modeling accident investigation reports is crucial for elucidating accident causation mechanisms, analyzing risk evolution processes, and formulating effective accident prevention strategies. However, such reports are typically long, hierarchically structured, and information-dense, posing unique challenges for existing language models. To address these domain-specific characteristics, this study proposes SAFE-Transformer, a Structure-Aware and Format-Enhanced Transformer designed for long-document modeling in the emergency safety context. SAFE-Transformer adopts a dual-stream encoding architecture to separately model symbolic section features and heading text, integrates hierarchical depth and format types into positional encodings, and introduces a dynamic gating unit to adaptively fuse headings with paragraph semantics. We evaluate the model on a multi-label accident intelligence classification task using a real-world corpus of 1632 official reports from high-risk industries. Results demonstrate that SAFE-Transformer effectively captures hierarchical semantic structure and outperforms strong long-text baselines. Further analysis reveals an inverted U-shaped performance trend across varying report lengths and highlights the role of attention sparsity and label distribution in long-text modeling. This work offers a practical solution for structurally complex safety documents and provides methodological insights for downstream applications in safety supervision and risk analysis. |
| format | Article |
| id | doaj-art-7708c4c6e2e54fbbb8a1349f6599eecf |
| institution | Kabale University |
| issn | 2076-3417 |
| language | English |
| publishDate | 2025-07-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Applied Sciences |
| spelling | doaj-art-7708c4c6e2e54fbbb8a1349f6599eecf; indexed 2025-08-20T03:58:26Z; eng; MDPI AG; Applied Sciences; ISSN 2076-3417; 2025-07-01; 15(14): 7928; doi:10.3390/app15147928. Structure-Aware and Format-Enhanced Transformer for Accident Report Modeling. Wenhua Zeng (School of Electric Power Engineering, South China University of Technology, Guangzhou 510641, China); Wenhu Tang (School of Electric Power Engineering, South China University of Technology, Guangzhou 510641, China); Diping Yuan (Shenzhen Research Institute, China University of Mining and Technology, Shenzhen 518057, China); Hui Zhang (Shenzhen Urban Public Safety and Technology Institute, Shenzhen 518024, China); Pinsheng Duan (School of Mechanics and Civil Engineering, China University of Mining and Technology, Xuzhou 221116, China); Shikun Hu (Shenzhen Urban Public Safety and Technology Institute, Shenzhen 518024, China). https://www.mdpi.com/2076-3417/15/14/7928. Keywords: accident report modeling; structure-aware encoding; hierarchical sparse attention; multi-label classification; semantic fusion; emergency safety intelligence |
| spellingShingle | Wenhua Zeng; Wenhu Tang; Diping Yuan; Hui Zhang; Pinsheng Duan; Shikun Hu. Structure-Aware and Format-Enhanced Transformer for Accident Report Modeling. Applied Sciences. accident report modeling; structure-aware encoding; hierarchical sparse attention; multi-label classification; semantic fusion; emergency safety intelligence |
| title | Structure-Aware and Format-Enhanced Transformer for Accident Report Modeling |
| title_full | Structure-Aware and Format-Enhanced Transformer for Accident Report Modeling |
| title_fullStr | Structure-Aware and Format-Enhanced Transformer for Accident Report Modeling |
| title_full_unstemmed | Structure-Aware and Format-Enhanced Transformer for Accident Report Modeling |
| title_short | Structure-Aware and Format-Enhanced Transformer for Accident Report Modeling |
| title_sort | structure aware and format enhanced transformer for accident report modeling |
| topic | accident report modeling; structure-aware encoding; hierarchical sparse attention; multi-label classification; semantic fusion; emergency safety intelligence |
| url | https://www.mdpi.com/2076-3417/15/14/7928 |
| work_keys_str_mv | AT wenhuazeng structureawareandformatenhancedtransformerforaccidentreportmodeling AT wenhutang structureawareandformatenhancedtransformerforaccidentreportmodeling AT dipingyuan structureawareandformatenhancedtransformerforaccidentreportmodeling AT huizhang structureawareandformatenhancedtransformerforaccidentreportmodeling AT pinshengduan structureawareandformatenhancedtransformerforaccidentreportmodeling AT shikunhu structureawareandformatenhancedtransformerforaccidentreportmodeling |