Structure-Aware and Format-Enhanced Transformer for Accident Report Modeling

Modeling accident investigation reports is crucial for elucidating accident causation mechanisms, analyzing risk evolution processes, and formulating effective accident prevention strategies. However, such reports are typically long, hierarchically structured, and information-dense, posing unique challenges for existing language models. To address these domain-specific characteristics, this study proposes SAFE-Transformer, a Structure-Aware and Format-Enhanced Transformer designed for long-document modeling in the emergency safety context. SAFE-Transformer adopts a dual-stream encoding architecture to separately model symbolic section features and heading text, integrates hierarchical depth and format types into positional encodings, and introduces a dynamic gating unit to adaptively fuse headings with paragraph semantics. We evaluate the model on a multi-label accident intelligence classification task using a real-world corpus of 1632 official reports from high-risk industries. Results demonstrate that SAFE-Transformer effectively captures hierarchical semantic structure and outperforms strong long-text baselines. Further analysis reveals an inverted U-shaped performance trend across varying report lengths and highlights the role of attention sparsity and label distribution in long-text modeling. This work offers a practical solution for structurally complex safety documents and provides methodological insights for downstream applications in safety supervision and risk analysis.
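The dynamic gating unit described in the abstract can be illustrated with a minimal sketch. This is a hypothetical reconstruction, not the authors' implementation: the function name `gated_fusion` and the weight parameters `W` and `b` are illustrative. A sigmoid gate computed from the concatenated heading and paragraph vectors decides, per dimension, how much heading semantics to mix into the fused representation.

```python
import numpy as np

def gated_fusion(heading, paragraph, W, b):
    """Sketch of a dynamic gating unit: a sigmoid gate over the
    concatenated heading/paragraph vectors controls, per dimension,
    how much heading semantics enters the fused representation."""
    z = np.concatenate([heading, paragraph])      # shape [2d]
    gate = 1.0 / (1.0 + np.exp(-(W @ z + b)))     # sigmoid gate in (0, 1)^d
    return gate * heading + (1.0 - gate) * paragraph

# With zero weights the gate is exactly 0.5, so the output is the mean
# of the two vectors; trained weights would shift this balance adaptively.
d = 4
h = np.ones(d)            # stand-in heading embedding
p = np.zeros(d)           # stand-in paragraph embedding
W = np.zeros((d, 2 * d))
b = np.zeros(d)
fused = gated_fusion(h, p, W, b)   # -> [0.5, 0.5, 0.5, 0.5]
```

In this formulation the gate is element-wise, so different embedding dimensions can weight heading and paragraph information differently, which matches the abstract's description of adaptive fusion.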


Bibliographic Details
Main Authors: Wenhua Zeng, Wenhu Tang, Diping Yuan, Hui Zhang, Pinsheng Duan, Shikun Hu
Format: Article
Language: English
Published: MDPI AG, 2025-07-01
Series: Applied Sciences
Subjects: accident report modeling; structure-aware encoding; hierarchical sparse attention; multi-label classification; semantic fusion; emergency safety intelligence
Online Access: https://www.mdpi.com/2076-3417/15/14/7928
Collection: DOAJ
Record ID: doaj-art-7708c4c6e2e54fbbb8a1349f6599eecf
Institution: Kabale University
ISSN: 2076-3417
DOI: 10.3390/app15147928
Citation: Applied Sciences, Volume 15, Issue 14, Article 7928 (2025-07-01)
Author affiliations:
- Wenhua Zeng, Wenhu Tang: School of Electric Power Engineering, South China University of Technology, Guangzhou 510641, China
- Diping Yuan: Shenzhen Research Institute, China University of Mining and Technology, Shenzhen 518057, China
- Hui Zhang, Shikun Hu: Shenzhen Urban Public Safety and Technology Institute, Shenzhen 518024, China
- Pinsheng Duan: School of Mechanics and Civil Engineering, China University of Mining and Technology, Xuzhou 221116, China
Keywords: accident report modeling; structure-aware encoding; hierarchical sparse attention; multi-label classification; semantic fusion; emergency safety intelligence