ERNIE-TextCNN: research on classification methods of Chinese news headlines in different situations
| Main Author: | |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Nature Portfolio, 2025-08-01 |
| Series: | Scientific Reports |
| Subjects: | |
| Online Access: | https://doi.org/10.1038/s41598-025-14955-4 |
| Summary: | Abstract Driven by the rapid development of the internet and the era of data explosion, the efficiency of news dissemination has improved to an unprecedented degree, and the volume of text data has grown dramatically. Facing the public’s demand for “quick browsing,” Chinese news headlines, characterized by their extremely short text, suffer from limited information, sparse features, and high ambiguity. To rapidly extract deep features from news headlines and improve classification performance on extremely short Chinese news headlines, we examine the inherent characteristics of news headline data, focusing on multi-domain news classification problems and studying datasets of different scales. For the classification of large-scale extremely short Chinese news headline datasets, which are affected by feature sparsity and insufficient representation, we construct an improved convolutional classification model, ERNIE-AAFF-SECNN, based on an adaptive feature fusion mechanism. First, the model employs an attention-based adaptive feature fusion module to dynamically learn and fuse the character feature representations output by the multiple Transformer layers in ERNIE, deepening the model’s understanding of the semantic relevance of characters across different headlines. It then combines a BiLSTM network to capture global feature information and introduces the SE attention mechanism to improve the TextCNN network, applying weighted convolution to the BiLSTM hidden states so that local feature information is extracted more precisely. Finally, it uses two fully connected layers to adjust the output dimensions to the classification task, with a ReLU activation between them to enhance the model’s expressive power. 
For the classification of small-scale Chinese news headline datasets, which are limited by size and information scarcity, we construct a depthwise separable convolutional classification model, ERNIE-MSSE-DSCNN, based on a multi-scale SE attention mechanism. First, the dataset is expanded using AEDA data augmentation based on word-level information. Then, depthwise separable convolution replaces the traditional convolutional layers in TextCNN, mitigating the risk of overfitting. An MSSE attention module is proposed to integrate, at multiple scales, the global feature information output by the convolutional layers, dynamically weighting the convolutional feature maps to further strengthen the model’s ability to capture key information. Finally, the FGM strategy is introduced for adversarial training to improve the model’s robustness and generalization. Experiments on two large datasets with different numbers of categories show significant accuracy improvements over multiple baseline models. |
|---|---|
| ISSN: | 2045-2322 |
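The abstract's SE attention mechanism refers to the standard Squeeze-and-Excitation idea: pool each feature channel to a scalar, pass the pooled vector through a bottleneck MLP, and re-weight the channels with the resulting sigmoid gate. The paper's exact module is not specified beyond the abstract, so the following is a minimal numpy sketch with hypothetical shapes and random (untrained) weights, not the authors' implementation:

```python
import numpy as np

def se_attention(feature_maps, reduction=4, rng=None):
    """Squeeze-and-Excitation channel attention (illustrative sketch).

    feature_maps: array of shape (channels, length), e.g. convolutional
    feature maps over a headline's token sequence.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    c, _ = feature_maps.shape
    # Squeeze: global average pooling collapses each channel to one scalar.
    z = feature_maps.mean(axis=1)                      # shape (c,)
    # Excitation: bottleneck MLP (random weights here; learned in practice).
    w1 = rng.standard_normal((c // reduction, c)) * 0.1
    w2 = rng.standard_normal((c, c // reduction)) * 0.1
    s = np.maximum(w1 @ z, 0.0)                        # ReLU
    gate = 1.0 / (1.0 + np.exp(-(w2 @ s)))             # sigmoid gate in (0, 1)
    # Re-weight: scale each channel by its learned importance.
    return feature_maps * gate[:, None]
```

In the model described above, the gate would be applied to convolutional feature maps computed from BiLSTM hidden states, so that informative channels are amplified before pooling and classification.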
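The small-dataset model swaps TextCNN's standard convolutions for depthwise separable convolutions, which factor one convolution into a per-channel (depthwise) filter followed by a 1x1 (pointwise) channel mix, cutting parameters from c_out*c_in*k to c_in*k + c_out*c_in. A minimal 1-D numpy sketch of that factorization, with hypothetical shapes (not the paper's code):

```python
import numpy as np

def depthwise_separable_conv1d(x, depth_kernels, point_weights):
    """1-D depthwise separable convolution (illustrative sketch).

    x: (channels_in, length); depth_kernels: (channels_in, k);
    point_weights: (channels_out, channels_in).
    """
    c_in, length = x.shape
    k = depth_kernels.shape[1]
    out_len = length - k + 1
    # Depthwise: each input channel is filtered independently ('valid' mode).
    # np.convolve flips its kernel, so reverse it to get cross-correlation.
    depth_out = np.empty((c_in, out_len))
    for c in range(c_in):
        depth_out[c] = np.convolve(x[c], depth_kernels[c][::-1], mode="valid")
    # Pointwise: a 1x1 convolution mixes channels at every position.
    return point_weights @ depth_out
```

With fewer parameters than a full convolution, this layer is less prone to overfitting on a small headline dataset, which matches the motivation given in the abstract.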
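AEDA ("An Easier Data Augmentation") expands a text dataset by inserting random punctuation marks between tokens, leaving the original words untouched. The abstract only names the technique, so this is a hedged sketch of the general recipe with an assumed insertion ratio, not the authors' configuration:

```python
import random

PUNCS = [",", ".", ";", "?", "!", ":"]

def aeda_augment(tokens, ratio=0.3, rng=None):
    """AEDA-style augmentation: insert random punctuation between tokens.

    ratio controls how many marks are inserted relative to sentence length
    (0.3 here is an assumption, not a value from the paper).
    """
    rng = rng or random.Random(0)
    n_insert = max(1, int(len(tokens) * ratio))
    out = list(tokens)
    for _ in range(n_insert):
        pos = rng.randrange(len(out) + 1)   # any gap, including the ends
        out.insert(pos, rng.choice(PUNCS))
    return out
```

Because the original tokens are preserved in order, the label of the augmented headline stays valid, which is what makes this augmentation safe for very short texts.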