An enhanced text classification model by the inverted attention orthogonal projection module

Bibliographic Details
Main Authors: Hong Zhao, Chenpeng Zhang, Aolong Wang
Format: Article
Language: English
Published: Taylor & Francis Group, 2023-12-01
Series: Connection Science
Subjects: text classification; orthogonal projection; inverted attention
Online Access: http://dx.doi.org/10.1080/09540091.2023.2173145
author Hong Zhao
Chenpeng Zhang
Aolong Wang
collection DOAJ
description The orthogonal projection method has made significant progress in text classification, especially in generating discriminative features. It obtains purer, more classification-friendly features by projecting text features onto the direction orthogonal to common features (features that do not help classification and in fact degrade performance). However, this approach requires an additional branch network to generate the common features, which makes it less flexible than representation optimisation methods such as self-attention mechanisms, since the base network structure must be substantially modified before it can be used. To address this issue, this paper proposes the Inverted Attention Orthogonal Projection Module (IAOPM). IAOPM uses inverted attention (IA) to iteratively reverse the attention map over the text features, encouraging the network to strip discriminative features out of the text representation and expose the latent common features. Unlike the original orthogonal projection method, IAOPM extracts common features within a single network, without any branch network, which increases the flexibility of the orthogonal projection method. An orthogonal loss is also applied during training to ensure the quality of the common features, so IAOPM achieves better feature purity than the original method. Experiments show that text classification models based on IAOPM outperform baseline models, self-attention mechanisms, and the original orthogonal projection method on multiple text classification datasets, with average accuracy gains of 1.02%, 0.44%, and 0.52%, respectively.
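The description is compact, so the sketch below illustrates, under my own assumptions, how an IAOPM-style layer could look in PyTorch: attention scores are negated and re-normalised to form an inverted attention map that pools a "common" feature, each token feature is then projected onto the direction orthogonal to that common feature, and an orthogonal loss penalises overlap between an ordinarily pooled discriminative feature and the common feature. The class name, the scorer, and the exact form of the loss are illustrative guesses, not the authors' published code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class InvertedAttentionOrthogonalProjection(nn.Module):
    """Illustrative IAOPM-style layer (all names and details are assumptions).

    1. Score tokens, then *invert* the attention map (softmax of negated
       scores) so weight concentrates on weakly discriminative tokens.
    2. Pool them into a "common feature" vector c.
    3. Project each token feature h onto the subspace orthogonal to c:
           h_perp = h - (<h, c> / ||c||^2) * c
    4. Return an orthogonal loss that discourages the discriminative
       pooled feature from overlapping with c (one plausible form).
    """

    def __init__(self, dim: int):
        super().__init__()
        self.score = nn.Linear(dim, 1)  # simple learned token scorer

    def forward(self, h: torch.Tensor):
        # h: (batch, seq_len, dim) token features from any text encoder.
        scores = self.score(h).squeeze(-1)            # (batch, seq_len)
        inv_attn = F.softmax(-scores, dim=-1)         # inverted attention map
        c = torch.einsum("bs,bsd->bd", inv_attn, h)   # common feature

        # Project every token onto the orthogonal complement of c.
        c_norm_sq = (c * c).sum(-1, keepdim=True).clamp_min(1e-8)
        coeff = torch.einsum("bsd,bd->bs", h, c) / c_norm_sq
        h_perp = h - coeff.unsqueeze(-1) * c.unsqueeze(1)

        # Ordinary attention pools a discriminative feature; the orthogonal
        # loss pushes it away from the common feature during training.
        attn = F.softmax(scores, dim=-1)
        d = torch.einsum("bs,bsd->bd", attn, h)
        ortho_loss = F.cosine_similarity(d, c, dim=-1).pow(2).mean()
        return h_perp, ortho_loss
```

A classifier would then consume h_perp (e.g. mean-pooled into a sentence vector) and minimise cross-entropy plus a small multiple of ortho_loss; consistent with the abstract, everything lives inside a single network, with no branch network generating the common feature.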
format Article
id doaj-art-7b82edda8e324e8eaa35faef8e3e4454
institution DOAJ
issn 0954-0091
1360-0494
language English
publishDate 2023-12-01
publisher Taylor & Francis Group
record_format Article
series Connection Science
spelling Hong Zhao, Chenpeng Zhang, Aolong Wang (all Lanzhou University of Technology). An enhanced text classification model by the inverted attention orthogonal projection module. Connection Science 35(1), Taylor & Francis Group, 2023-12-01. ISSN 0954-0091, eISSN 1360-0494. doi:10.1080/09540091.2023.2173145 (article 2173145). Keywords: text classification; orthogonal projection; inverted attention.
title An enhanced text classification model by the inverted attention orthogonal projection module
topic text classification
orthogonal projection
inverted attention
url http://dx.doi.org/10.1080/09540091.2023.2173145