An enhanced text classification model by the inverted attention orthogonal projection module


Bibliographic Details
Main Authors: Hong Zhao, Chenpeng Zhang, Aolong Wang
Format: Article
Language: English
Published: Taylor & Francis Group 2023-12-01
Series: Connection Science
Online Access: http://dx.doi.org/10.1080/09540091.2023.2173145
Summary: The orthogonal projection method has made significant progress in text classification, especially in generating discriminative features. It obtains purer, more classification-suitable features by projecting text features onto the direction orthogonal to common features (features that are unhelpful for classification and in fact degrade performance). However, the approach requires an additional branch network to generate these common features, which makes it less flexible than representation-optimisation methods such as self-attention mechanisms, since the base network structure must be modified substantially before it can be used. To address this issue, this paper proposes the Inverted Attention Orthogonal Projection Module (IAOPM). IAOPM uses inverted attention (IA) to iteratively reverse the attention map over the text features, encouraging the network to strip discriminative features away and expose the latent common features. Unlike the original orthogonal projection method, IAOPM extracts common features within a single network, without any branch networks, which increases the flexibility of the orthogonal projection method. An orthogonal loss is also applied during training to ensure the quality of the common features, so IAOPM achieves better feature purity than the original method. Experiments show that text classification models based on IAOPM outperform the baseline models, self-attention mechanisms, and the original orthogonal projection method on multiple text classification datasets, with average accuracy gains of 1.02%, 0.44%, and 0.52%, respectively.
ISSN: 0954-0091
1360-0494