A Robust Method to Protect Text Classification Models against Adversarial Attacks

Text classification is one of the main tasks in natural language processing. Recently, adversarial attacks have been shown to have a substantial negative impact on neural network-based text classification models. Few defenses exist to strengthen model predictions against adversarial attacks; the most popular among them are adversarial training and spelling correction. While adversarial training adds different synonyms to the training data, spelling correction methods defend against character-level variations within words. The diversity and sparseness of the adversarial perturbations produced by different attack methods challenge both approaches. This paper proposes an approach to correct adversarial samples for text classification tasks that combines grammar correction and spelling correction: we use Gramformer for grammar correction and TextBlob for spelling correction. The approach is generic and can be applied to any text classification model without retraining. We evaluated our approach against two state-of-the-art attacks, DeepWordBug and TextBugger, on three open-source datasets: IMDB, CoLA, and AGNews. The experimental results show that our approach can effectively counter adversarial attacks on text classification models while maintaining classification performance on the original clean data.

Bibliographic Details
Main Authors: Bala Mallikarjunarao Garlapati, Ajeet Kumar Singh, Srinivasa Rao Chalamala
Format: Article
Language: English
Published: LibraryPress@UF, 2022-05-01
Series: Proceedings of the International Florida Artificial Intelligence Research Society Conference
Online Access: https://journals.flvc.org/FLAIRS/article/view/130706
_version_ 1850271208509014016
author BALA MALLIKARJUNARAO GARLAPATI
Ajeet Kumar Singh
Srinivasa Rao Chalamala
author_facet BALA MALLIKARJUNARAO GARLAPATI
Ajeet Kumar Singh
Srinivasa Rao Chalamala
author_sort BALA MALLIKARJUNARAO GARLAPATI
collection DOAJ
description Text classification is one of the main tasks in natural language processing. Recently, adversarial attacks have been shown to have a substantial negative impact on neural network-based text classification models. Few defenses exist to strengthen model predictions against adversarial attacks; the most popular among them are adversarial training and spelling correction. While adversarial training adds different synonyms to the training data, spelling correction methods defend against character-level variations within words. The diversity and sparseness of the adversarial perturbations produced by different attack methods challenge both approaches. This paper proposes an approach to correct adversarial samples for text classification tasks that combines grammar correction and spelling correction: we use Gramformer for grammar correction and TextBlob for spelling correction. The approach is generic and can be applied to any text classification model without retraining. We evaluated our approach against two state-of-the-art attacks, DeepWordBug and TextBugger, on three open-source datasets: IMDB, CoLA, and AGNews. The experimental results show that our approach can effectively counter adversarial attacks on text classification models while maintaining classification performance on the original clean data.
format Article
id doaj-art-b0182b4c00c64be2b745b8178e11b51d
institution OA Journals
issn 2334-0754
2334-0762
language English
publishDate 2022-05-01
publisher LibraryPress@UF
record_format Article
series Proceedings of the International Florida Artificial Intelligence Research Society Conference
spelling doaj-art-b0182b4c00c64be2b745b8178e11b51d
indexed 2025-08-20T01:52:18Z
language eng
publisher LibraryPress@UF
series Proceedings of the International Florida Artificial Intelligence Research Society Conference
issn 2334-0754, 2334-0762
publishDate 2022-05-01
volume 35
doi 10.32473/flairs.v35i.130706
article_id 66905
title A Robust Method to Protect Text Classification Models against Adversarial Attacks
authors Bala Mallikarjunarao Garlapati (Tata Consultancy Services), Ajeet Kumar Singh, Srinivasa Rao Chalamala
url https://journals.flvc.org/FLAIRS/article/view/130706
spellingShingle BALA MALLIKARJUNARAO GARLAPATI
Ajeet Kumar Singh
Srinivasa Rao Chalamala
A Robust Method to Protect Text Classification Models against Adversarial Attacks
Proceedings of the International Florida Artificial Intelligence Research Society Conference
title A Robust Method to Protect Text Classification Models against Adversarial Attacks
title_full A Robust Method to Protect Text Classification Models against Adversarial Attacks
title_fullStr A Robust Method to Protect Text Classification Models against Adversarial Attacks
title_full_unstemmed A Robust Method to Protect Text Classification Models against Adversarial Attacks
title_short A Robust Method to Protect Text Classification Models against Adversarial Attacks
title_sort robust method to protect text classification models against adversarial attacks
url https://journals.flvc.org/FLAIRS/article/view/130706
work_keys_str_mv AT balamallikarjunaraogarlapati arobustmethodtoprotecttextclassificationmodelsagainstadversarialattacks
AT ajeetkumarsingh arobustmethodtoprotecttextclassificationmodelsagainstadversarialattacks
AT srinivasaraochalamala arobustmethodtoprotecttextclassificationmodelsagainstadversarialattacks
AT balamallikarjunaraogarlapati robustmethodtoprotecttextclassificationmodelsagainstadversarialattacks
AT ajeetkumarsingh robustmethodtoprotecttextclassificationmodelsagainstadversarialattacks
AT srinivasaraochalamala robustmethodtoprotecttextclassificationmodelsagainstadversarialattacks