Generative Adversarial learning with Negative Data Augmentation for Semi-supervised Text Classification

In recent years, semi-supervised generative adversarial network (SS-GAN) models such as GAN-BERT have achieved promising results on text classification. One technique these models use to mitigate generator mode collapse is feature matching (FM). Although FM addresses some of the critical issues of SS-GANs, these models still suffer from mode collapse, with missing coverage outside the data manifold. Moreover, FM only loosely matches the distributions of the real data and the fake generated samples, so the generator can produce fake samples inside high-density regions of the data manifold, which the discriminator then learns to misclassify as out-of-data-manifold regions. In this work, we employ the negative data augmentation (NDA) technique, for the first time in text classification, to alleviate these problems. NDA produces out-of-distribution fake examples by applying a mixup transformation to the fake samples and augmented real data. In our new model (NDA-GAN), we produce NDA samples by combining the generator's output with the contextual representation of the real data. As a result of the mixing, NDA samples are less likely to fall in high-density regions, and because they are blended with real data representations, they remain reasonably close to the data manifold. Consequently, the NDA samples increase the discriminator's power to find the optimal decision boundary. Our experimental results demonstrate that the negative augmented samples improve the overall accuracy of our proposed model and make it more confident when detecting out-of-distribution samples.
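The mixing step described in the abstract can be sketched as a standard mixup-style convex combination of generator outputs and real-data representations. The function below is a minimal illustration under stated assumptions: the Beta-distribution parameter `alpha`, the per-example sampling of the mixing coefficient, and the placeholder arrays are illustrative choices, not the paper's exact settings.

```python
import numpy as np

def nda_samples(fake, real_repr, alpha=0.4, rng=None):
    """Mixup-style negative data augmentation (illustrative sketch).

    Blends generator outputs with contextual representations of real
    data so the resulting negatives fall off the high-density regions
    while staying close to the data manifold.
    """
    rng = rng or np.random.default_rng(0)
    # One mixing coefficient per example, broadcast across features.
    lam = rng.beta(alpha, alpha, size=(fake.shape[0], 1))
    return lam * fake + (1.0 - lam) * real_repr

# Stand-ins for generator outputs and encoded real sentences.
fake = np.zeros((4, 8))
real = np.ones((4, 8))
mixed = nda_samples(fake, real)
```

Because the result is a convex combination, each mixed sample lies on the segment between its fake and real inputs, which is what keeps the negatives near, but off, the data manifold.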


Bibliographic Details
Main Authors: Shahriar Shayesteh, Diana Inkpen
Format: Article
Language:English
Published: LibraryPress@UF 2022-05-01
Series:Proceedings of the International Florida Artificial Intelligence Research Society Conference
Subjects: nlp; text classification; negative data augmentation; gans
Online Access:https://journals.flvc.org/FLAIRS/article/view/130722
ISSN: 2334-0754, 2334-0762
DOI: 10.32473/flairs.v35i.130722
Author Affiliations: Shahriar Shayesteh (Electrical Engineering, University of Ottawa); Diana Inkpen (University of Ottawa)