Key n-Gram Extractions and Analyses of Different Registers Based on Attention Network

Key n-gram extraction can be seen as extracting the n-grams that distinguish different registers. Keyword extraction models (the n = 1 case: a 1-gram is a keyword) are generally approached from two aspects: feature extraction and model design. By summarizing the advantages and disadvantages of exis...


Bibliographic Details
Main Authors: Haiyan Wu, Ying Liu, Shaoyun Shi, Qingfeng Wu, Yunlong Huang
Format: Article
Language: English
Published: Wiley 2021-01-01
Series: Journal of Applied Mathematics
Online Access: http://dx.doi.org/10.1155/2021/5264090
_version_ 1832550569270575104
author Haiyan Wu
Ying Liu
Shaoyun Shi
Qingfeng Wu
Yunlong Huang
author_facet Haiyan Wu
Ying Liu
Shaoyun Shi
Qingfeng Wu
Yunlong Huang
author_sort Haiyan Wu
collection DOAJ
description Key n-gram extraction can be seen as extracting the n-grams that distinguish different registers. Keyword extraction models (the n = 1 case: a 1-gram is a keyword) are generally approached from two aspects: feature extraction and model design. By summarizing the advantages and disadvantages of existing models, we propose a novel key n-gram extraction model, the "attentive n-gram network" (ANN), based on the attention mechanism and a multilayer perceptron. The attention mechanism scores each n-gram in a sentence by mining the internal semantic relationships between words, and these scores give the n-grams' importance. Experimental results on a real corpus show that the key n-grams extracted by our model distinguish novels, news, and textbooks very well, and that the accuracy of our model is significantly higher than that of the baseline models. We also conduct clustering experiments on the key n-grams extracted from these registers, which turn out to cluster well. Furthermore, we present statistical analyses of the key n-gram extraction results and find that the key n-grams extracted by our model are highly interpretable from a linguistic point of view.
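The description above says the attention mechanism scores each n-gram by mining semantic relationships between its words. The paper's actual architecture is not reproduced in this record, so the following is only a minimal illustrative sketch of the general idea: toy self-attention over stand-in word embeddings, with each bigram's importance taken as the mean attention inside its window. All names, dimensions, and the random embeddings are assumptions, not the authors' implementation.

```python
# Illustrative sketch (not the authors' ANN model): score each bigram in a
# sentence with toy self-attention, then normalize scores into importances.
import numpy as np

rng = np.random.default_rng(0)

def ngram_attention_scores(embeddings: np.ndarray, n: int = 2) -> np.ndarray:
    """Score each n-gram by the mean pairwise attention among its words."""
    d = embeddings.shape[1]
    # Scaled dot-product similarity between all word pairs, softmax per row.
    logits = embeddings @ embeddings.T / np.sqrt(d)
    attn = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    num_ngrams = embeddings.shape[0] - n + 1
    scores = np.empty(num_ngrams)
    for i in range(num_ngrams):
        block = attn[i:i + n, i:i + n]   # attention within the n-gram window
        scores[i] = block.mean()         # importance = mean internal attention
    # Normalize so the scores over all n-grams in the sentence sum to 1.
    return scores / scores.sum()

words = ["the", "stock", "market", "fell", "sharply"]
emb = rng.normal(size=(len(words), 8))   # stand-in word embeddings
scores = ngram_attention_scores(emb, n=2)
for (w1, w2), s in zip(zip(words, words[1:]), scores):
    print(f"{w1} {w2}: {s:.3f}")
```

In the paper the scores presumably come from a trained attention network rather than random embeddings, and a multilayer perceptron sits on top for register classification; this sketch only shows how per-n-gram importances can fall out of pairwise word attention.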
format Article
id doaj-art-8539fa294b934bf9a34cca43098e1048
institution Kabale University
issn 1110-757X
1687-0042
language English
publishDate 2021-01-01
publisher Wiley
record_format Article
series Journal of Applied Mathematics
spelling doaj-art-8539fa294b934bf9a34cca43098e10482025-02-03T06:06:27ZengWileyJournal of Applied Mathematics1110-757X1687-00422021-01-01202110.1155/2021/52640905264090Key n-Gram Extractions and Analyses of Different Registers Based on Attention NetworkHaiyan Wu0Ying Liu1Shaoyun Shi2Qingfeng Wu3Yunlong Huang4Zhejiang University of Finance and Economics, Hangzhou 310018, ChinaSchool of Humanities, Tsinghua University, Beijing 100084, ChinaDepartment of Computer Science and Technology, Institute for Artificial Intelligence, Tsinghua University, Beijing 100084, ChinaChengdu Polytechnic, Chengdu 610095, ChinaBeijing Normal University, Beijing 100875, ChinaKeyn-gram extraction can be seen as extracting n-grams which can distinguish different registers. Keyword (as n=1, 1-gram is the keyword) extraction models are generally carried out from two aspects, the feature extraction and the model design. By summarizing the advantages and disadvantages of existing models, we propose a novel key n-gram extraction model “attentive n-gram network” (ANN) based on the attention mechanism and multilayer perceptron, in which the attention mechanism scores each n-gram in a sentence by mining the internal semantic relationship between words, and their importance is given by the scores. Experimental results on the real corpus show that the key n-gram extracted from our model can distinguish a novel, news, and text book very well; the accuracy of our model is significantly higher than the baseline model. Also, we conduct experiments on key n-grams extracted from these registers, which turned out to be well clustered. Furthermore, we make some statistical analyses of the results of key n-gram extraction. We find that the key n-grams extracted by our model are very explanatory in linguistics.http://dx.doi.org/10.1155/2021/5264090
spellingShingle Haiyan Wu
Ying Liu
Shaoyun Shi
Qingfeng Wu
Yunlong Huang
Key n-Gram Extractions and Analyses of Different Registers Based on Attention Network
Journal of Applied Mathematics
title Key n-Gram Extractions and Analyses of Different Registers Based on Attention Network
title_full Key n-Gram Extractions and Analyses of Different Registers Based on Attention Network
title_fullStr Key n-Gram Extractions and Analyses of Different Registers Based on Attention Network
title_full_unstemmed Key n-Gram Extractions and Analyses of Different Registers Based on Attention Network
title_short Key n-Gram Extractions and Analyses of Different Registers Based on Attention Network
title_sort key n gram extractions and analyses of different registers based on attention network
url http://dx.doi.org/10.1155/2021/5264090
work_keys_str_mv AT haiyanwu keyngramextractionsandanalysesofdifferentregistersbasedonattentionnetwork
AT yingliu keyngramextractionsandanalysesofdifferentregistersbasedonattentionnetwork
AT shaoyunshi keyngramextractionsandanalysesofdifferentregistersbasedonattentionnetwork
AT qingfengwu keyngramextractionsandanalysesofdifferentregistersbasedonattentionnetwork
AT yunlonghuang keyngramextractionsandanalysesofdifferentregistersbasedonattentionnetwork