Study on Chinese spam filtering system based on Bayes algorithm
To address the high feature dimensionality of Chinese spam filtering systems, a TF-IDF feature extraction algorithm based on central word extension was proposed. The algorithm improves the expressive capacity of the nodes in the network and reduces the feature dimension. ...
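As a rough illustration of the TF-IDF weighting that the abstract builds on, the following is a minimal sketch of plain TF-IDF with top-k feature selection. It is not the paper's central-word-extension variant, which this record does not describe in detail. The function names `tfidf` and `top_k_terms` are hypothetical, and the input is assumed to be already word-segmented (Chinese text would need a segmenter first).

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document.

    docs: list of token lists (assumed to be already word-segmented).
    Uses the textbook form tf * log(N / df); not the paper's variant.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


def top_k_terms(weights, k):
    """Keep the k terms with the highest summed TF-IDF weight.

    A simple (assumed) way to cut the feature dimension.
    """
    totals = Counter()
    for w in weights:
        totals.update(w)  # adds the float weights per term
    return [term for term, _ in totals.most_common(k)]
```

A term that occurs in every document receives weight zero, so it is the first to be dropped when only the top-k terms are kept as features.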
Main Authors: | Haoran LIU, Pan DING, Changjiang GUO, Jinfeng CHANG, Jingchuang CUI |
---|---|
Format: | Article |
Language: | zho |
Published: | Editorial Department of Journal on Communications, 2018-12-01 |
Series: | Tongxin xuebao |
Subjects: | Bayesian network; TF-IDF; genetic algorithm; short text classification; Chinese spam filtering |
Online Access: | http://www.joconline.com.cn/zh/article/doi/10.11959/j.issn.1000−436x.2018281/ |
_version_ | 1841539403839700992 |
---|---|
author | Haoran LIU; Pan DING; Changjiang GUO; Jinfeng CHANG; Jingchuang CUI |
collection | DOAJ |
description | To address the high feature dimensionality of Chinese spam filtering systems, a TF-IDF feature extraction algorithm based on central word extension was proposed. The algorithm improves the expressive capacity of the nodes in the network and reduces the feature dimension. Further, a three-layer structure model based on the GWO_GA structure learning algorithm was proposed to expand the limit of text features and improve their diversity. The new structure learning algorithm relaxes the conditional independence assumption on feature attributes. A fine classification layer was added between the class layer and the feature layer to increase feature coverage. Experiments demonstrate that the three-layer Bayesian network algorithm, combining TF-IDF feature extraction based on central word extension with GWO_GA structure learning, improves the performance of Chinese spam filtering. |
format | Article |
id | doaj-art-10b13846cb364f30af974c1510c0cad9 |
institution | Kabale University |
issn | 1000-436X |
language | zho |
publishDate | 2018-12-01 |
publisher | Editorial Department of Journal on Communications |
record_format | Article |
series | Tongxin xuebao |
title | Study on Chinese spam filtering system based on Bayes algorithm |
topic | Bayesian network; TF-IDF; genetic algorithm; short text classification; Chinese spam filtering |
url | http://www.joconline.com.cn/zh/article/doi/10.11959/j.issn.1000−436x.2018281/ |