Design and implementation of spam filtering system based on topic model

Bibliographic Details
Main Authors: Xiaohuai KOU, Hua CHENG
Format: Article
Language: Chinese (zho)
Published: Beijing Xintong Media Co., Ltd 2017-11-01
Series: Dianxin kexue
Online Access: http://www.telecomsci.com/zh/article/doi/10.11959/j.issn.1000-0801.2017313/
Description
Summary: Spam filtering technology plays a key role in many areas, including information security, transmission efficiency, and automatic information classification. Spam degrades the user experience and can cause unnecessary loss of money and time. The shortcomings of existing spam filtering technology were examined, and a naive Bayes spam classification method based on multiple keywords was proposed. To model mail topics, the LDA topic model was used to obtain the topics and keywords associated with a message, and Word2Vec was then used to find synonyms and related words for those keywords, extending the keyword set. For mail classification, the prior probability of each word in the training dataset was obtained by statistical learning. Based on the extended keyword set and these probabilities, the joint probability of a topic and a message was derived with Bayes' formula and used as the basis for the spam decision. The resulting spam filtering system based on the topic model is also simple and easy to apply. Comparative experiments with other typical spam filtering methods show that the topic-model-based spam classification method and its Word2Vec-based improvement can effectively improve the accuracy of spam filtering.
ISSN: 1000-0801
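
The pipeline outlined in the summary (topic keywords, keyword expansion, naive Bayes scoring) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The LDA and Word2Vec steps are replaced by hand-written placeholders (`topic_keywords` and `related_words` are hypothetical), the training mails are toy data, and Laplace smoothing with `alpha=1.0` is an assumption that the record does not mention.

```python
import math
from collections import Counter


def expand_keywords(keywords, related):
    """Extend the keyword set with related words (stand-in for a Word2Vec lookup)."""
    expanded = set(keywords)
    for word in keywords:
        expanded.update(related.get(word, ()))
    return expanded


def class_priors(labels):
    """Relative frequency of each class in the training labels."""
    counts = Counter(labels)
    return {label: n / len(labels) for label, n in counts.items()}


def train_word_probs(docs, labels, vocabulary, alpha=1.0):
    """Estimate P(word | class) over the keyword vocabulary with Laplace smoothing."""
    counts = {label: Counter() for label in set(labels)}
    for tokens, label in zip(docs, labels):
        counts[label].update(t for t in tokens if t in vocabulary)
    probs = {}
    for label, counter in counts.items():
        total = sum(counter.values()) + alpha * len(vocabulary)
        probs[label] = {w: (counter[w] + alpha) / total for w in vocabulary}
    return probs


def classify(tokens, word_probs, priors, vocabulary):
    """Return the class with the highest log joint probability, plus all scores."""
    scores = {}
    for label, prior in priors.items():
        score = math.log(prior)
        for t in tokens:
            if t in vocabulary:  # words outside the keyword set are ignored
                score += math.log(word_probs[label][t])
        scores[label] = score
    return max(scores, key=scores.get), scores


# Hypothetical inputs: in the described system these would come from LDA and Word2Vec.
topic_keywords = {"prize", "loan"}
related_words = {"prize": ["winner", "reward"], "loan": ["credit"]}
vocab = expand_keywords(topic_keywords, related_words)

# Toy training data, invented for illustration only.
train_docs = [
    ["claim", "your", "prize", "winner"],
    ["cheap", "loan", "credit", "offer"],
    ["meeting", "agenda", "attached"],
    ["project", "report", "credit", "to", "team"],
]
train_labels = ["spam", "spam", "ham", "ham"]

priors = class_priors(train_labels)
word_probs = train_word_probs(train_docs, train_labels, vocab)

spam_label, _ = classify(["you", "are", "a", "winner"], word_probs, priors, vocab)
ham_label, _ = classify(["project", "credit", "report"], word_probs, priors, vocab)
print(spam_label, ham_label)
```

In this toy setup the first test message contains none of the original topic keywords. It can only be scored through "winner", which enters the vocabulary via the expansion step. This is the kind of case the keyword extension is meant to cover.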