Research on Spam Filters Based on NB Algorithm

Bibliographic Details
Main Author: Su Shengyue
Format: Article
Language: English
Published: EDP Sciences 2025-01-01
Series: ITM Web of Conferences
Online Access: https://www.itm-conferences.org/articles/itmconf/pdf/2025/01/itmconf_dai2024_01016.pdf
Description
Summary: Spam filtering is a crucial part of network security. As spam becomes more complex, traditional rule-based methods struggle to meet the needs of modern email systems. This study uses the SpamAssassin dataset to explore the Naive Bayes (NB) algorithm for spam detection. The algorithm classified large-scale text data accurately and efficiently, achieving an accuracy of 97.74%, a recall of 96.60%, a precision of 96.8%, and an F1 score of 0.97. Confusion matrix and Receiver Operating Characteristic (ROC) curve analyses confirmed the model’s effectiveness in spam filtering, showing a high True Positive Rate (TPR) and a low False Positive Rate (FPR). However, the NB algorithm’s independence assumption, which treats features as conditionally independent given the class, is a limitation that may degrade performance in more complex spam scenarios. Future work may focus on improving the model’s accuracy and robustness by combining it with other machine learning approaches, such as Support Vector Machines (SVMs) and deep learning techniques.
ISSN: 2271-2097
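
Illustrative sketch: the summary describes Naive Bayes spam classification and reports accuracy, precision, recall, F1, and FPR. The Python below is a minimal sketch of that general technique, not the author's implementation. It assumes a multinomial model with Laplace smoothing and whitespace tokenization; the tiny corpus and the names train_nb, predict_nb, and binary_metrics are hypothetical and are not taken from the paper or the SpamAssassin dataset.

```python
import math
from collections import Counter

# Hypothetical toy corpus for illustration only (not SpamAssassin data).
TRAIN_DOCS = [
    "win free money now",
    "free prize claim now",
    "meeting agenda for monday",
    "lunch with the project team",
]
TRAIN_LABELS = ["spam", "spam", "ham", "ham"]


def train_nb(docs, labels, alpha=1.0):
    """Fit a multinomial Naive Bayes model with Laplace smoothing (alpha)."""
    classes = sorted(set(labels))
    word_counts = {c: Counter() for c in classes}
    doc_counts = Counter(labels)
    for text, label in zip(docs, labels):
        word_counts[label].update(text.lower().split())
    vocab = {w for c in classes for w in word_counts[c]}
    return {
        "classes": classes,
        "vocab": vocab,
        "alpha": alpha,
        "word_counts": word_counts,
        "totals": {c: sum(word_counts[c].values()) for c in classes},
        "log_prior": {c: math.log(doc_counts[c] / len(docs)) for c in classes},
    }


def predict_nb(model, text):
    """Return the class with the highest log posterior score.

    The independence assumption shows up here: per-word log likelihoods
    are simply added, as if words were independent given the class.
    """
    vocab_size = len(model["vocab"])
    scores = {}
    for c in model["classes"]:
        score = model["log_prior"][c]
        for word in text.lower().split():
            if word not in model["vocab"]:
                continue  # skip words never seen in training
            num = model["word_counts"][c][word] + model["alpha"]
            den = model["totals"][c] + model["alpha"] * vocab_size
            score += math.log(num / den)
        scores[c] = score
    return max(scores, key=scores.get)


def binary_metrics(y_true, y_pred, positive="spam"):
    """Compute confusion-matrix metrics, treating `positive` as the spam class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0  # recall equals TPR
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs) if pairs else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "fpr": fp / (fp + tn) if fp + tn else 0.0,
    }


if __name__ == "__main__":
    model = train_nb(TRAIN_DOCS, TRAIN_LABELS)
    print(predict_nb(model, "claim your free money"))
    print(binary_metrics(["spam", "spam", "ham", "ham"],
                         ["spam", "ham", "ham", "ham"]))
```

Because F1 is the harmonic mean of precision and recall, the reported precision of 96.8% and recall of 96.60% give an F1 of about 0.967, which agrees with the 0.97 stated in the summary.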