Effectiveness Evaluation of Random Forest, Naive Bayes, and Support Vector Machine Models for KDDCUP99 Anomaly Detection Based on K-means Clustering

Bibliographic Details
Main Author: Zhang Majun
Format: Article
Language: English
Published: EDP Sciences 2025-01-01
Series: ITM Web of Conferences
Online Access: https://www.itm-conferences.org/articles/itmconf/pdf/2025/01/itmconf_dai2024_04010.pdf
Description
Summary: Security on the World Wide Web has recently seen an enormous upgrade in almost every aspect. Identifying malicious activities in a network, such as network attacks and malicious users, plays a significant role in these upgraded security directions. This research utilizes the KDDCUP99 dataset to incorporate K-means clustering with three classifiers: Random Forest (RF), Naïve Bayes (NB), and Support Vector Machine (SVM), with the goal of boosting the accuracy of predicting network intrusions. In this paper, the K-means clustering technique is applied as a preprocessing step to enhance the overall quality of network intrusion detection and maximize the accuracy of the network security measures. The goal is to identify anomalies with high accuracy. Experimental results indicate that the optimal combination is K-means + RF, which outperformed the others in precision, recall, and F1-score. Although K-means + NB demonstrated superior recall for certain smaller anomalies, it underperformed compared to the RF model. The paper concludes by highlighting the value of ensemble approaches, in particular Random Forest, for tackling anomaly detection and network security issues, particularly in light of the expanding significance of social networks and the internet.
ISSN: 2271-2097
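
Illustration: the summary describes K-means used as a preprocessing step ahead of three classifiers, compared on precision, recall, and F1-score. The record does not say how the clustering output is passed to the classifiers, so the following is only a minimal sketch under stated assumptions: it uses scikit-learn, a small synthetic dataset as a stand-in for KDDCUP99, four clusters, and the K-means cluster id appended as one extra feature. None of these choices is taken from the paper.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import precision_recall_fscore_support
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Assumption: synthetic two-class data stands in for KDDCUP99 (not loaded here).
X, y = make_classification(n_samples=600, n_features=10, n_informative=5,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)

# Scale features so Euclidean distances in K-means are comparable.
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

# Assumption: K-means output is used by appending the cluster id as a feature.
# The clustering is fitted on training data only to avoid test-set leakage.
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0).fit(X_train_s)
X_train_aug = np.column_stack([X_train_s, kmeans.predict(X_train_s)])
X_test_aug = np.column_stack([X_test_s, kmeans.predict(X_test_s)])

models = {
    "RF": RandomForestClassifier(n_estimators=100, random_state=0),
    "NB": GaussianNB(),
    "SVM": SVC(kernel="rbf"),
}

# Macro-averaged precision, recall, and F1 for each K-means + classifier pair.
scores = {}
for name, model in models.items():
    model.fit(X_train_aug, y_train)
    p, r, f1, _ = precision_recall_fscore_support(
        y_test, model.predict(X_test_aug), average="macro", zero_division=0)
    scores[name] = (p, r, f1)
    print(f"K-means + {name}: precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

Scores from this synthetic data say nothing about the results reported in the article; the sketch only shows the shape of such a comparison.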