Effectiveness Evaluation of Random Forest, Naive Bayes, and Support Vector Machine Models for KDDCUP99 Anomaly Detection Based on K-means Clustering

Security in the World Wide Web has recently seen an enormous upgrade in almost every aspect. Identifying malicious activities in a network such as network attacks and malicious users plays a significant role in these upgraded security directions. This research utilizes the KDDCUP99 dataset to incorp...

Bibliographic Details
Main Author: Zhang Majun
Format: Article
Language: English
Published: EDP Sciences 2025-01-01
Series: ITM Web of Conferences
Online Access: https://www.itm-conferences.org/articles/itmconf/pdf/2025/01/itmconf_dai2024_04010.pdf
author Zhang Majun
collection DOAJ
description Security in the World Wide Web has recently seen an enormous upgrade in almost every aspect. Identifying malicious activities in a network, such as network attacks and malicious users, plays a significant role in these upgraded security directions. This research utilizes the KDDCUP99 dataset to incorporate K-means clustering with three classifiers: Random Forest (RF), Naïve Bayes (NB), and Support Vector Machine (SVM), with the goal to boost the accuracy of predicting network intrusions. In this paper, the K-means clustering technique is applied as a preprocessing step to enhance the overall quality of network intrusion detection and maximize the accuracy of the network security measures. The goal is to identify anomalies with high accuracy. Experimental results indicate that the optimal combination is K-means + RF, which outperformed the others in precision, recall, and F1-score. Although K-means + NB demonstrated superior recall for certain smaller anomalies, it underperformed compared to the RF model. The paper concludes by highlighting the value of ensemble approaches, in particular Random Forest, for tackling anomaly detection and network security issues, particularly in light of the expanding significance of social networks and the internet.
format Article
id doaj-art-a8d827d16b7041d3b26a12ba5ee0d62e
institution Kabale University
issn 2271-2097
language English
publishDate 2025-01-01
publisher EDP Sciences
record_format Article
series ITM Web of Conferences
spelling doaj-art-a8d827d16b7041d3b26a12ba5ee0d62e; 2025-02-07T08:21:11Z; eng; EDP Sciences; ITM Web of Conferences; 2271-2097; 2025-01-01; 70; 04010; 10.1051/itmconf/20257004010; itmconf_dai2024_04010; Zhang Majun, College of Electronic and Information Engineering, Tongji University
title Effectiveness Evaluation of Random Forest, Naive Bayes, and Support Vector Machine Models for KDDCUP99 Anomaly Detection Based on K-means Clustering
url https://www.itm-conferences.org/articles/itmconf/pdf/2025/01/itmconf_dai2024_04010.pdf