Effectiveness Evaluation of Random Forest, Naive Bayes, and Support Vector Machine Models for KDDCUP99 Anomaly Detection Based on K-means Clustering
Security in the World Wide Web has recently seen an enormous upgrade in almost every aspect. Identifying malicious activities in a network, such as network attacks and malicious users, plays a significant role in these upgraded security directions. This research utilizes the KDDCUP99 dataset to incorp...
Main Author: | Zhang Majun |
---|---|
Format: | Article |
Language: | English |
Published: | EDP Sciences, 2025-01-01 |
Series: | ITM Web of Conferences |
Online Access: | https://www.itm-conferences.org/articles/itmconf/pdf/2025/01/itmconf_dai2024_04010.pdf |
_version_ | 1825206583750033408 |
---|---|
author | Zhang Majun |
author_facet | Zhang Majun |
author_sort | Zhang Majun |
collection | DOAJ |
description | Security in the World Wide Web has recently seen an enormous upgrade in almost every aspect. Identifying malicious activities in a network, such as network attacks and malicious users, plays a significant role in these upgraded security directions. This research utilizes the KDDCUP99 dataset to incorporate K-means clustering with three classifiers: Random Forest (RF), Naïve Bayes (NB), and Support Vector Machine (SVM), with the goal of boosting the accuracy of predicting network intrusions. In this paper, the K-means clustering technique is applied as a preprocessing step to enhance the overall quality of network intrusion detection and maximize the accuracy of the network security measures. The goal is to identify anomalies with high accuracy. Experimental results indicate that the optimal combination is K-means + RF, which outperformed the others in precision, recall, and F1-score. Although K-means + NB demonstrated superior recall for certain smaller anomalies, it underperformed compared to the RF model. The paper concludes by highlighting the value of ensemble approaches, in particular Random Forest, for tackling anomaly detection and network security issues, particularly in light of the expanding significance of social networks and the internet. |
format | Article |
id | doaj-art-a8d827d16b7041d3b26a12ba5ee0d62e |
institution | Kabale University |
issn | 2271-2097 |
language | English |
publishDate | 2025-01-01 |
publisher | EDP Sciences |
record_format | Article |
series | ITM Web of Conferences |
spelling | doaj-art-a8d827d16b7041d3b26a12ba5ee0d62e; 2025-02-07T08:21:11Z; eng; EDP Sciences; ITM Web of Conferences; 2271-2097; 2025-01-01; 70; 04010; 10.1051/itmconf/20257004010; itmconf_dai2024_04010; Effectiveness Evaluation of Random Forest, Naive Bayes, and Support Vector Machine Models for KDDCUP99 Anomaly Detection Based on K-means Clustering; Zhang Majun (College of Electronic and Information Engineering, Tongji University); Security in the World Wide Web has recently seen an enormous upgrade in almost every aspect. Identifying malicious activities in a network, such as network attacks and malicious users, plays a significant role in these upgraded security directions. This research utilizes the KDDCUP99 dataset to incorporate K-means clustering with three classifiers: Random Forest (RF), Naïve Bayes (NB), and Support Vector Machine (SVM), with the goal of boosting the accuracy of predicting network intrusions. In this paper, the K-means clustering technique is applied as a preprocessing step to enhance the overall quality of network intrusion detection and maximize the accuracy of the network security measures. The goal is to identify anomalies with high accuracy. Experimental results indicate that the optimal combination is K-means + RF, which outperformed the others in precision, recall, and F1-score. Although K-means + NB demonstrated superior recall for certain smaller anomalies, it underperformed compared to the RF model. The paper concludes by highlighting the value of ensemble approaches, in particular Random Forest, for tackling anomaly detection and network security issues, particularly in light of the expanding significance of social networks and the internet.; https://www.itm-conferences.org/articles/itmconf/pdf/2025/01/itmconf_dai2024_04010.pdf |
spellingShingle | Zhang Majun Effectiveness Evaluation of Random Forest, Naive Bayes, and Support Vector Machine Models for KDDCUP99 Anomaly Detection Based on K-means Clustering ITM Web of Conferences |
title | Effectiveness Evaluation of Random Forest, Naive Bayes, and Support Vector Machine Models for KDDCUP99 Anomaly Detection Based on K-means Clustering |
title_full | Effectiveness Evaluation of Random Forest, Naive Bayes, and Support Vector Machine Models for KDDCUP99 Anomaly Detection Based on K-means Clustering |
title_fullStr | Effectiveness Evaluation of Random Forest, Naive Bayes, and Support Vector Machine Models for KDDCUP99 Anomaly Detection Based on K-means Clustering |
title_full_unstemmed | Effectiveness Evaluation of Random Forest, Naive Bayes, and Support Vector Machine Models for KDDCUP99 Anomaly Detection Based on K-means Clustering |
title_short | Effectiveness Evaluation of Random Forest, Naive Bayes, and Support Vector Machine Models for KDDCUP99 Anomaly Detection Based on K-means Clustering |
title_sort | effectiveness evaluation of random forest naive bayes and support vector machine models for kddcup99 anomaly detection based on k means clustering |
url | https://www.itm-conferences.org/articles/itmconf/pdf/2025/01/itmconf_dai2024_04010.pdf |
work_keys_str_mv | AT zhangmajun effectivenessevaluationofrandomforestnaivebayesandsupportvectormachinemodelsforkddcup99anomalydetectionbasedonkmeansclustering |
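
The description in this record compares the classifiers by precision, recall, and F1-score. As a minimal sketch of how those per-class metrics are usually computed from true and predicted labels, the following may help. The function name `per_class_metrics` and the label lists are hypothetical illustrations, not the paper's code or data.

```python
from collections import Counter


def per_class_metrics(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for every label that appears."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class gains a false positive
            fn[truth] += 1   # true class gains a false negative

    metrics = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]   # times the label was predicted
        actual = tp[label] + fn[label]      # times the label was the truth
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        total = precision + recall
        f1 = 2 * precision * recall / total if total else 0.0
        metrics[label] = (precision, recall, f1)
    return metrics


# Hypothetical labels for illustration only (not taken from KDDCUP99 results).
y_true = ["normal", "normal", "dos", "dos", "probe", "normal"]
y_pred = ["normal", "dos", "dos", "dos", "normal", "normal"]

metrics = per_class_metrics(y_true, y_pred)
for label, (precision, recall, f1) in metrics.items():
    print(f"{label}: precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

A class that is never predicted correctly, such as the rare "probe" class in this toy example, scores zero recall, which is why the description reports recall on smaller anomaly classes separately from overall performance.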