Methodology for detecting anomalies in cyber attack assessment data using Random Forest and Gradient Boosting in machine learning

Objective. The study aims to detect anomalies in network activity data and identify cyberattacks using machine learning models, specifically Random Forest and gradient boosting. The topic is relevant because cyberattacks are becoming increasingly complex and sophisticated. Develop...

Bibliographic Details
Main Authors: A. S. Kechedzhiev, O. L. Tsvetkova, A. I. Dubrovina
Format: Article
Language: Russian
Published: Dagestan State Technical University 2024-10-01
Series: Вестник Дагестанского государственного технического университета: Технические науки
Subjects: data anomaly; machine learning; random forest algorithm; gradient boosting model
Online Access: https://vestnik.dgtu.ru/jour/article/view/1557
collection DOAJ
description Objective. The study aims to detect anomalies in network activity data and identify cyberattacks using machine learning models, specifically Random Forest and gradient boosting. The topic is relevant because cyberattacks are becoming increasingly complex and sophisticated. Developing effective methods for detecting anomalies and protecting against cyber threats is becoming a priority for organizations. Method. Two machine learning algorithms are used: Random Forest and gradient boosting. The process includes analyzing key metrics, visualizing model decisions, evaluating the performance of each model, and analyzing confusion matrices for the attack categories. Result. The Random Forest model achieved an accuracy of about 94% when using the 10 most important features. The accompanying graph gives insight into how the model makes decisions based on these features. The XGBoost gradient boosting model achieved high accuracy and reliable results. The report describes the model's performance for each category. Conclusion. The work presents a comprehensive analysis of machine learning models designed to detect cyberattacks. It covers several key steps and methods for evaluating model effectiveness, identifying important features, and analyzing performance across different attacks.
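The per-category evaluation mentioned in the abstract (confusion matrices and a performance report for each attack category) can be illustrated with a minimal sketch. This is not the authors' code: the category names `normal`, `dos`, and `probe`, the toy label lists, and the helper function names are all hypothetical, chosen only to show how a confusion matrix yields accuracy, precision, and recall per category. In practice a library such as scikit-learn (`sklearn.metrics.confusion_matrix`, `classification_report`) would typically be used instead.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, labels):
    """Return matrix[i][j]: samples whose true label is labels[i], predicted labels[j]."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


def per_class_report(matrix, labels):
    """Compute precision and recall for each category from a confusion matrix."""
    report = {}
    for i, label in enumerate(labels):
        true_pos = matrix[i][i]
        predicted = sum(row[i] for row in matrix)  # column sum: predicted as this label
        actual = sum(matrix[i])                    # row sum: truly this label
        report[label] = {
            "precision": true_pos / predicted if predicted else 0.0,
            "recall": true_pos / actual if actual else 0.0,
        }
    return report


def accuracy(matrix):
    """Share of samples on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


# Hypothetical toy data: category names and labels are illustrative assumptions.
LABELS = ["normal", "dos", "probe"]
y_true = ["normal", "normal", "dos", "dos", "dos", "probe", "probe", "normal"]
y_pred = ["normal", "dos", "dos", "dos", "normal", "probe", "dos", "normal"]

matrix = confusion_matrix(y_true, y_pred, LABELS)
report = per_class_report(matrix, LABELS)

for label, row in zip(LABELS, matrix):
    print(label, row, report[label])
print("accuracy:", accuracy(matrix))
```

Reading the matrix row by row shows which attack categories are confused with each other, which a single overall accuracy figure does not reveal.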
id doaj-art-d89dcfddd7194789b9a2270bb2d2d34f
institution Kabale University
issn 2073-6185
2542-095X
spelling doaj-art-d89dcfddd7194789b9a2270bb2d2d34f; 2025-08-20T03:35:14Z; rus; vol. 51, no. 3, pp. 72-85; DOI 10.21822/2073-6185-2024-51-3-72-85; author affiliations: A. S. Kechedzhiev, O. L. Tsvetkova, A. I. Dubrovina (all Don State Technical University)
topic data anomaly
machine learning
random forest algorithm
gradient boosting model