Balancing Efficiency and Efficacy: A Contextual Bandit-Driven Framework for Multi-Tier Cyber Threat Detection

Bibliographic Details
Main Authors: Ibrahim Mutambik, Abdullah Almuqrin
Format: Article
Language: English
Published: MDPI AG 2025-06-01
Series: Applied Sciences
Online Access: https://www.mdpi.com/2076-3417/15/11/6362
Description
Summary: In response to the rising volume and sophistication of cyber intrusions, data-oriented methods have emerged as critical defensive measures. Although machine learning, including neural network-based solutions, has demonstrated strong capabilities in identifying malicious activity, several fundamental challenges remain. Chief among them are the substantial resource demands of data preprocessing and inference, limited scalability beyond centralized environments, and the need to deploy multiple specialized detection models to cover the different stages of the cyber kill chain. This paper introduces a contextual bandit-based reinforcement learning approach designed to reduce operational costs and improve detection cost-efficiency by introducing an adaptive decision boundary within a layered detection scheme. The proposed framework continually measures the confidence of each participating detection model and applies a reward-driven mechanism to balance cost and accuracy. Specifically, each candidate action, representing a particular decision boundary, earns a reward reflecting its overall cost-to-effectiveness ratio, thereby prioritizing reduced overhead. We validated our method on two representative datasets that capture prevalent modern threats: phishing and malware. Our findings show that this contextual bandit-based strategy regulates the frequency of resource-intensive detection tasks, significantly lowering both inference and processing costs, with minimal compromise to overall detection accuracy and efficacy.
ISSN: 2076-3417
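
Illustrative sketch (not from the article): the summary describes a bandit that picks a decision boundary for a layered detector and is rewarded for balancing cost against accuracy. The record does not specify the bandit algorithm, the reward formula, or the candidate boundaries, so the Python sketch below makes its own assumptions: a tabular epsilon-greedy bandit stands in for whatever the authors used, the context is the discretized confidence of a cheap first-tier model, each action is a hypothetical confidence threshold below which a sample is escalated to an expensive second tier, and the reward is correctness minus a hypothetical escalation cost. The data is synthetic, and the second tier is assumed to always be correct.

```python
import random

# All constants below are hypothetical, chosen only for illustration.
THRESHOLDS = [0.5, 0.7, 0.9]   # candidate decision boundaries (the bandit's actions)
ESCALATION_COST = 0.3          # assumed cost of invoking the expensive tier
N_BINS = 5                     # number of discrete contexts (confidence bins)


def context_bin(confidence):
    """Map a tier-1 confidence in [0.5, 1.0] to a discrete context index."""
    idx = int((confidence - 0.5) / 0.5 * N_BINS)
    return min(max(idx, 0), N_BINS - 1)


class EpsilonGreedyBandit:
    """Tabular epsilon-greedy contextual bandit with running-mean value estimates."""

    def __init__(self, n_contexts, n_actions, epsilon=0.1, seed=0):
        self.rng = random.Random(seed)
        self.epsilon = epsilon
        self.counts = [[0] * n_actions for _ in range(n_contexts)]
        self.values = [[0.0] * n_actions for _ in range(n_contexts)]

    def select(self, ctx):
        # Explore with probability epsilon, otherwise pick the best-known action.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.values[ctx]))
        row = self.values[ctx]
        return max(range(len(row)), key=row.__getitem__)

    def update(self, ctx, action, reward):
        # Incremental mean of the rewards observed for (context, action).
        self.counts[ctx][action] += 1
        n = self.counts[ctx][action]
        self.values[ctx][action] += (reward - self.values[ctx][action]) / n


def reward(confidence, threshold, tier1_correct):
    """Correctness (1 or 0) minus the escalation cost, if the sample is escalated."""
    escalate = confidence < threshold
    # Simplifying assumption: the expensive tier always classifies correctly.
    correct = True if escalate else tier1_correct
    return (1.0 if correct else 0.0) - (ESCALATION_COST if escalate else 0.0)


def simulate(steps=5000, seed=0):
    """Train the bandit on synthetic samples from a calibrated tier-1 model."""
    rng = random.Random(seed)
    bandit = EpsilonGreedyBandit(N_BINS, len(THRESHOLDS), seed=seed + 1)
    for _ in range(steps):
        conf = rng.uniform(0.5, 1.0)
        tier1_correct = rng.random() < conf  # synthetic ground truth
        ctx = context_bin(conf)
        action = bandit.select(ctx)
        bandit.update(ctx, action, reward(conf, THRESHOLDS[action], tier1_correct))
    return bandit


if __name__ == "__main__":
    trained = simulate()
    for ctx, row in enumerate(trained.values):
        best = max(range(len(row)), key=row.__getitem__)
        print(f"confidence bin {ctx}: preferred threshold {THRESHOLDS[best]}")
```

Under these assumptions, escalating pays off only when the first tier's expected accuracy falls below 1 minus the escalation cost, so the bandit should learn to reserve the expensive tier for low-confidence samples. A real deployment would replace the synthetic data with actual model confidences and measured inference costs.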