Reinforcement learning for an efficient and effective malware investigation during cyber incident response

The ever-escalating prevalence of malware is a serious cybersecurity threat, often requiring advanced post-incident forensic investigation techniques. This paper proposes a framework to enhance malware forensics by leveraging reinforcement learning (RL). The approach combines heuristic and signature...

Bibliographic Details
Main Authors: Dipo Dunsin, Mohamed Chahine Ghanem, Karim Ouazzane, Vassil Vassilev
Format: Article
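Language: English
Published: Elsevier 2025-09-01
Series: High-Confidence Computing
Subjects: Cyber incident; Digital forensics; Artificial intelligence; Reinforcement learning; Markov Chain; MDP
Online Access: http://www.sciencedirect.com/science/article/pii/S2667295225000030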
author Dipo Dunsin
Mohamed Chahine Ghanem
Karim Ouazzane
Vassil Vassilev
collection DOAJ
description The ever-escalating prevalence of malware is a serious cybersecurity threat, often requiring advanced post-incident forensic investigation techniques. This paper proposes a framework to enhance malware forensics by leveraging reinforcement learning (RL). The approach combines heuristic and signature-based methods, supported by RL through a unified MDP model, which breaks down malware analysis into distinct states and actions. This optimisation enhances the identification and classification of malware variants. The framework employs Q-learning and other techniques to boost the speed and accuracy of detecting new and unknown malware, outperforming traditional methods. We tested the experimental framework across multiple virtual environments infected with various malware types. The RL agent collected forensic evidence and improved its performance through Q-tables and temporal difference learning. The epsilon-greedy exploration strategy, in conjunction with Q-learning updates, effectively facilitated transitions. The learning rate depended on the complexity of the MDP environment: higher in simpler ones for quicker convergence and lower in more complex ones for stability. This RL-enhanced model significantly reduced the time required for post-incident malware investigations, achieving a high accuracy rate of 94% in identifying malware. These results indicate RL’s potential to revolutionise post-incident forensics investigations in cybersecurity. Future work will incorporate more advanced RL algorithms and large language models (LLMs) to further enhance the effectiveness of malware forensic analysis.
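Note: the description above names tabular Q-learning with temporal-difference updates and epsilon-greedy exploration over an MDP. The following is a minimal sketch of that general technique only, not the authors' implementation. The four-stage toy environment, the reward values, and the names `step`, `train`, `N_STATES`, and `ACTIONS` are all hypothetical, and the hyperparameters (`alpha`, `gamma`, `epsilon`) are illustrative assumptions rather than values from the paper.

```python
import random

# Hypothetical toy MDP (an assumption, not the paper's model):
# states 0..3 stand for successive investigation stages; state 3 is terminal.
N_STATES = 4
ACTIONS = (0, 1)  # 0 = repeat the current stage, 1 = advance to the next stage


def step(state, action):
    """Deterministic toy transition with an illustrative reward scheme."""
    next_state = state + 1 if action == 1 else state
    done = next_state == N_STATES - 1
    reward = 10.0 if done else -1.0  # small cost per step, bonus on completion
    return next_state, reward, done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0, max_steps=50):
    """Tabular Q-learning with epsilon-greedy action selection."""
    rng = random.Random(seed)
    q = [[0.0 for _ in ACTIONS] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            # Epsilon-greedy: explore with probability epsilon, else exploit.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[state][a])
            next_state, reward, done = step(state, action)
            # Temporal-difference target: no bootstrap from a terminal state.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


q_table = train()
# Greedy action for each non-terminal state after training.
greedy_policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
```

In this sketch `alpha` plays the role of the learning rate that the description says was set higher for simpler MDP environments and lower for more complex ones; a real forensic environment would have a far larger state and action space than this toy chain.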
format Article
id doaj-art-c4cd04b6040b4ef2bf531fb8c7cb6f3a
institution Kabale University
issn 2667-2952
language English
publishDate 2025-09-01
publisher Elsevier
record_format Article
series High-Confidence Computing
spelling doaj-art-c4cd04b6040b4ef2bf531fb8c7cb6f3a
2025-08-20T03:41:34Z
eng
Elsevier
High-Confidence Computing
2667-2952
2025-09-01
Volume 5, issue 3, article 100299
10.1016/j.hcc.2025.100299
Dipo Dunsin (corresponding author): Cyber Security Research Centre, London Metropolitan University, London N7 8DB, UK
Mohamed Chahine Ghanem: Cyber Security Research Centre, London Metropolitan University, London N7 8DB, UK; Department of Computer Science, University of Liverpool, Liverpool L69 7ZX, UK
Karim Ouazzane: Cyber Security Research Centre, London Metropolitan University, London N7 8DB, UK
Vassil Vassilev: Cyber Security Research Centre, London Metropolitan University, London N7 8DB, UK
topic Cyber incident
Digital forensics
Artificial intelligence
Reinforcement learning
Markov Chain
MDP