A Two-Phase Feature Selection Framework for Intrusion Detection System: Balancing Relevance and Computational Efficiency (2P-FSID)

The rapid growth of data demands robust security mechanisms to prevent unauthorized access, making ML-based intrusion detection systems essential. However, high-dimensional data calls for effective feature selection. This study proposes the Two-Phase Feature Selection framework for I...

Bibliographic Details
Main Authors: C. Rajathi, Rukmani Panjanathan
Format: Article
Language: English
Published: Taylor & Francis Group 2025-12-01
Series: Applied Artificial Intelligence
Online Access: https://www.tandfonline.com/doi/10.1080/08839514.2025.2539396
author C. Rajathi
Rukmani Panjanathan
author_facet C. Rajathi
Rukmani Panjanathan
author_sort C. Rajathi
collection DOAJ
description The rapid growth of data demands robust security mechanisms to prevent unauthorized access, making machine learning (ML)-based intrusion detection systems essential. However, high-dimensional data calls for effective feature selection. This study proposes the Two-Phase Feature Selection framework for Intrusion Detection (2P-FSID) to enhance model performance and interpretability. In Phase 1, a filter-based approach selects a relevant subset of features, yielding an initial subset S1. These features are further assessed using Mutual Information (MI), Correlation (Corr), and Feature Importance (FI) as part of the Feature Relevance Estimation (FRE) process. A hybrid pruning strategy, combining dynamic and static pruning, then refines the subset into S3. In Phase 2, SHapley Additive exPlanations (SHAP) values are computed to quantify each feature's influence on classification performance, and features are categorized as either positively or negatively influential. The model is first trained on the positively influential features; the negatively influential features are then iteratively added and evaluated for potential performance improvement, resulting in the final optimized subset S4. Experimental results on the NSL-KDD and UNSW-NB15 datasets show the feature space reduced from 41 to 19 features and from 44 to 17 features, respectively, with detection accuracies of 95.18% and 92.79%.
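The Phase 2 refinement described in the abstract can be pictured with a minimal sketch. This is not the authors' implementation: the function names, the toy scores, the stand-in scoring function, the rule that a negatively influential feature is kept only when it raises the score, and the order in which candidates are tried are all assumptions made for illustration. Signed per-feature SHAP scores are taken as precomputed input.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def split_by_influence(shap_scores: Dict[str, float]) -> Tuple[List[str], List[str]]:
    """Split features into positively and negatively influential groups.

    shap_scores maps feature name -> signed aggregate SHAP contribution.
    How that aggregate is computed is left open here (assumption).
    """
    positive = [f for f, s in shap_scores.items() if s > 0]
    negative = [f for f, s in shap_scores.items() if s <= 0]
    # Assumption: try the least negative features first.
    negative.sort(key=lambda f: shap_scores[f], reverse=True)
    return positive, negative


def refine_subset(
    shap_scores: Dict[str, float],
    evaluate: Callable[[Sequence[str]], float],
) -> Tuple[List[str], float]:
    """Start from the positive features, then add negative ones one at a time.

    A candidate is kept only if `evaluate` returns a strictly higher score
    (assumed acceptance rule; the abstract only says candidates are
    "evaluated for potential performance improvement").
    """
    positive, negative = split_by_influence(shap_scores)
    selected = list(positive)
    best = evaluate(selected)
    for feature in negative:
        candidate = selected + [feature]
        score = evaluate(candidate)
        if score > best:
            selected, best = candidate, score
    return selected, best


# Toy inputs: invented numbers, not results from the paper.
toy_scores = {"src_bytes": 0.30, "duration": 0.12, "flag": -0.02, "land": -0.10}
toy_gain = {"src_bytes": 0.50, "duration": 0.30, "flag": 0.05, "land": -0.04}


def toy_evaluate(features: Sequence[str]) -> float:
    """Stand-in for training and validating a classifier on `features`."""
    return sum(toy_gain[f] for f in features)


selected, best = refine_subset(toy_scores, toy_evaluate)
print(selected, round(best, 2))
```

In practice `evaluate` would train the classifier on the candidate subset and return a validation metric such as accuracy; the toy function above only makes the control flow visible.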
format Article
id doaj-art-b9cefa656091419183ba1e7e0705bc8d
institution DOAJ
issn 0883-9514
1087-6545
language English
publishDate 2025-12-01
publisher Taylor & Francis Group
record_format Article
series Applied Artificial Intelligence
spelling doaj-art-b9cefa656091419183ba1e7e0705bc8d | 2025-08-20T02:45:19Z | eng | Taylor & Francis Group | Applied Artificial Intelligence | ISSN 0883-9514, 1087-6545 | 2025-12-01 | vol. 39, no. 1 | doi:10.1080/08839514.2025.2539396 | C. Rajathi and Rukmani Panjanathan, both Vellore Institute of Technology, Chennai, Tamil Nadu, India | https://www.tandfonline.com/doi/10.1080/08839514.2025.2539396
title A Two-Phase Feature Selection Framework for Intrusion Detection System: Balancing Relevance and Computational Efficiency (2P-FSID)
url https://www.tandfonline.com/doi/10.1080/08839514.2025.2539396