Design of Intelligent Feature Selection Technique for Phishing Detection

Phishing attacks lead to significant threats to individuals and organizations by gaining unauthorized access. The attackers redirect the users to fake websites and steal their credentials and other confidential data. Various techniques are employed to detect phishing using machine learning algorith...

Bibliographic Details
Main Authors: Sharvari Sagar Patil, Narendra M. Shekokar, Sridhar Chandramohan Iyer
Format: Article
Language: English
Published: IIUM Press, International Islamic University Malaysia 2025-01-01
Series: International Islamic University Malaysia Engineering Journal
Subjects: Reinforcement Learning; Feature Selection; Phishing Detection; Machine Learning
Online Access: https://journals.iium.edu.my/ejournal/index.php/iiumej/article/view/3337
collection DOAJ
description Phishing attacks pose significant threats to individuals and organizations by enabling unauthorized access. Attackers redirect users to fake websites and steal their credentials and other confidential data. Phishing is typically detected either with machine learning algorithms or with static techniques that blacklist web URLs. Attackers keep changing how they launch attacks, which makes it difficult for traditional detection techniques to protect users. The performance of conventional detection methods depends on exhaustive data and on the features selected for classification; the selected features contribute substantially to the performance of the detection system. Existing phishing detection techniques rely mainly on static features chosen by traditional feature selection or ranking techniques. This paper proposes a feature selection technique for phishing detection based on reinforcement learning (RL). A novel RL agent is designed that uses a dynamic, adaptive, and data-driven approach to improve classifier performance, and the agent selects the features dynamically. The technique is evaluated on a real-world phishing dataset and compared with existing techniques. In this evaluation, the proposed dynamic feature selection achieves its best accuracy, 99.07%, with the random forest classifier model. The work contributes to phishing detection methodology by developing a dynamic feature selection technique.
ABSTRAK (translated from Malay): Phishing attacks pose a major threat to individuals and organizations by gaining unauthorized access. Attackers redirect users to fake websites and steal login information and other confidential data. Various techniques are used to detect phishing, using machine learning algorithms or static detection techniques based on blacklisted website URLs. Attackers tend to change their approach to launching attacks, which makes it difficult for traditional phishing detection techniques to protect users. The performance of conventional detection methods depends on exhaustive data and on the features selected for classification. Phishing detection techniques rely mostly on static features selected by traditional feature selection or ranking methods. This study proposes an innovative approach to phishing detection by designing a feature selection technique that uses reinforcement learning. A new reinforcement learning agent is designed that uses a dynamic, adaptive, and data-driven approach to improve classifier performance in phishing detection. The technique is designed to select features dynamically using the RL agent. It is evaluated on a real phishing dataset and its performance is compared with existing techniques. Based on the evaluation, the dynamic feature selection methodology gives the best accuracy of 99.07% with the random forest classifier model. This work contributes to the advancement of phishing detection methodology by developing a dynamic feature selection technique.
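The abstract does not specify the agent's states, actions, or reward, so the following is only a minimal sketch of how reinforcement learning can drive feature selection; it is not the authors' method. It assumes tabular Q-learning in which the state is the current feature mask, an action toggles one feature on or off, and the reward is the resulting change in validation accuracy. The synthetic data, the nearest-centroid classifier (a stand-in for the random forest named in the abstract, chosen so the example needs only the Python standard library), and every function name are hypothetical.

```python
import random


def make_data(n, rng):
    """Synthetic stand-in for a phishing dataset (assumption, not real data):
    features 0-1 are informative, features 2-5 are pure noise."""
    rows = []
    for _ in range(n):
        label = rng.randint(0, 1)
        informative = [rng.gauss(2.0 * label, 1.0) for _ in range(2)]
        noise = [rng.gauss(0.0, 1.0) for _ in range(4)]
        rows.append((informative + noise, label))
    return rows


def accuracy(train, test, mask):
    """Nearest-centroid accuracy using only the features switched on in mask."""
    idx = [i for i, on in enumerate(mask) if on]
    if not idx:
        return 0.0  # an empty feature set is treated as useless
    centroids = {}
    for label in (0, 1):
        pts = [x for x, y in train if y == label]
        centroids[label] = [sum(p[i] for p in pts) / len(pts) for i in idx]
    correct = 0
    for x, y in test:
        dist = {c: sum((x[i] - m) ** 2 for i, m in zip(idx, centroids[c]))
                for c in (0, 1)}
        correct += min(dist, key=dist.get) == y
    return correct / len(test)


def select_features(train, val, n_features, episodes=200, steps=6,
                    alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Hypothetical Q-learning agent: state = feature mask, action = toggle one feature."""
    rng = random.Random(seed)
    q = {}       # (mask, action) -> estimated action value
    cache = {}   # mask -> validation accuracy, to avoid re-scoring

    def score(mask):
        if mask not in cache:
            cache[mask] = accuracy(train, val, mask)
        return cache[mask]

    best_mask, best_acc = None, -1.0
    for _ in range(episodes):
        mask = tuple([False] * n_features)  # each episode starts with no features
        for _ in range(steps):
            if rng.random() < epsilon:      # explore
                action = rng.randrange(n_features)
            else:                           # exploit current estimates
                action = max(range(n_features),
                             key=lambda a: q.get((mask, a), 0.0))
            nxt = tuple(not on if i == action else on
                        for i, on in enumerate(mask))
            reward = score(nxt) - score(mask)  # accuracy gained by the toggle
            future = max(q.get((nxt, a), 0.0) for a in range(n_features))
            old = q.get((mask, action), 0.0)
            q[(mask, action)] = old + alpha * (reward + gamma * future - old)
            mask = nxt
            if score(mask) > best_acc:
                best_mask, best_acc = mask, score(mask)
    return best_mask, best_acc


rng = random.Random(42)
train, val = make_data(300, rng), make_data(200, rng)
best_mask, best_acc = select_features(train, val, n_features=6)
print(best_mask, round(best_acc, 3))
```

In practice the scoring function would train and validate the actual classifier on the real dataset, and the reward might also penalize the number of selected features; both choices are left open here because the record gives no details.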
id doaj-art-55f6c71d772142e399b7278eee86d275
institution Kabale University
issn 1511-788X
2289-7860
spelling doaj-art-55f6c71d772142e399b7278eee86d275 (record updated 2025-01-10T12:40:40Z)
Volume 26, Issue 1 (2025-01-01)
DOI: 10.31436/iiumej.v26i1.3337
Sharvari Sagar Patil (https://orcid.org/0000-0002-0721-8788), Dwarkadas J. Sanghvi College of Engineering
Narendra M. Shekokar, Dwarkadas J. Sanghvi College of Engineering
Sridhar Chandramohan Iyer (https://orcid.org/0000-0003-3964-2476), Dwarkadas J. Sanghvi College of Engineering
Keywords: Reinforcement Learning; Feature Selection; Phishing Detection; Machine Learning
topic Reinforcement Learning
Feature Selection
Phishing Detection
Machine Learning