PhishingGNN: Phishing Email Detection Using Graph Attention Networks and Transformer-Based Feature Extraction

Phishing emails remain a critical cybersecurity challenge, demanding detection frameworks that capture both textual semantics and structural relationships in email data. This study introduces PhishingGNN, a hybrid model that integrates DistilBERT for context-aware text analysis with Graph Attention...

Bibliographic Details
Main Authors: Mejdl Safran, Abdulbaset Musleh
Format: Article
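Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Subjects: Phishing emails; graph attention networks (GAT); DistilBERT; graph neural network (GNN); phishing detection
Online Access: https://ieeexplore.ieee.org/document/11091285/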
author Mejdl Safran
Abdulbaset Musleh
collection DOAJ
description Phishing emails remain a critical cybersecurity challenge, demanding detection frameworks that capture both textual semantics and structural relationships in email data. This study introduces PhishingGNN, a hybrid model that integrates DistilBERT for context-aware text analysis with Graph Attention Networks (GAT) to model email metadata and content as graph structures, detecting subtle phishing patterns overlooked by traditional methods. By transforming email bodies into relational graphs, PhishingGNN leverages Graph Neural Networks (GNNs) to analyze textual interactions while retaining computational efficiency. Evaluated on an expanded CEAS_08 dataset (39,154 samples: 17,312 non-phishing and 21,842 phishing emails), PhishingGNN achieves state-of-the-art performance: 0.9939 accuracy, balanced precision, recall, and F1-scores of 0.99, and an AUC of 1.00. Cross-dataset validation on the Nazario Corpus confirms robustness (0.9910 accuracy), outperforming contemporary few-shot learning approaches. PhishingGNN’s key innovations include a transformer-GNN architecture unifying semantic and structural reasoning, a novel graph-based email representation methodology, and comprehensive validation confirming real-world scalability. PhishingGNN advances graph-based deep learning in cybersecurity, offering a modular benchmark solution with demonstrated cross-dataset efficacy.
format Article
id doaj-art-3b6144780fa6459a8b9efacbfd593d7d
institution DOAJ
issn 2169-3536
language English
publishDate 2025-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj-art-3b6144780fa6459a8b9efacbfd593d7d
2025-08-20T03:09:38Z
eng
IEEE
IEEE Access, ISSN 2169-3536
2025-01-01, vol. 13, pp. 131390-131399
DOI: 10.1109/ACCESS.2025.3592135
IEEE document number: 11091285
PhishingGNN: Phishing Email Detection Using Graph Attention Networks and Transformer-Based Feature Extraction
Mejdl Safran (https://orcid.org/0000-0002-7445-7121), Department of Computer Science, College of Computer and Information Sciences, Research Chair of Online Dialogue and Cultural Communication, King Saud University, Riyadh, Saudi Arabia
Abdulbaset Musleh (https://orcid.org/0000-0003-1088-1254), Department of Information Technology, Al-Qalam University for Humanities and Applied Sciences, Ibb, Yemen
title PhishingGNN: Phishing Email Detection Using Graph Attention Networks and Transformer-Based Feature Extraction
topic Phishing emails
graph attention networks (GAT)
DistilBERT
graph neural network (GNN)
phishing detection
url https://ieeexplore.ieee.org/document/11091285/