Phishing detection algorithm based on attention and feature fusion

Bibliographic Details
Main Authors: ZHANG Sirui, YAN Zhiwei, DONG Kejun, YUCHI Xuebiao
Format: Article
Language: English
Published: POSTS&TELECOM PRESS Co., LTD 2024-08-01
Series: 网络与信息安全学报
Subjects: phishing detection; hierarchical feature fusion; attention mechanism
Online Access: http://www.cjnis.com.cn/thesisDetails#10.11959/j.issn.2096-109x.2024058
collection DOAJ
description Phishing has been the primary means by which attackers conduct cyber fraud. As national anti-fraud efforts have intensified, the technical contest between phishing and its detection has escalated, putting significant pressure on detection work. For instance, current phishing attacks often replace text with images and apply small shifts or rotations to heavily weighted website logo images in order to evade traditional detection algorithms that rely on text or image features. To counter these escalating adversarial techniques, a phishing detection algorithm based on the attention mechanism and feature fusion was proposed, and a hierarchical classification model was established. The model performed two stages of fusion over domain names, web page structure, web page text, and web page icons, which allowed it to counter a range of evasion strategies used by attackers. In the first stage, a lightweight machine learning model fused domain name, text, and web page structure features to pre-recall a subset of suspicious domain names from a large pool of domain names. In the second stage, operating on that candidate subset, the attention mechanism was introduced to strengthen the extraction of global text association features between each sample and the counterfeited object. The contrast features between the icons of the sample and those of the counterfeited object were also strengthened, and a deep classification model fusing text and image features was established. The effectiveness of the algorithm was verified. Because image data need only be extracted for the candidate subset rather than for every domain name to be detected, this hierarchical method significantly improves detection efficiency while maintaining detection accuracy.
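The two-stage idea in the description can be illustrated with a minimal sketch. This is not the authors' model: simple keyword matching stands in for the lightweight first-stage classifier, and cosine similarity between precomputed icon vectors stands in for the attention-based deep model. The `Page` type, the function names, the thresholds, and the sample domains are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Page:
    domain: str
    text: str
    # Hypothetical stand-in for an icon: a precomputed feature vector.
    icon_vector: Tuple[float, ...]


def stage_one_score(page: Page, brand: str) -> float:
    """Cheap score from the domain name and page text only (no image access)."""
    domain_hit = 1.0 if brand in page.domain.lower() else 0.0
    text_hit = 1.0 if brand in page.text.lower() else 0.0
    return 0.5 * domain_hit + 0.5 * text_hit


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(y * y for y in b) ** 0.5
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def detect(
    pages: List[Page],
    brand: str,
    official_domain: str,
    brand_icon: Sequence[float],
    recall_threshold: float = 0.5,  # assumed value
    icon_threshold: float = 0.9,    # assumed value
) -> Tuple[List[str], int]:
    """Return (flagged domains, number of pages whose icons were compared)."""
    # Stage 1: pre-recall suspicious candidates using cheap features only.
    candidates = [
        p for p in pages
        if p.domain != official_domain
        and stage_one_score(p, brand) >= recall_threshold
    ]
    # Stage 2: the costly image comparison runs on the candidates only.
    flagged = [
        p.domain for p in candidates
        if cosine(p.icon_vector, brand_icon) >= icon_threshold
    ]
    return flagged, len(candidates)


pages = [
    Page("examplebank.com", "Welcome to ExampleBank", (1.0, 0.0)),
    Page("examplebank-login.test", "Sign in to your account", (0.98, 0.05)),
    Page("weather.test", "Today's forecast", (0.0, 1.0)),
    Page("examplebank-fans.test", "Fan forum about ExampleBank", (0.1, 0.9)),
]
flagged, icons_compared = detect(
    pages, "examplebank", "examplebank.com", (1.0, 0.0)
)
print(flagged, icons_compared)
```

In this toy run only two of the four pages reach the second stage, which mirrors the efficiency argument in the description: the expensive image features are never extracted for pages that the first stage rules out.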
id doaj-art-342b840c25034172be1c6cba4c844064
institution Kabale University
issn 2096-109X
topic phishing detection
hierarchical feature fusion
attention mechanism