XSShield: A novel dataset and lightweight hybrid deep learning model for XSS attack detection


Bibliographic Details
Main Authors: Gia-Huy Luu, Minh-Khang Duong, Trong-Phuc Pham-Ngo, Thanh-Sang Ngo, Dat-Thinh Nguyen, Xuan-Ha Nguyen, Kim-Hung Le
Format: Article
Language: English
Published: Elsevier 2024-12-01
Series: Results in Engineering
Subjects: XSS detection; Deep learning; CNN-LSTM model; Web security
Online Access: http://www.sciencedirect.com/science/article/pii/S2590123024016165
collection DOAJ
description With the proliferation of web applications, cross-site scripting (XSS) attacks have increased sharply and now pose a significant threat to users' information security and privacy. Machine learning (ML) and deep learning (DL) techniques offer promising ways to make XSS attack detection more efficient, but their effectiveness is limited by the lack of comprehensive and diverse datasets. Moreover, existing approaches often prioritize detection accuracy over real-time processing capability, which is essential for effective defense. To address these challenges, we propose a novel framework that automatically collects web resources, efficiently extracts informative features, and constructs an up-to-date XSS attack dataset, which is then used to train a machine learning-based XSS detection model. Using this framework, we created and published a well-structured dataset of over 100,000 samples for the research community. Furthermore, we present a hybrid detection model that leverages the strengths of both Convolutional Neural Networks (CNNs) and Long Short-Term Memory (LSTM) networks. Extensive evaluations on our dataset demonstrate that the proposed model outperforms baseline ML models across various metrics, including processing rate. Notably, our model achieves an accuracy of 99.27% while maintaining a low false positive rate of 0.06% and a processing rate exceeding 1000 samples per second. These results highlight its high accuracy and robustness in detecting XSS and its suitability for real-time applications. Our work presents a comprehensive solution for enhancing web application security by providing a diverse dataset and a high-accuracy detection model with low latency.
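The three figures quoted in the description (accuracy, false positive rate, and processing rate) can be illustrated with a minimal Python sketch. It is not the authors' evaluation code: the function names `detection_metrics` and `processing_rate`, the toy labels, and the substring-based placeholder classifier are all assumptions made for illustration, and the record does not say how the paper measured these values.

```python
import time
from typing import Callable, Dict, Sequence


def detection_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy and false positive rate for binary labels (1 = XSS, 0 = benign)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    # False positive rate: share of benign samples wrongly flagged as attacks.
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return {"accuracy": accuracy, "false_positive_rate": fpr}


def processing_rate(classify: Callable[[str], int], samples: Sequence[str]) -> float:
    """Return classified samples per second for a given classifier callable."""
    start = time.perf_counter()
    for sample in samples:
        classify(sample)
    elapsed = time.perf_counter() - start
    return len(samples) / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    # Toy labels, invented for illustration only.
    y_true = [1, 1, 0, 0, 0, 1, 0, 0]
    y_pred = [1, 0, 0, 0, 1, 1, 0, 0]
    print(detection_metrics(y_true, y_pred))

    # Hypothetical placeholder classifier; it stands in for a trained model.
    def toy_classifier(payload: str) -> int:
        return int("<script" in payload.lower())

    payloads = ["<script>alert(1)</script>", "hello world"] * 500
    print(f"{processing_rate(toy_classifier, payloads):.0f} samples/s")
```

A throughput number measured this way depends on hardware, batching, and preprocessing cost, so it is only comparable between models evaluated under the same conditions.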
format Article
id doaj-art-97f2f7e55bdb4960b2f73e84c4214f25
institution OA Journals
issn 2590-1230
language English
publishDate 2024-12-01
publisher Elsevier
record_format Article
series Results in Engineering
spelling doaj-art-97f2f7e55bdb4960b2f73e84c4214f25 (record timestamp 2025-08-20T02:34:43Z)
Language: eng
Publisher: Elsevier
Journal: Results in Engineering, ISSN 2590-1230
Published: 2024-12-01, volume 24, article 103363
DOI: 10.1016/j.rineng.2024.103363
Title: XSShield: A novel dataset and lightweight hybrid deep learning model for XSS attack detection
Authors: Gia-Huy Luu, Minh-Khang Duong, Trong-Phuc Pham-Ngo, Thanh-Sang Ngo, Dat-Thinh Nguyen, Xuan-Ha Nguyen, Kim-Hung Le (corresponding author)
Affiliation (all authors): University of Information Technology, Ho Chi Minh City, Viet Nam; Vietnam National University, Ho Chi Minh City, Viet Nam
URL: http://www.sciencedirect.com/science/article/pii/S2590123024016165
Keywords: XSS detection; Deep learning; CNN-LSTM model; Web security
title XSShield: A novel dataset and lightweight hybrid deep learning model for XSS attack detection
topic XSS detection
Deep learning
CNN-LSTM model
Web security
url http://www.sciencedirect.com/science/article/pii/S2590123024016165