Multimodal fusion based few-shot network intrusion detection system

Abstract As network environments become increasingly complex and new attack methods emerge more frequently, the diversity of network attacks continues to grow. Particularly with new or rare attacks, gathering a large number of labeled samples is extremely difficult, resulting in limited training data. Existing few-shot learning methods, while reducing reliance on large datasets, mostly handle single-modality data and fail to fully exploit complementary information across different modalities, limiting detection performance. To address this challenge, we introduce a multimodal fusion based few-shot network intrusion detection method that merges traffic feature graphs and network feature sets. Tailored to these modal characteristics, we develop two models: the G-Model and the S-Model. The G-Model employs convolutional neural networks to capture spatial connections in traffic feature graphs, while the S-Model uses the Transformer architecture to process and fuse network feature sets with long-range dependencies. Furthermore, we extensively study the fusion effects of these two modalities at various interaction depths to enhance detection performance. Experimental validation on the CICIDS2017 and CICIDS2018 datasets demonstrates that our method achieves multi-class accuracy rates of 93.40% and 98.50%, respectively, surpassing existing few-shot network intrusion detection methods.

Bibliographic Details
Main Authors: Congyuan Xu, Yong Zhan, Zhiqiang Wang, Jun Yang
Format: Article
Language: English
Published: Nature Portfolio 2025-07-01
Series: Scientific Reports
Subjects: Intrusion detection, Network security, Multimodal, Few-shot, Deep learning
Online Access: https://doi.org/10.1038/s41598-025-05217-4
_version_ 1849769154754641920
author Congyuan Xu
Yong Zhan
Zhiqiang Wang
Jun Yang
author_facet Congyuan Xu
Yong Zhan
Zhiqiang Wang
Jun Yang
author_sort Congyuan Xu
collection DOAJ
description Abstract As network environments become increasingly complex and new attack methods emerge more frequently, the diversity of network attacks continues to grow. Particularly with new or rare attacks, gathering a large number of labeled samples is extremely difficult, resulting in limited training data. Existing few-shot learning methods, while reducing reliance on large datasets, mostly handle single-modality data and fail to fully exploit complementary information across different modalities, limiting detection performance. To address this challenge, we introduce a multimodal fusion based few-shot network intrusion detection method that merges traffic feature graphs and network feature sets. Tailored to these modal characteristics, we develop two models: the G-Model and the S-Model. The G-Model employs convolutional neural networks to capture spatial connections in traffic feature graphs, while the S-Model uses the Transformer architecture to process and fuse network feature sets with long-range dependencies. Furthermore, we extensively study the fusion effects of these two modalities at various interaction depths to enhance detection performance. Experimental validation on the CICIDS2017 and CICIDS2018 datasets demonstrates that our method achieves multi-class accuracy rates of 93.40% and 98.50%, respectively, surpassing existing few-shot network intrusion detection methods.
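The description outlines a two-branch architecture: a CNN-based G-Model for traffic feature graphs and a Transformer-based S-Model for network feature sets, whose outputs are fused for few-shot detection. Below is a minimal, hypothetical PyTorch sketch of how such a two-branch late-fusion encoder could be wired up. The class names, input shapes (1x32x32 traffic feature graphs, 78-dimensional flow feature vectors), layer sizes, and the single late-fusion point are illustrative assumptions, not details taken from the paper, which additionally studies fusion at several interaction depths and trains on few-shot tasks.

# Illustrative sketch only: a CNN branch standing in for the G-Model, a
# Transformer branch standing in for the S-Model, and a simple late fusion of
# their embeddings. All shapes, layer sizes, and the fusion point are
# assumptions for illustration, not the paper's configuration.
import torch
import torch.nn as nn


class GModel(nn.Module):
    """CNN branch for traffic feature graphs (assumed 1-channel, 32x32 inputs)."""

    def __init__(self, embed_dim: int = 128):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.proj = nn.Linear(64 * 8 * 8, embed_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.proj(self.features(x).flatten(1))


class SModel(nn.Module):
    """Transformer branch for network feature sets (assumed 78 scalar flow features)."""

    def __init__(self, num_features: int = 78, d_model: int = 64, embed_dim: int = 128):
        super().__init__()
        # Each scalar feature becomes one token so self-attention can model
        # long-range dependencies between features.
        self.token_embed = nn.Linear(1, d_model)
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=4, dim_feedforward=128, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=2)
        self.proj = nn.Linear(d_model, embed_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        tokens = self.token_embed(x.unsqueeze(-1))   # (B, num_features, d_model)
        encoded = self.encoder(tokens).mean(dim=1)   # mean-pool over feature tokens
        return self.proj(encoded)


class FusionNet(nn.Module):
    """Late fusion of the two modality embeddings into one joint embedding."""

    def __init__(self, embed_dim: int = 128):
        super().__init__()
        self.g_model = GModel(embed_dim)
        self.s_model = SModel(embed_dim=embed_dim)
        self.fuse = nn.Sequential(nn.Linear(2 * embed_dim, embed_dim), nn.ReLU())

    def forward(self, graph: torch.Tensor, feats: torch.Tensor) -> torch.Tensor:
        return self.fuse(torch.cat([self.g_model(graph), self.s_model(feats)], dim=-1))


if __name__ == "__main__":
    net = FusionNet()
    graphs = torch.randn(4, 1, 32, 32)   # batch of traffic feature graphs
    feats = torch.randn(4, 78)           # batch of flow feature vectors
    print(net(graphs, feats).shape)      # torch.Size([4, 128])

In a few-shot setting, a joint embedding like this would typically be compared against support-set or prototype embeddings with a distance metric rather than classified by a fixed softmax head; the paper's exact episodic procedure and its study of different fusion depths are not reproduced in this sketch.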
format Article
id doaj-art-7b44d3f8954e4fe3909ee779c41a213f
institution DOAJ
issn 2045-2322
language English
publishDate 2025-07-01
publisher Nature Portfolio
record_format Article
series Scientific Reports
spelling doaj-art-7b44d3f8954e4fe3909ee779c41a213f
2025-08-20T03:03:33Z
eng
Nature Portfolio
Scientific Reports
2045-2322
2025-07-01
151123
10.1038/s41598-025-05217-4
Multimodal fusion based few-shot network intrusion detection system
Congyuan Xu 0
Yong Zhan 1
Zhiqiang Wang 2
Jun Yang 3
College of Information Science and Engineering, Jiaxing University
School of Information Science and Technology, Zhejiang Sci-Tech University
College of Information Science and Engineering, Jiaxing University
College of Information Science and Engineering, Jiaxing University
Abstract As network environments become increasingly complex and new attack methods emerge more frequently, the diversity of network attacks continues to grow. Particularly with new or rare attacks, gathering a large number of labeled samples is extremely difficult, resulting in limited training data. Existing few-shot learning methods, while reducing reliance on large datasets, mostly handle single-modality data and fail to fully exploit complementary information across different modalities, limiting detection performance. To address this challenge, we introduce a multimodal fusion based few-shot network intrusion detection method that merges traffic feature graphs and network feature sets. Tailored to these modal characteristics, we develop two models: the G-Model and the S-Model. The G-Model employs convolutional neural networks to capture spatial connections in traffic feature graphs, while the S-Model uses the Transformer architecture to process and fuse network feature sets with long-range dependencies. Furthermore, we extensively study the fusion effects of these two modalities at various interaction depths to enhance detection performance. Experimental validation on the CICIDS2017 and CICIDS2018 datasets demonstrates that our method achieves multi-class accuracy rates of 93.40% and 98.50%, respectively, surpassing existing few-shot network intrusion detection methods.
https://doi.org/10.1038/s41598-025-05217-4
Intrusion detection
Network security
Multimodal
Few-shot
Deep learning
spellingShingle Congyuan Xu
Yong Zhan
Zhiqiang Wang
Jun Yang
Multimodal fusion based few-shot network intrusion detection system
Scientific Reports
Intrusion detection
Network security
Multimodal
Few-shot
Deep learning
title Multimodal fusion based few-shot network intrusion detection system
title_full Multimodal fusion based few-shot network intrusion detection system
title_fullStr Multimodal fusion based few-shot network intrusion detection system
title_full_unstemmed Multimodal fusion based few-shot network intrusion detection system
title_short Multimodal fusion based few-shot network intrusion detection system
title_sort multimodal fusion based few shot network intrusion detection system
topic Intrusion detection
Network security
Multimodal
Few-shot
Deep learning
url https://doi.org/10.1038/s41598-025-05217-4
work_keys_str_mv AT congyuanxu multimodalfusionbasedfewshotnetworkintrusiondetectionsystem
AT yongzhan multimodalfusionbasedfewshotnetworkintrusiondetectionsystem
AT zhiqiangwang multimodalfusionbasedfewshotnetworkintrusiondetectionsystem
AT junyang multimodalfusionbasedfewshotnetworkintrusiondetectionsystem