Multimodal fusion based few-shot network intrusion detection system
Abstract As network environments become increasingly complex and new attack methods emerge more frequently, the diversity of network attacks continues to grow. Particularly with new or rare attacks, gathering a large number of labeled samples is extremely difficult, resulting in limited training data. Existing few-shot learning methods, while reducing reliance on large datasets, mostly handle single-modality data and fail to fully exploit complementary information across different modalities, limiting detection performance. To address this challenge, we introduce a multimodal fusion based few-shot network intrusion detection method that merges traffic feature graphs and network feature sets. Tailored to these modal characteristics, we develop two models: the G-Model and the S-Model. The G-Model employs convolutional neural networks to capture spatial connections in traffic feature graphs, while the S-Model uses the Transformer architecture to process and fuse network feature sets with long-range dependencies. Furthermore, we extensively study the fusion effects of these two modalities at various interaction depths to enhance detection performance. Experimental validation on the CICIDS2017 and CICIDS2018 datasets demonstrates that our method achieves multi-class accuracy rates of 93.40% and 98.50%, respectively, surpassing existing few-shot network intrusion detection methods.
| Main Authors: | Congyuan Xu, Yong Zhan, Zhiqiang Wang, Jun Yang |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Nature Portfolio, 2025-07-01 |
| Series: | Scientific Reports |
| Subjects: | Intrusion detection; Network security; Multimodal; Few-shot; Deep learning |
| Online Access: | https://doi.org/10.1038/s41598-025-05217-4 |
| author | Congyuan Xu; Yong Zhan; Zhiqiang Wang; Jun Yang |
|---|---|
| affiliations | Congyuan Xu, Zhiqiang Wang, Jun Yang: College of Information Science and Engineering, Jiaxing University; Yong Zhan: School of Information Science and Technology, Zhejiang Sci-Tech University |
| collection | DOAJ |
| description | Abstract As network environments become increasingly complex and new attack methods emerge more frequently, the diversity of network attacks continues to grow. Particularly with new or rare attacks, gathering a large number of labeled samples is extremely difficult, resulting in limited training data. Existing few-shot learning methods, while reducing reliance on large datasets, mostly handle single-modality data and fail to fully exploit complementary information across different modalities, limiting detection performance. To address this challenge, we introduce a multimodal fusion based few-shot network intrusion detection method that merges traffic feature graphs and network feature sets. Tailored to these modal characteristics, we develop two models: the G-Model and the S-Model. The G-Model employs convolutional neural networks to capture spatial connections in traffic feature graphs, while the S-Model uses the Transformer architecture to process and fuse network feature sets with long-range dependencies. Furthermore, we extensively study the fusion effects of these two modalities at various interaction depths to enhance detection performance. Experimental validation on the CICIDS2017 and CICIDS2018 datasets demonstrates that our method achieves multi-class accuracy rates of 93.40% and 98.50%, respectively, surpassing existing few-shot network intrusion detection methods. |
| format | Article |
| id | doaj-art-7b44d3f8954e4fe3909ee779c41a213f |
| issn | 2045-2322 |
| language | English |
| publishDate | 2025-07-01 |
| publisher | Nature Portfolio |
| series | Scientific Reports |
| title | Multimodal fusion based few-shot network intrusion detection system |
| topic | Intrusion detection; Network security; Multimodal; Few-shot; Deep learning |
| url | https://doi.org/10.1038/s41598-025-05217-4 |
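The abstract describes a two-branch architecture: a CNN-based G-Model embeds traffic feature graphs, a Transformer-based S-Model embeds network feature sets, and the two are fused for few-shot classification. This record does not specify the fusion operator or the few-shot mechanism, so the sketch below is only illustrative: it assumes simple concatenation fusion and prototypical-network-style nearest-prototype matching, with small hand-written vectors standing in for the two models' embeddings. All function names, labels, and values here are hypothetical, not taken from the paper.

```python
import math
from collections import defaultdict

def fuse(graph_emb, seq_emb):
    # Assumed late fusion by concatenation: join the traffic-feature-graph
    # embedding (G-Model output) with the network-feature-set embedding
    # (S-Model output) into a single joint representation.
    return graph_emb + seq_emb  # list concatenation

def prototypes(support):
    # support: list of (label, fused_embedding) pairs from the few-shot
    # support set; each class prototype is the per-class mean embedding.
    sums, counts = {}, defaultdict(int)
    for label, emb in support:
        if label not in sums:
            sums[label] = [0.0] * len(emb)
        sums[label] = [s + x for s, x in zip(sums[label], emb)]
        counts[label] += 1
    return {lbl: [s / counts[lbl] for s in vec] for lbl, vec in sums.items()}

def classify(query_emb, protos):
    # Nearest-prototype decision under Euclidean distance.
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(protos, key=lambda lbl: dist(query_emb, protos[lbl]))

# Toy 2-way 2-shot episode with hypothetical embeddings.
support = [
    ("benign", fuse([0.1, 0.2], [0.0])),
    ("benign", fuse([0.2, 0.1], [0.1])),
    ("ddos",   fuse([0.9, 0.8], [1.0])),
    ("ddos",   fuse([0.8, 0.9], [0.9])),
]
protos = prototypes(support)
print(classify(fuse([0.85, 0.85], [0.95]), protos))  # -> ddos
```

In an actual implementation the two embeddings would come from trained G-Model and S-Model networks, and the paper reports studying fusion at several interaction depths rather than only this late-fusion variant.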