Research on Spurious-Negative Sample Augmentation-Based Quality Evaluation Method for Cybersecurity Knowledge Graph

As the forms of cyber threats become increasingly severe, cybersecurity knowledge graphs (KGs) have become essential tools for understanding and mitigating these threats. However, the quality of the KG is critical to its effectiveness in cybersecurity applications. In this paper, we propose a spurio...

Bibliographic Details
Main Authors: Bin Chen, Hongyi Li, Ze Shi
Format: Article
Language: English
Published: MDPI AG 2024-12-01
Series: Mathematics
Subjects: cybersecurity; quality evaluation; spurious-negative triple; knowledge graphs
Online Access: https://www.mdpi.com/2227-7390/13/1/68
author Bin Chen
Hongyi Li
Ze Shi
collection DOAJ
description As the forms of cyber threats become increasingly severe, cybersecurity knowledge graphs (KGs) have become essential tools for understanding and mitigating these threats. However, the quality of the KG is critical to its effectiveness in cybersecurity applications. In this paper, we propose a spurious-negative sample augmentation-based quality evaluation method for cybersecurity KGs (SNAQE) that includes two key modules: the multi-scale spurious-negative triple detection module and the adaptive mixup based on the attention mechanism module. The multi-scale spurious-negative triple detection module classifies the sampled negative triples into spurious-negative and true-negative triples. Subsequently, the attention mechanism-based adaptive mixup module selects appropriate mixup targets for each spurious-negative triple, constructing partially correct triples and achieving more precise sample generation in the entity embedding space to assist in training the KG quality evaluation models. Through extensive experimental validation, the SNAQE model not only performs excellently in general-domain KG quality evaluation but also achieves outstanding outcomes in the cybersecurity KGs, significantly enhancing the accuracy and F1 score of the model, with the best F1 score of 0.969 achieved on the FB15K dataset.
format Article
id doaj-art-c58cea352a524fee8920205adea21bad
institution Kabale University
issn 2227-7390
language English
publishDate 2024-12-01
publisher MDPI AG
record_format Article
series Mathematics
spelling doaj-art-c58cea352a524fee8920205adea21bad
2025-01-10T13:18:09Z
eng
MDPI AG
Mathematics
2227-7390
2024-12-01
13(1), 68
10.3390/math13010068
Research on Spurious-Negative Sample Augmentation-Based Quality Evaluation Method for Cybersecurity Knowledge Graph
Bin Chen (School of Cyber Science and Technology, Beihang University, Beijing 100191, China)
Hongyi Li (School of Cyber Science and Technology, Beihang University, Beijing 100191, China)
Ze Shi (School of Cyber Science and Technology, Beihang University, Beijing 100191, China)
https://www.mdpi.com/2227-7390/13/1/68
cybersecurity
quality evaluation
spurious-negative triple
knowledge graphs
title Research on Spurious-Negative Sample Augmentation-Based Quality Evaluation Method for Cybersecurity Knowledge Graph
topic cybersecurity
quality evaluation
spurious-negative triple
knowledge graphs
url https://www.mdpi.com/2227-7390/13/1/68