FedITD: A Federated Parameter-Efficient Tuning With Pre-Trained Large Language Models and Transfer Learning Framework for Insider Threat Detection

Insider threats cause greater losses than external attacks, prompting organizations to invest in detection systems. However, there exist challenges: 1) Security and privacy concerns prevent data sharing, making it difficult to train robust models and identify new attacks. 2) The diversity and unique...

Full description

Saved in:
Bibliographic Details
Main Authors: Zhi Qiang Wang, Haopeng Wang, Abdulmotaleb El Saddik
Format: Article
Language:English
Published: IEEE 2024-01-01
Series:IEEE Access
Subjects:
Online Access:https://ieeexplore.ieee.org/document/10721229/
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:Insider threats cause greater losses than external attacks, prompting organizations to invest in detection systems. However, there exist challenges: 1) Security and privacy concerns prevent data sharing, making it difficult to train robust models and identify new attacks. 2) The diversity and uniqueness of organizations require localized models, as a universal solution could be more effective. 3) High resource costs, delays, and data security concerns complicate building effective detection systems. This paper introduces FedITD, a flexible, hierarchy, and federated framework with local real-time detection systems, combining Large Language Models (LLM), Federated Learning (FL), Parameter Efficient Tuning (PETuning), and Transfer Learning (TF) for insider threat detection. FedITD uses FL to protect privacy while indirect integrating client information and employs PETuning methods (Adapter, BitFit, LoRA) with LLMs (BERT, RoBERTa, XLNet, DistilBERT) to reduce resource use and time delay. FedITD customizes client models and optimizes performance via transfer learning without central data transfer, further enhancing the detection of new attacks. FedITD outperforms other federated learning methods and its performance is very close to the best centrally trained method. Extensive experiment results show FedITD’s superior performance, adaptability to varied data, and reduction of resource costs, achieving an optimal balance in detection capabilities across source data, unlabeled local data, and global data. Alternative PETuning implementations are also explored in this paper.
ISSN:2169-3536