Personal Data Recognition Using a Deep Learning Model

Bibliographic Details
Main Author: Nikita Babak
Format: Article
Language: Russian
Published: The Fund for Promotion of Internet media, IT education, human development «League Internet Media» 2024-03-01
Series: Современные информационные технологии и IT-образование (Modern Information Technologies and IT-Education)
Online Access: http://sitito.cs.msu.ru/index.php/SITITO/article/view/1119
Description
Summary: Protecting personally identifiable information is a crucial issue today because individuals leave traces of their activities on social media and other digital platforms, and attackers can exploit these traces for identity theft and fraud. Consequently, there is a need for effective methods of personal data protection. Recognizing the personal data to be protected is itself a significant challenge, because personal data attributes such as names and phone numbers are diverse and can appear in various forms, from tables to unstructured text. A range of techniques is used for personal data recognition, with rule-based algorithms being the most widely used approach. These algorithms identify personal data using predefined rules, such as regular expressions and dictionaries. Nevertheless, such algorithms may lack the flexibility required to handle complex cases effectively. An alternative is to use deep learning models, which are trained on large datasets and can adapt to diverse forms of data. In this paper, deep learning models with different neural network architectures were implemented and compared against rule-based algorithms. Additionally, the feasibility of using a large language model for personal data recognition was explored. The research culminated in a personal data recognition method that combines an artificial intelligence language model with rule-based algorithms and can identify personal data in both structured and unstructured information. The paper underscores the importance of personal data protection and highlights the potential of artificial intelligence models to address this problem.
ISSN: 2411-1473
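
Illustration: the summary describes rule-based recognition as matching predefined rules such as regular expressions and dictionaries. The Python sketch below is a minimal example of that general idea, not the method implemented in the paper. The patterns, the name list, and the function name `find_personal_data` are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical patterns for illustration; real rule sets are far more detailed.
PHONE_RE = re.compile(r"\+?\d[\d\s()-]{8,}\d")
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

# Toy dictionary of first names (an assumption, not taken from the paper).
FIRST_NAMES = {"nikita", "anna", "ivan"}


def find_personal_data(text):
    """Return (label, matched_text) pairs found by simple rules."""
    findings = []
    # Rule type 1: regular expressions for well-structured attributes.
    for label, pattern in (("PHONE", PHONE_RE), ("EMAIL", EMAIL_RE)):
        for match in pattern.finditer(text):
            findings.append((label, match.group()))
    # Rule type 2: dictionary lookup on alphabetic tokens.
    for token in re.findall(r"[^\W\d_]+", text):
        if token.lower() in FIRST_NAMES:
            findings.append(("NAME", token))
    return findings


if __name__ == "__main__":
    sample = "Contact Anna at info@example.com or +7 (495) 123-45-67."
    for label, value in find_personal_data(sample):
        print(label, value)
```

A sketch like this also shows the limitation noted in the summary: a name missing from the dictionary, or a phone number written in an unexpected format, is simply not found, which is the kind of case a learned model is meant to handle.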