Analysis of Methods and Tools for Detecting Sensitive Information in Source Code: Issues of Accuracy and Completeness

Bibliographic Details
Main Authors:	Sergey Lebed, Sophiia Ibragimova
Format:	Article
Language:	Russian
Published:	The Fund for Promotion of Internet media, IT education, human development «League Internet Media» 2025-04-01
Series:	Современные информационные технологии и IT-образование (Modern Information Technologies and IT-Education)
Subjects:	secret detection; sensitive information; source code security; static code analysis; sast; gitleaks; trufflehog; deepsecrets; regular expressions; shannon entropy; false positives; information security; devsecops
Online Access:	https://sitito.cs.msu.ru/index.php/SITITO/article/view/1194

Description
Summary:	In the context of widespread adoption of DevOps practices and increasing software system complexity, the issue of sensitive information leakage – such as API keys, passwords, and tokens – directly from source code and configuration files is becoming critically important. Such leaks can result in serious security incidents, financial losses, and reputational damage. This paper presents an in-depth analysis of the problem of secrets detection in code. It reviews types of secrets, typical locations of their occurrence, and the risks associated with their compromise. A detailed overview and critical evaluation of current detection methods are provided, including pattern matching with regular expressions, entropy analysis, and basic semantic analysis techniques. The principles, advantages, and significant limitations of each approach are discussed, particularly the issues of false positives (FP) and false negatives (FN). Results of comparative testing of popular open-source tools (Gitleaks, TruffleHog, DeepSecrets) on a dataset of 50 repositories demonstrate substantial variation in detection accuracy and false alert rates. The study concludes that existing approaches are insufficiently effective and highlights the need for more intelligent and accurate solutions for reliable secret detection in source code.
ISSN:	2411-1473

Similar Items