Analysis of Methods and Tools for Detecting Sensitive Information in Source Code: Issues of Accuracy and Completeness
| Main Authors: | , |
|---|---|
| Format: | Article |
| Language: | Russian |
| Published: | The Fund for Promotion of Internet media, IT education, human development «League Internet Media», 2025-04-01 |
| Series: | Современные информационные технологии и IT-образование (Modern Information Technologies and IT-Education) |
| Subjects: | |
| Online Access: | https://sitito.cs.msu.ru/index.php/SITITO/article/view/1194 |
| Summary: | In the context of widespread adoption of DevOps practices and increasing software system complexity, the issue of sensitive information leakage – such as API keys, passwords, and tokens – directly from source code and configuration files is becoming critically important. Such leaks can result in serious security incidents, financial losses, and reputational damage. This paper presents an in-depth analysis of the problem of secrets detection in code. It reviews types of secrets, typical locations of their occurrence, and the risks associated with their compromise. A detailed overview and critical evaluation of current detection methods are provided, including pattern matching with regular expressions, entropy analysis, and basic semantic analysis techniques. The principles, advantages, and significant limitations of each approach are discussed, particularly the issues of false positives (FP) and false negatives (FN). Results of comparative testing of popular open-source tools (Gitleaks, TruffleHog, DeepSecrets) on a dataset of 50 repositories demonstrate substantial variation in detection accuracy and false alert rates. The study concludes that existing approaches are insufficiently effective and highlights the need for more intelligent and accurate solutions for reliable secret detection in source code. |
| ISSN: | 2411-1473 |
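
The summary names two detection methods, pattern matching with regular expressions and entropy analysis. As a rough illustration only, the sketch below combines the two in Python. It is not taken from the paper and does not reproduce how Gitleaks, TruffleHog, or DeepSecrets work; the rule `AWS_KEY_ID`, the candidate-token pattern `TOKEN`, the helper `scan_line`, and the 4.0 bits-per-character threshold are all assumptions chosen for the example.

```python
import math
import re
from collections import Counter

# Assumed rule: "AKIA" followed by 16 uppercase letters or digits,
# the commonly documented shape of an AWS access key ID.
AWS_KEY_ID = re.compile(r"\bAKIA[0-9A-Z]{16}\b")

# Assumed candidate pattern for the entropy check: runs of 20 or more
# characters drawn from an alphabet typical of keys and tokens.
TOKEN = re.compile(r"[A-Za-z0-9+/=_\-]{20,}")


def shannon_entropy(s: str) -> float:
    """Return the Shannon entropy of s in bits per character."""
    if not s:
        return 0.0
    counts = Counter(s)
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def scan_line(line: str, threshold: float = 4.0) -> list[tuple[str, str]]:
    """Return (method, match) pairs for one line of text.

    "pattern" findings come from the fixed regular expression;
    "entropy" findings are long tokens whose entropy meets the threshold.
    """
    findings = [("pattern", m.group()) for m in AWS_KEY_ID.finditer(line)]
    for m in TOKEN.finditer(line):
        if shannon_entropy(m.group()) >= threshold:
            findings.append(("entropy", m.group()))
    return findings
```

Calling `scan_line('key = "AKIAIOSFODNN7EXAMPLE"')` reports the documentation-style key through the pattern rule only, since its entropy falls below the assumed threshold. This hints at the trade-off the summary describes: a fixed pattern misses secrets of unknown shape (false negatives), while an entropy threshold flags any sufficiently random-looking string, secret or not (false positives).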