Program semantic analysis model for code reuse detection
Program similarity analysis has a wide range of applications in areas such as code plagiarism detection and intellectual property protection, but it generally suffers from problems such as excessive computational overhead. To address this, a code similarity analysis method based on fuzzy matching and statistical inference is proposed. F...
Main Authors: | GUO Xi, WANG Pan |
---|---|
Format: | Article |
Language: | zho |
Published: | Editorial Department of Journal on Communications, 2024-12-01 |
Series: | Tongxin xuebao |
Subjects: | program analysis, fuzzy matching, statistical inference, machine learning |
Online Access: | http://www.joconline.com.cn/zh/article/doi/10.11959/j.issn.1000-436x.2024269/ |
_version_ | 1832595480699207680 |
---|---|
author | GUO Xi, WANG Pan |
collection | DOAJ |
description | Program similarity analysis has a wide range of applications in areas such as code plagiarism detection and intellectual property protection, but it generally suffers from problems such as excessive computational overhead. To address this, a code similarity analysis method based on fuzzy matching and statistical inference is proposed. For binary programs, disassembly is performed first, followed by function boundary recognition to extract the execution boundary information of each function. On this basis, dynamic programming is used to compute similarity at basic-block granularity, and a neighborhood search over the control flow graph extends the similarity analysis from the basic-block level to the function level. Finally, the semantic similarity of binary files is obtained through statistical analysis of the similar functions. During this process, the pre-trained model is optimized and its parameters are tuned to enable similarity analysis of cross-platform code. The experimental results show that the proposed method significantly improves analysis accuracy over traditional analysis tools, with an average gain of 7.1% over current mainstream analysis tools. |
format | Article |
id | doaj-art-874549224ede4749b9bd7443d27fc965 |
institution | Kabale University |
issn | 1000-436X |
language | zho |
publishDate | 2024-12-01 |
publisher | Editorial Department of Journal on Communications |
record_format | Article |
series | Tongxin xuebao |
title | Program semantic analysis model for code reuse detection |
topic | program analysis, fuzzy matching, statistical inference, machine learning |
url | http://www.joconline.com.cn/zh/article/doi/10.11959/j.issn.1000-436x.2024269/ |
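
The pipeline outlined in the description (basic-block similarity by dynamic programming, a neighborhood search over the control flow graph, then a statistical roll-up to the file level) can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not specify the scoring function, thresholds, or data layout, so the longest-common-subsequence score over instruction mnemonics, the greedy successor pairing, both threshold values, and every name below are assumptions made for illustration. The fuzzy matching and pre-trained model mentioned in the abstract are not reproduced.

```python
from collections import deque
from typing import Dict, List, Sequence, Tuple

# Hypothetical CFG layout: block id -> (instruction mnemonics, successor block ids).
Cfg = Dict[str, Tuple[List[str], List[str]]]


def lcs_length(a: Sequence[str], b: Sequence[str]) -> int:
    """Longest common subsequence length, computed row by row (dynamic programming)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def block_similarity(a: Sequence[str], b: Sequence[str]) -> float:
    """Assumed block score: LCS length normalised by the two block lengths."""
    if not a and not b:
        return 1.0
    return 2.0 * lcs_length(a, b) / (len(a) + len(b))


def function_similarity(f: Cfg, g: Cfg, entry_f: str, entry_g: str,
                        threshold: float = 0.6) -> float:
    """Grow a block matching outward from the entry pair along CFG edges."""
    matched: Dict[str, str] = {}   # block in f -> block in g
    used = set()                   # blocks of g already matched
    queue = deque([(entry_f, entry_g)])
    while queue:
        bf, bg = queue.popleft()
        if bf in matched or bg in used:
            continue
        if block_similarity(f[bf][0], g[bg][0]) < threshold:
            continue
        matched[bf] = bg
        used.add(bg)
        # Neighborhood step: pair each successor with its most similar free successor.
        for sf in f[bf][1]:
            candidates = [sg for sg in g[bg][1] if sg not in used]
            if candidates and sf not in matched:
                best = max(candidates,
                           key=lambda sg: block_similarity(f[sf][0], g[sg][0]))
                queue.append((sf, best))
    return 2.0 * len(matched) / (len(f) + len(g))


def file_similarity(funcs_a: List[Tuple[Cfg, str]], funcs_b: List[Tuple[Cfg, str]],
                    func_threshold: float = 0.8) -> float:
    """Assumed roll-up: share of functions in A with a close counterpart in B."""
    if not funcs_a:
        return 0.0
    hits = sum(
        1 for fa, ea in funcs_a
        if any(function_similarity(fa, fb, ea, eb) >= func_threshold
               for fb, eb in funcs_b)
    )
    return hits / len(funcs_a)


# Toy data: cfg_b is cfg_a with one extra "nop" and renamed blocks.
cfg_a: Cfg = {
    "entry": (["push", "mov", "cmp", "jne"], ["then", "else"]),
    "then": (["mov", "add", "jmp"], ["exit"]),
    "else": (["mov", "sub"], ["exit"]),
    "exit": (["pop", "ret"], []),
}
cfg_b: Cfg = {
    "b0": (["push", "mov", "cmp", "jne"], ["b1", "b2"]),
    "b1": (["mov", "nop", "add", "jmp"], ["b3"]),
    "b2": (["mov", "sub"], ["b3"]),
    "b3": (["pop", "ret"], []),
}
print(function_similarity(cfg_a, cfg_b, "entry", "b0"))
```

Starting the search from the entry blocks keeps the comparison local to CFG neighbors instead of scoring every block pair, which is one plausible way to limit the computational overhead the abstract mentions. A real system would also normalise operands and registers before comparison, which this sketch omits.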