CodeGuard: enhancing accuracy in detecting clones within java source code

Bibliographic Details
Main Authors: Yasir Glani, Luo Ping
Format: Article
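Language: English
Published: Frontiers Media S.A. 2024-12-01
Series: Frontiers in Computer Science
Subjects: code clone detection; clone identification; software reliability; code quality assurance; software reuse
Online Access: https://www.frontiersin.org/articles/10.3389/fcomp.2024.1455860/full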
collection DOAJ
description Detecting code clones remains challenging, particularly for Type-II clones, which have modified identifiers, and for Type-III ST and MT clones, where up to 30% and 50% of the code, respectively, is added to or removed from the original clone. To address this, we introduce CodeGuard, a technique that employs comprehensive level-by-level abstraction for Type-II clones and a flexible signature matching algorithm for the Type-III clone categories. The method requires at least 50% similarity between two corresponding chunks within the same file, ensuring accurate clone identification. Unlike recently proposed methods that are limited to clone detection, CodeGuard precisely pinpoints changes within clone files, facilitating effective debugging and thorough code analysis. Validated through comprehensive evaluations on reputable datasets, CodeGuard demonstrates superior precision, high recall, robust F1 scores, and outstanding accuracy. This methodology not only sets new performance standards in clone detection but also emphasizes the role CodeGuard can play in modern software development, paving the way for advancements in code quality and maintenance.
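The two ideas named in the description, identifier abstraction for Type-II clones and threshold-based signature matching for Type-III clones, can be illustrated with a minimal sketch. This is not CodeGuard's published algorithm: the tokenizer, the partial keyword list, the per-line signatures, and the similarity measure (shared abstracted lines divided by the longer chunk's length) are all assumptions made for illustration. Only the 50% threshold comes from the description.

```python
import re
from collections import Counter

# Deliberately partial Java keyword list (assumption for this sketch).
JAVA_KEYWORDS = {
    "int", "long", "double", "boolean", "void", "return", "if", "else",
    "for", "while", "new", "class", "public", "private", "static", "final",
}

# Identifiers/keywords, numeric literals, or any single non-space symbol.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+(?:\.\d+)?|\S")
IDENT_RE = re.compile(r"[A-Za-z_]\w*")


def abstract_line(line):
    """Replace identifiers with ID and numbers with NUM; keep keywords and symbols."""
    out = []
    for tok in TOKEN_RE.findall(line):
        if tok in JAVA_KEYWORDS:
            out.append(tok)
        elif IDENT_RE.fullmatch(tok):
            out.append("ID")
        elif tok[0].isdigit():
            out.append("NUM")
        else:
            out.append(tok)
    return " ".join(out)


def signature(chunk):
    """Hypothetical chunk signature: the abstracted form of each non-blank line."""
    return [abstract_line(line) for line in chunk.splitlines() if line.strip()]


def similarity(sig_a, sig_b):
    """Shared abstracted lines (as a multiset) over the longer signature's length."""
    shared = sum((Counter(sig_a) & Counter(sig_b)).values())
    return shared / max(len(sig_a), len(sig_b), 1)


def is_clone_candidate(chunk_a, chunk_b, threshold=0.5):
    """Apply the 50% similarity threshold mentioned in the description."""
    return similarity(signature(chunk_a), signature(chunk_b)) >= threshold


original = """int total = 0;
for (int i = 0; i < n; i++) {
    total += values[i];
}
return total;"""

# Same structure with renamed identifiers (Type-II style change).
renamed = """int sum = 0;
for (int k = 0; k < size; k++) {
    sum += data[k];
}
return sum;"""

# Renamed copy with one added statement (Type-III style change).
edited = renamed.replace("return sum;", "System.out.println(sum);\nreturn sum;")

unrelated = """String s = name.trim();
return s;"""
```

Under these assumptions, `original` and `renamed` produce identical signatures, `edited` still passes the threshold against `original`, and `unrelated` does not. A line-level multiset comparison ignores statement order, so a real detector would need a stricter alignment between chunks.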
id doaj-art-d540ef3a592f443c80281a8a6d7512a4
institution OA Journals
issn 2624-9898
topic code clone detection
clone identification
software reliability
code quality assurance
software reuse