CodeGuard: enhancing accuracy in detecting clones within java source code

Detecting code clones remains challenging, particularly for Type-II clones, with modified identifiers, and Type-III ST and MT clones, where up to 30% and 50% of code, respectively, are added or removed from the original clone code. To address this, we introduce CodeGuard, an innovative technique tha...

Full description

Saved in:
Bibliographic Details
Main Authors: Yasir Glani, Luo Ping
Format: Article
Language:English
Published: Frontiers Media S.A. 2024-12-01
Series:Frontiers in Computer Science
Subjects:
Online Access:https://www.frontiersin.org/articles/10.3389/fcomp.2024.1455860/full
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:Detecting code clones remains challenging, particularly for Type-II clones, with modified identifiers, and Type-III ST and MT clones, where up to 30% and 50% of code, respectively, are added or removed from the original clone code. To address this, we introduce CodeGuard, an innovative technique that employs comprehensive level-by-level abstraction for Type-II clones and a flexible signature matching algorithm for Type-III clone categories. This method requires at least 50% similarity within two corresponding chunks within the same file, ensuring accurate clone identification. Unlike recently proposed methods limited to clone detection, CodeGuard precisely pinpoints changes within clone files, facilitating effective debugging and thorough code analysis. It is validated through comprehensive evaluations using reputable datasets, CodeGuard demonstrates superior precision, high recall, robust F1 scores, and outstanding accuracy. This innovative methodology not only sets new performance standards in clone detection but also emphasizes the role CodeGuard's can play in modern software development, paving the way for advancements in code quality and maintenance.
ISSN:2624-9898