A Deep Learning Approach for a Source Code Detection Model Using Self-Attention

Bibliographic Details
Main Authors: Yao Meng, Long Liu
Format: Article
Language:English
Published: Wiley 2020-01-01
Series:Complexity
Online Access:http://dx.doi.org/10.1155/2020/5027198
collection DOAJ
description With the development of deep learning, many neural-network-based approaches have been proposed for code clone detection. In this paper, we propose a novel source code clone detection model, At-biLSTM, based on a bidirectional LSTM network with a self-attention layer. At-biLSTM is composed of a representation model and a discriminative model. The representation model first transforms the source code into an abstract syntax tree and splits it into a sequence of statement trees; it then encodes each statement tree with a depth-first traversal algorithm. Finally, the representation model encodes the sequence of statement vectors via a bidirectional LSTM network, a classical deep learning framework, with a self-attention layer, and outputs a vector representing the given source code. The discriminative model identifies code clones based on the vectors generated by the representation model. The proposed model retains both the syntax and the semantics of the source code during encoding, and the self-attention mechanism makes the classifier concentrate on the key statements, improving classification performance. Comparative experiments on the OJClone and BigCloneBench benchmarks indicate that At-biLSTM is effective and outperforms state-of-the-art approaches in source code clone detection.
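The representation pipeline described above can be sketched in a few lines. This is an illustrative stand-in, not the authors' implementation: Python's built-in `ast` module plays the role of a general parser, a bag of node types replaces learned embeddings, and the trained biLSTM encoder is omitted; only the statement-tree splitting, the depth-first encoding, and a toy attention weighting over statement vectors are shown.

```python
# Minimal sketch (not the authors' code) of the representation model's steps:
# parse source into a syntax tree, split into statement trees, encode each by
# depth-first traversal, then weight statements with a toy attention score.
import ast
import math
from collections import Counter

def statement_trees(source: str) -> list:
    """Split a program's AST into its sequence of top-level statement trees."""
    return list(ast.parse(source).body)

def depth_first_encode(tree: ast.AST) -> Counter:
    """Encode one statement tree as node-type counts, visited in DFS order."""
    counts: Counter = Counter()
    stack = [tree]
    while stack:
        node = stack.pop()
        counts[type(node).__name__] += 1
        # Push children in reverse so they are popped in source order.
        stack.extend(reversed(list(ast.iter_child_nodes(node))))
    return counts

def attention_weights(vectors: list, query: Counter) -> list:
    """Toy attention scoring: softmax over dot products with a query vector,
    so 'key' statements receive larger weights in the pooled code vector."""
    scores = [sum(v[k] * query[k] for k in v) for v in vectors]
    exps = [math.exp(s - max(scores)) for s in scores]  # stable softmax
    total = sum(exps)
    return [e / total for e in exps]

source = "x = 1\nfor i in range(3):\n    x += i\nprint(x)"
vectors = [depth_first_encode(t) for t in statement_trees(source)]
weights = attention_weights(vectors, Counter({"Name": 1.0}))
```

In the paper's model the per-statement vectors would come from learned embeddings fed through the biLSTM, and the attention query would itself be learned; the sketch only shows why identifier-heavy "key" statements can end up dominating the pooled representation.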
id doaj-art-a8882994dfc040c8be3d2f74a2cc9ca6
institution Kabale University
issn 1076-2787
1099-0526