Mining Stack Overflow for API class recommendation using DOC2VEC and LDA

Abstract To address the lexical gaps between natural language (NL) queries and Application Programming Interface (API) documentation, and between NL queries and program code, this study developed a novel approach for recommending Java API classes that are relevant to the programming tasks descri...

Bibliographic Details
Main Authors: Wai Keat Lee, Moon Ting Su
Format: Article
Language: English
Published: Wiley 2021-10-01
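Series: IET Software
Subjects: application program interfaces; data mining; document handling; Java; learning (artificial intelligence); natural language processing
Online Access: https://doi.org/10.1049/sfw2.12023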
author Wai Keat Lee
Moon Ting Su
collection DOAJ
description Abstract To address the lexical gaps between natural language (NL) queries and Application Programming Interface (API) documentation, and between NL queries and program code, this study developed a novel approach for recommending Java API classes that are relevant to the programming tasks described in NL queries. A Doc2Vec model was trained on question titles mined from Stack Overflow and then used to find question titles that are semantically similar to a query. Latent Dirichlet Allocation (LDA) topic modelling was applied to the Java API classes (extracted from code snippets found in the accepted answers to these similar questions) to extract a single topic comprising the Top-10 Java API classes that are relevant to the query. Benchmarking the proposed approach against the state-of-the-art approaches RACK and NLP2API on four performance metrics shows that comparable API recommendation results can be produced using a less complex approach built on basic machine learning models, in particular Doc2Vec and LDA. The approach was implemented in a Java API class recommender with an Eclipse IDE plug-in serving as the front-end.
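The pipeline in the description (find similar question titles, collect the API classes from their accepted answers, return a ranked list of up to ten classes) can be illustrated with a minimal, standard-library-only Python sketch. It is not the authors' implementation: bag-of-words cosine similarity stands in for the trained Doc2Vec model, and a plain frequency ranking stands in for the single LDA topic. The toy corpus, the function names, and the parameters k and top_n are all hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical toy corpus: (question title, API classes from the accepted answer).
POSTS = [
    ("How to read a text file line by line in Java",
     ["BufferedReader", "FileReader", "IOException"]),
    ("Read file contents into a string in Java",
     ["Files", "Paths", "IOException"]),
    ("How to parse a JSON string in Java",
     ["JSONObject", "JSONArray"]),
    ("Sort a list of strings alphabetically",
     ["Collections", "ArrayList", "List"]),
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def similar_posts(query, posts, k=2):
    """Return the k posts whose titles are most similar to the query.

    Stand-in for the Doc2Vec similarity step: token-count cosine only.
    """
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(title))), i)
              for i, (title, _) in enumerate(posts)]
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [posts[i] for score, i in scored[:k] if score > 0]


def recommend(query, posts, k=2, top_n=10):
    """Rank API classes from the similar posts by how often they occur.

    Stand-in for the LDA step: frequency ranking, ties broken by name.
    """
    counts = Counter()
    for _, classes in similar_posts(query, posts, k):
        counts.update(classes)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:top_n]]


result = recommend("read a file in Java", POSTS)
print(result)
```

With this toy data the two file-reading questions are selected, so the ranking is drawn only from their accepted answers. A closer reproduction would swap in a trained Doc2Vec model for similar_posts and an LDA topic model for the ranking inside recommend.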
format Article
id doaj-art-911df404890240d3bc6325e72919fa50
institution Kabale University
issn 1751-8806
1751-8814
language English
publishDate 2021-10-01
publisher Wiley
record_format Article
series IET Software
spelling doaj-art-911df404890240d3bc6325e72919fa50 | 2025-02-03T01:29:44Z | eng | Wiley | IET Software | 1751-8806 | 1751-8814 | 2021-10-01 | 15 | 5 | 308 | 322 | 10.1049/sfw2.12023 | Mining Stack Overflow for API class recommendation using DOC2VEC and LDA | Wai Keat Lee (0): Department of Software Engineering, Faculty of Computer Science and Information Technology, University of Malaya, Kuala Lumpur, Malaysia | Moon Ting Su (1): Department of Software Engineering, Faculty of Computer Science and Information Technology, University of Malaya, Kuala Lumpur, Malaysia | https://doi.org/10.1049/sfw2.12023 | application program interfaces; data mining; document handling; Java; learning (artificial intelligence); natural language processing
title Mining Stack Overflow for API class recommendation using DOC2VEC and LDA
topic application program interfaces
data mining
document handling
Java
learning (artificial intelligence)
natural language processing
url https://doi.org/10.1049/sfw2.12023