SBNNR: Small-Size Bat-Optimized KNN Regression

Small datasets are frequent in some scientific fields, usually because laboratory and experimental data are difficult or costly to produce. At the same time, researchers want to apply machine learning methods to data at this scale, which in some cases yields low-performance, overfitted models; methods for dealing with this type of data are therefore needed. This research presents a new framework for regression problems with a small sample size. The base of the proposed method is the K-nearest neighbors (KNN) algorithm, with the bat optimization algorithm (BA) used for feature selection, instance selection, and hyperparameter tuning. Generative Adversarial Networks (GANs) generate synthetic data to address data sparsity, while Deep Neural Networks (DNNs) extract features from both the synthetic and real datasets. The hybrid framework thus integrates KNN, DNN, and GAN as foundational components and is optimized in multiple aspects (features, instances, and hyperparameters) by BA. The results show an improvement of up to 5% in the coefficient of determination (R² score) over a standard KNN model tuned through grid search.
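To make the optimization idea concrete, the following is a minimal sketch of the core loop: a simplified bat algorithm searching jointly over the number of neighbors k and a binary feature mask for a KNN regressor, scored by cross-validated R². This is an illustration under stated assumptions, not the authors' SBNNR implementation: the paper's instance selection, GAN-based augmentation, and DNN feature extraction are omitted, and the toy dataset, the k range of 1 to 15, the population size, and the BA constants (frequency range, loudness, pulse rate) are arbitrary choices made for the sketch.

# Simplified bat-algorithm search over KNN hyperparameters and a feature
# mask, scored by cross-validated R^2. Illustrative sketch only; not the
# SBNNR implementation described in the paper.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(0)
# Hypothetical small dataset standing in for real experimental data.
X, y = make_regression(n_samples=80, n_features=10, noise=10.0, random_state=0)

def fitness(pos):
    """Decode a real-valued bat position into (k, feature mask) and score it."""
    k = int(np.clip(round(pos[0]), 1, 15))
    mask = pos[1:] > 0.5                   # soft bits -> feature subset
    if not mask.any():                     # guard: keep at least one feature
        mask[0] = True
    model = KNeighborsRegressor(n_neighbors=k)
    return cross_val_score(model, X[:, mask], y, cv=5, scoring="r2").mean()

dim = 1 + X.shape[1]                       # position = [k, feature bits...]
n_bats, n_iter = 15, 40
pos = rng.uniform(0, 1, (n_bats, dim))
pos[:, 0] = rng.uniform(1, 15, n_bats)     # initialize the k dimension
vel = np.zeros((n_bats, dim))
loud, pulse = 0.9, 0.5                     # loudness A and pulse rate r
scores = np.array([fitness(p) for p in pos])
best = pos[scores.argmax()].copy()
best_score = scores.max()

for _ in range(n_iter):
    freq = rng.uniform(0, 2, (n_bats, 1))  # per-bat frequency f
    vel += (pos - best) * freq             # canonical BA velocity update
    cand = pos + vel
    # Occasionally replace a candidate with a local random walk around the best.
    local = rng.random(n_bats) > pulse
    cand[local] = best + 0.05 * rng.standard_normal((local.sum(), dim))
    for i in range(n_bats):
        s = fitness(cand[i])
        if s > scores[i] and rng.random() < loud:  # accept improving moves
            pos[i], scores[i] = cand[i], s
            if s > best_score:
                best, best_score = cand[i].copy(), s

k_best = int(np.clip(round(best[0]), 1, 15))
print(f"best k={k_best}, features={np.flatnonzero(best[1:] > 0.5)}, R2={best_score:.3f}")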

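For reference, the baseline the abstract compares against (standard KNN tuned by exhaustive grid search) can be sketched with scikit-learn's GridSearchCV; the parameter grid below is an illustrative assumption, not the paper's search space.

# Grid-search KNN baseline on the same toy data as the sketch above.
from sklearn.datasets import make_regression
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsRegressor

X, y = make_regression(n_samples=80, n_features=10, noise=10.0, random_state=0)

grid = GridSearchCV(
    KNeighborsRegressor(),
    param_grid={"n_neighbors": list(range(1, 16)),
                "weights": ["uniform", "distance"]},  # assumed grid
    scoring="r2",
    cv=5,
)
grid.fit(X, y)
print(f"grid-search KNN: k={grid.best_params_['n_neighbors']}, R2={grid.best_score_:.3f}")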
Bibliographic Details
Main Authors: Rasool Seyghaly, Jordi Garcia, Xavi Masip-Bruin (Advanced Network Architectures Laboratory (CRAAX), Universitat Politècnica de Catalunya (UPC) BarcelonaTECH, 08800 Vilanova, Spain); Jovana Kuljanin (Aeronautical Division, Universitat Politècnica de Catalunya BarcelonaTECH, 08034 Barcelona, Spain)
Format: Article
Language: English
Published: MDPI AG, 2024-11-01
Series: Future Internet, Vol. 16, Iss. 11, Art. 422
ISSN: 1999-5903
DOI: 10.3390/fi16110422
Subjects: regression; K-nearest neighbor; bat algorithm; instance selection; feature selection
Online Access: https://www.mdpi.com/1999-5903/16/11/422