A Two-Stage Feature Selection Approach Based on Artificial Bee Colony and Adaptive LASSO in High-Dimensional Data
High-dimensional datasets, where the number of features far exceeds the number of observations, present significant challenges in feature selection and model performance. This study proposes a novel two-stage feature-selection approach that integrates Artificial Bee Colony (ABC) optimization with Ad...
| Main Authors: | Efe Precious Onakpojeruo, Nuriye Sancar |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2024-12-01 |
| Series: | AppliedMath |
| Subjects: | feature selection; artificial bee colony; adaptive LASSO; high-dimensional data |
| Online Access: | https://www.mdpi.com/2673-9909/4/4/81 |
| _version_ | 1850050443556683776 |
|---|---|
| author | Efe Precious Onakpojeruo; Nuriye Sancar |
| author_facet | Efe Precious Onakpojeruo; Nuriye Sancar |
| author_sort | Efe Precious Onakpojeruo |
| collection | DOAJ |
| description | High-dimensional datasets, where the number of features far exceeds the number of observations, present significant challenges in feature selection and model performance. This study proposes a novel two-stage feature-selection approach that integrates Artificial Bee Colony (ABC) optimization with Adaptive Least Absolute Shrinkage and Selection Operator (AD_LASSO). The initial stage reduces dimensionality while effectively dealing with complex, high-dimensional search spaces by using ABC to conduct a global search for the ideal subset of features. The second stage applies AD_LASSO, refining the selected features by eliminating redundant features and enhancing model interpretability. The proposed ABC-ADLASSO method was compared with the AD_LASSO, LASSO, stepwise, and LARS methods under different simulation settings in high-dimensional data and various real datasets. According to the results obtained from simulations and applications on various real datasets, ABC-ADLASSO has shown significantly superior performance in terms of accuracy, precision, and overall model performance, particularly in scenarios with high correlation and a large number of features compared to the other methods evaluated. This two-stage approach offers robust feature selection and improves predictive accuracy, making it an effective tool for analyzing high-dimensional data. |
| format | Article |
| id | doaj-art-18160c3b9e964f81a4b1a2bb248cda4c |
| institution | DOAJ |
| issn | 2673-9909 |
| language | English |
| publishDate | 2024-12-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | AppliedMath |
| spelling | doaj-art-18160c3b9e964f81a4b1a2bb248cda4c; 2025-08-20T02:53:27Z; eng; MDPI AG; AppliedMath; ISSN 2673-9909; 2024-12-01; vol. 4, no. 4, pp. 1522-1538; DOI 10.3390/appliedmath4040081; A Two-Stage Feature Selection Approach Based on Artificial Bee Colony and Adaptive LASSO in High-Dimensional Data; Efe Precious Onakpojeruo (Operational Research Center in Healthcare, Near East University, TRNC Mersin 10, Nicosia 99138, Turkey); Nuriye Sancar (Department of Mathematics, Near East University, TRNC Mersin 10, Nicosia 99138, Turkey); https://www.mdpi.com/2673-9909/4/4/81; feature selection; artificial bee colony; adaptive LASSO; high-dimensional data |
| title | A Two-Stage Feature Selection Approach Based on Artificial Bee Colony and Adaptive LASSO in High-Dimensional Data |
| topic | feature selection; artificial bee colony; adaptive LASSO; high-dimensional data |
| url | https://www.mdpi.com/2673-9909/4/4/81 |
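
The two-stage idea summarized in the description (a global ABC search for a feature subset, followed by an adaptive LASSO refinement of that subset) can be illustrated with a minimal sketch. This is not the authors' implementation: the cost function `subset_cost`, the single-bit-flip neighborhood, the ridge-based adaptive weights, and every parameter value (`n_sources`, `n_cycles`, `limit`, `lam`, `gamma`, `size_penalty`) are assumptions chosen only for illustration, and the data are synthetic.

```python
import numpy as np

def ridge_fit(X, y, alpha=1.0):
    """Closed-form ridge coefficients (no intercept; data assumed centered)."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(p), X.T @ y)

def subset_cost(mask, X_tr, y_tr, X_va, y_va, size_penalty=0.01):
    """Assumed stage-1 cost: holdout MSE of a ridge fit plus a subset-size penalty."""
    if not mask.any():
        return np.inf
    beta = ridge_fit(X_tr[:, mask], y_tr)
    mse = np.mean((y_va - X_va[:, mask] @ beta) ** 2)
    return mse + size_penalty * mask.sum()

def abc_select(X_tr, y_tr, X_va, y_va, n_sources=10, n_cycles=40, limit=8, rng=None):
    """Simplified binary ABC: each food source is a boolean feature mask."""
    rng = np.random.default_rng() if rng is None else rng
    p = X_tr.shape[1]
    cost = lambda m: subset_cost(m, X_tr, y_tr, X_va, y_va)
    sources = rng.random((n_sources, p)) < 0.1        # sparse random starting masks
    costs = np.array([cost(m) for m in sources])
    trials = np.zeros(n_sources, dtype=int)
    best, best_cost = sources[costs.argmin()].copy(), costs.min()

    def try_neighbor(i):
        # Flip one random feature in source i; keep the change only if cost improves.
        cand = sources[i].copy()
        j = rng.integers(p)
        cand[j] = not cand[j]
        c = cost(cand)
        if c < costs[i]:
            sources[i], costs[i], trials[i] = cand, c, 0
        else:
            trials[i] += 1

    for _ in range(n_cycles):
        for i in range(n_sources):                    # employed-bee phase
            try_neighbor(i)
        fit = 1.0 / (1.0 + costs)                     # lower cost -> higher fitness
        probs = fit / fit.sum()
        for i in rng.choice(n_sources, size=n_sources, p=probs):  # onlooker phase
            try_neighbor(i)
        for i in np.flatnonzero(trials > limit):      # scout phase: abandon stale sources
            sources[i] = rng.random(p) < 0.1
            costs[i] = cost(sources[i])
            trials[i] = 0
        if costs.min() < best_cost:
            best, best_cost = sources[costs.argmin()].copy(), costs.min()
    return best

def adaptive_lasso(X, y, lam=0.1, gamma=1.0, n_iter=200):
    """Coordinate descent for (1/2n)||y - Xb||^2 + lam * sum_j w_j |b_j|,
    with adaptive weights w_j = 1 / |ridge estimate_j|^gamma (an assumed choice)."""
    n, p = X.shape
    w = 1.0 / (np.abs(ridge_fit(X, y)) ** gamma + 1e-8)
    beta, resid = np.zeros(p), y.astype(float).copy()
    col_sq = (X ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(p):
            rho = X[:, j] @ resid / n + col_sq[j] * beta[j]
            new = np.sign(rho) * max(abs(rho) - lam * w[j], 0.0) / col_sq[j]
            resid += X[:, j] * (beta[j] - new)        # keep residual in sync
            beta[j] = new
    return beta

# Synthetic high-dimensional example (p > n); only the first 5 features matter.
rng = np.random.default_rng(0)
n, p = 60, 200
X = rng.standard_normal((n, p))
y = X[:, :5] @ np.full(5, 3.0) + rng.standard_normal(n)
X -= X.mean(axis=0)
y -= y.mean()
tr, va = np.arange(40), np.arange(40, 60)

mask = abc_select(X[tr], y[tr], X[va], y[va], rng=rng)   # stage 1: ABC search
stage1 = np.flatnonzero(mask)
beta = adaptive_lasso(X[:, stage1], y)                    # stage 2: adaptive LASSO
final = stage1[beta != 0]
print("stage 1 kept", stage1.size, "features; stage 2 kept", final.size)
```

In this sketch the second stage can only drop features from the stage-1 subset, never add them, which mirrors the "refining the selected features" role the description assigns to AD_LASSO. In practice the penalty `lam` would be tuned (for example by cross-validation) rather than fixed.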