A novel aggregated coefficient ranking based feature selection strategy for enhancing the diagnosis of breast cancer classification using machine learning
Abstract Effective breast cancer (BC) analysis is crucial for early prognosis, controlling cancer recurrence, timely medical intervention, and determining appropriate treatment procedures. Additionally, it plays a significant role in reducing mortality rates among women with breast cancer and incr...
Main Authors: | E. Sreehari, L. D. Dhinesh Babu |
---|---|
Format: | Article |
Language: | English |
Published: | Nature Portfolio, 2025-02-01 |
Series: | Scientific Reports |
Online Access: | https://doi.org/10.1038/s41598-025-87826-7 |
_version_ | 1823862285192396800 |
---|---|
author | E. Sreehari; L. D. Dhinesh Babu |
author_facet | E. Sreehari; L. D. Dhinesh Babu |
author_sort | E. Sreehari |
collection | DOAJ |
description | Abstract Effective breast cancer (BC) analysis is crucial for early prognosis, controlling cancer recurrence, timely medical intervention, and determining appropriate treatment procedures. Additionally, it plays a significant role in reducing mortality rates among women with breast cancer and increasing the average lifespan of patients. This can be achieved by performing effective critical feature analysis of BC, picking the best features through significant ranking-based Feature Selection (FS). Various authors have developed strategies relying on a single FS method, but this approach may not yield excellent results and could lead to various consequences, including time and storage complexity issues, inaccurate results, poor decision-making, and difficult interpretation of models. Therefore, critical data analysis can facilitate the development of a robust ranking methodology for effective feature selection. To solve these problems, this paper suggests a new method called Aggregated Coefficient Ranking-based Feature Selection (ACRFS), which is based on tri-characteristic behavioral criteria. This strategy aims to significantly improve the ranking for an effective Attribute Subset Selection (ASSS). The proposed method utilized computational problem solvers such as chi-square, mutual information, correlation, and rank-dense methods. The work implemented the introduced methodology using Wisconsin-based breast cancer data and applied the Synthetic Minority Oversampling Technique (SMOTE) to the obtained data subset. Later, we employed models such as decision trees, support vector machines, k-nearest neighbors, random forests, stochastic gradient descent, and Gaussian naive Bayes to determine the type of cancer. Classification metrics such as accuracy, precision, recall, F1 score, kappa score, and Matthews coefficient were utilized to evaluate the effectiveness of the suggested ACRFS approach. The proposed method has demonstrated superior outcomes with fewer features and a minimal time complexity. |
format | Article |
id | doaj-art-251c0ba9aa6b4eef805d571122f76b02 |
institution | Kabale University |
issn | 2045-2322 |
language | English |
publishDate | 2025-02-01 |
publisher | Nature Portfolio |
record_format | Article |
series | Scientific Reports |
spelling | doaj-art-251c0ba9aa6b4eef805d571122f76b02; 2025-02-09T12:35:32Z; eng; Nature Portfolio; Scientific Reports; 2045-2322; 2025-02-01; 15; 1; 1; 17; 10.1038/s41598-025-87826-7; A novel aggregated coefficient ranking based feature selection strategy for enhancing the diagnosis of breast cancer classification using machine learning; E. Sreehari (School of Computer Science Engineering and Information Systems, Vellore Institute of Technology); L. D. Dhinesh Babu (School of Computer Science Engineering and Information Systems, Vellore Institute of Technology); https://doi.org/10.1038/s41598-025-87826-7 |
title | A novel aggregated coefficient ranking based feature selection strategy for enhancing the diagnosis of breast cancer classification using machine learning |
title_sort | novel aggregated coefficient ranking based feature selection strategy for enhancing the diagnosis of breast cancer classification using machine learning |
url | https://doi.org/10.1038/s41598-025-87826-7 |
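
The abstract in this record describes ranking features with several scorers (chi-square, mutual information, correlation) and combining them through dense ranking. The record does not give the actual ACRFS formula, so the following is only a minimal sketch of one plausible aggregation, the mean of per-scorer dense ranks, and not the authors' method. The function names (`dense_rank`, `aggregate_ranks`, `select_top_k`), the feature names, and every score value are hypothetical and chosen purely for illustration.

```python
from statistics import mean


def dense_rank(scores):
    """Dense-rank features by descending score: the best score gets rank 1,
    tied scores share a rank, and no rank numbers are skipped."""
    ordered = sorted(set(scores.values()), reverse=True)
    rank_of = {value: position + 1 for position, value in enumerate(ordered)}
    return {feature: rank_of[value] for feature, value in scores.items()}


def aggregate_ranks(score_tables):
    """Average each feature's dense rank over all scorers (lower is better).
    Assumption: a plain mean; the paper may weight or combine ranks differently."""
    per_scorer = [dense_rank(table) for table in score_tables.values()]
    features = list(next(iter(score_tables.values())))
    return {f: mean(ranks[f] for ranks in per_scorer) for f in features}


def select_top_k(aggregated, k):
    """Return the k features with the lowest aggregated rank (name breaks ties)."""
    return sorted(aggregated, key=lambda f: (aggregated[f], f))[:k]


# Made-up scores for illustration only; these are NOT results from the paper.
scores = {
    "chi_square": {"radius": 9.1, "area": 9.1, "texture": 2.3, "symmetry": 0.4},
    "mutual_info": {"radius": 0.50, "area": 0.45, "texture": 0.10, "symmetry": 0.02},
    "abs_correlation": {"radius": 0.73, "area": 0.71, "texture": 0.42, "symmetry": 0.05},
}

aggregated = aggregate_ranks(scores)
selected = select_top_k(aggregated, k=2)
print(aggregated)
print(selected)
```

With dense ranking, the tie between `radius` and `area` under the hypothetical chi-square scores gives both rank 1 and the next feature rank 2, so ties do not inflate the ranks of weaker features. In practice the three score tables would come from real scorers applied to the data set rather than from hand-written dictionaries.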