A Robust Enhanced Ensemble Learning Method for Breast Cancer Data Diagnosis on Imbalanced Data

Bibliographic Details
Main Authors: Zhenzhen Wang, Junde Xie, Jia Zhang
Format: Article
Language: English
Published: IEEE 2024-01-01
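Series: IEEE Access
Subjects: Breast cancer diagnosis; imbalanced data classification; double-layer oversampling; random forest
Online Access: https://ieeexplore.ieee.org/document/10794777/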
author Zhenzhen Wang
Junde Xie
Jia Zhang
collection DOAJ
description Early breast cancer diagnosis is crucial for improving treatment outcomes for women. Addressing class imbalance in breast cancer data is essential for enhancing detection accuracy, yet traditional machine learning methods often overlook this imbalance, which limits their classification performance. To tackle this issue, we propose a robust enhanced ensemble learning method (REEL). Specifically, a double-level over-sampling technique is developed to increase the diversity of synthesized minority-class breast cancer samples before model training, and an improved Random Forest is proposed to balance bias and variance. In addition, a data-driven particle swarm optimization algorithm is used to automatically select the parameter values of the base classifiers. Experimental results on breast cancer datasets and 19 other imbalanced datasets show that our method outperforms other algorithms in terms of accuracy, F1 score, and AUC. These findings confirm that our method can further improve classification accuracy and has significant application value in the diagnosis of breast cancer.
format Article
id doaj-art-e886a9adebac4cea8d45576bc122f844
institution Kabale University
issn 2169-3536
language English
publishDate 2024-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj-art-e886a9adebac4cea8d45576bc122f844
2024-12-20T00:00:43Z
eng
IEEE
IEEE Access
2169-3536
2024-01-01
12
189776
189788
10.1109/ACCESS.2024.3516376
10794777
A Robust Enhanced Ensemble Learning Method for Breast Cancer Data Diagnosis on Imbalanced Data
Zhenzhen Wang (https://orcid.org/0009-0006-6930-2834)
Junde Xie (https://orcid.org/0009-0008-8697-7272)
Jia Zhang (https://orcid.org/0000-0001-5740-715X)
School of Chemistry and Chemical Engineering, Central South University of Forestry and Technology, Changsha, China
Hunan Provincial Key Laboratory of Geochemical Processes and Resource Environmental Effects, Geophysical and Geochemical Survey Institute of Hunan, Changsha, China
School of Computer and Mathematics, Central South University of Forestry and Technology, Changsha, China
https://ieeexplore.ieee.org/document/10794777/
title A Robust Enhanced Ensemble Learning Method for Breast Cancer Data Diagnosis on Imbalanced Data
topic Breast cancer diagnosis
imbalanced data classification
double-layer oversampling
random forest
url https://ieeexplore.ieee.org/document/10794777/