A precise machine learning model: Detecting cervical cancer using feature selection and explainable AI

Bibliographic Details
Main Authors: Rashiduzzaman Shakil, Sadia Islam, Bonna Akter
Format: Article
Language: English
Published: Elsevier 2024-12-01
Series: Journal of Pathology Informatics
Subjects: Cervical cancer; SMOTE; ADASYN; Chi-square; LASSO; Machine learning
Online Access: http://www.sciencedirect.com/science/article/pii/S2153353924000373
author Rashiduzzaman Shakil
Sadia Islam
Bonna Akter
collection DOAJ
description Cervical cancer remains a significant global health challenge. Because of inadequate early-stage screening and healthcare disparities, a large number of women suffer from this disease, and the mortality rate continues to rise. In this study, we present an approach that uses six machine learning models (decision tree, logistic regression, naïve Bayes, random forest, k-nearest neighbors, and support vector machine) to predict early-stage cervical cancer by analysing 36 risk-factor attributes of 858 individuals. Two data balancing techniques, Synthetic Minority Oversampling Technique (SMOTE) and Adaptive Synthetic Sampling (ADASYN), were used to mitigate data imbalance. Two feature selection methods, Chi-square and Least Absolute Shrinkage and Selection Operator (LASSO), were applied to rank the features most strongly associated with the disease, and an explainable artificial intelligence technique, Shapley Additive Explanations (SHAP), was integrated to clarify the model outcomes. The models were evaluated with the following performance metrics: accuracy, sensitivity, specificity, precision, F1-score, false-positive rate, false-negative rate, and area under the receiver operating characteristic curve. The decision tree performed best with Chi-square feature selection, achieving 97.60% accuracy, 98.73% sensitivity, 80% specificity, and 98.73% precision. Under data imbalance, the decision tree (DT) achieved 97% accuracy, 99.35% sensitivity, 69.23% specificity, and 97.45% precision. This research focuses on developing diagnostic frameworks with automated tools to improve the detection and management of cervical cancer, and on helping healthcare professionals deliver more efficient and personalized care to their patients.
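The evaluation metrics named in the description all derive from the four cells of a binary confusion matrix. The following is a minimal sketch of those standard definitions, not the authors' code: the names `ConfusionCounts` and `screening_metrics` are hypothetical, the example counts are invented for illustration and are not taken from the article, and every denominator is assumed to be non-zero.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Cells of a binary confusion matrix (hypothetical helper type)."""
    tp: int  # positives predicted as positive
    fp: int  # negatives predicted as positive
    tn: int  # negatives predicted as negative
    fn: int  # positives predicted as negative


def screening_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Return the standard metrics; assumes every denominator is non-zero."""
    sensitivity = c.tp / (c.tp + c.fn)   # true-positive rate (recall)
    specificity = c.tn / (c.tn + c.fp)   # true-negative rate
    precision = c.tp / (c.tp + c.fp)     # positive predictive value
    return {
        "accuracy": (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        # F1 is the harmonic mean of precision and sensitivity.
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
        "false_positive_rate": c.fp / (c.fp + c.tn),  # equals 1 - specificity
        "false_negative_rate": c.fn / (c.fn + c.tp),  # equals 1 - sensitivity
    }


if __name__ == "__main__":
    # Illustrative counts only; they do not reproduce the reported results.
    example = ConfusionCounts(tp=90, fp=5, tn=95, fn=10)
    for name, value in screening_metrics(example).items():
        print(f"{name}: {value:.4f}")
```

Because sensitivity and specificity are computed on different classes, a model can report high accuracy and sensitivity while specificity stays much lower, which is why the description lists them separately.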
format Article
id doaj-art-4e7b8f18f34c48d992eb3ebab085ce24
institution OA Journals
issn 2153-3539
spelling doaj-art-4e7b8f18f34c48d992eb3ebab085ce24
2025-08-20T02:35:39Z
eng
Elsevier
Journal of Pathology Informatics
2153-3539
2024-12-01
15
100398
10.1016/j.jpi.2024.100398
A precise machine learning model: Detecting cervical cancer using feature selection and explainable AI
Rashiduzzaman Shakil (0), Sadia Islam (1), Bonna Akter (2)
0: Corresponding author. Department of Computer Science and Engineering, Daffodil International University, Dhaka, Birulia 1216, Bangladesh
1: Department of Computer Science and Engineering, Daffodil International University, Dhaka, Birulia 1216, Bangladesh
2: Department of Computer Science and Engineering, Daffodil International University, Dhaka, Birulia 1216, Bangladesh
http://www.sciencedirect.com/science/article/pii/S2153353924000373
Cervical cancer; SMOTE; ADASYN; Chi-square; LASSO; Machine learning
title A precise machine learning model: Detecting cervical cancer using feature selection and explainable AI
topic Cervical cancer
SMOTE
ADASYN
Chi-square
LASSO
Machine learning
url http://www.sciencedirect.com/science/article/pii/S2153353924000373