Dimensionality cutback and deep learning algorithms efficacy as to the breast cancer diagnostic dataset

Bibliographic Details
Main Authors:	Gennady Chuiko, Denys Honcharov
Format:	Article
Language:	English
Published:	National Aerospace University «Kharkiv Aviation Institute» 2024-11-01
Series:	Радіоелектронні і комп'ютерні системи
Subjects:	breast cancer; deep learning algorithms; weka; wisconsin breast cancer dataset; diagnosing malignant-benign tumors
Online Access:	http://nti.khai.edu/ojs/index.php/reks/article/view/2652

collection	DOAJ
description	Breast cancer is a significant threat: it is the most frequently diagnosed form of cancer and one of the leading causes of mortality among women. Early diagnosis and timely treatment are crucial for saving lives and reducing treatment costs. Medical imaging techniques such as mammography, computed tomography, histopathology, and ultrasound are the contemporary approaches to detecting and classifying breast cancer. Machine learning practitioners prefer deep learning algorithms when analyzing large volumes of medical imaging data. However, the use of deep learning-based diagnostic methods in clinical practice remains limited despite their potential effectiveness. Deep learning methods are complex and opaque, although their effectiveness can help offset these drawbacks. Research subject. Deep learning algorithms implemented in the WEKA software and their efficacy on the Wisconsin Breast Cancer dataset. Objective. A significant reduction of the dataset's dimensionality without loss of predictive power. Methods. Computer experiments in the WEKA environment covering preprocessing and supervised and unsupervised deep learning on the full and reduced datasets, with estimates of their efficacy. Results. Triple sequential filtering notably reduced the dimensionality of the initial dataset, from 30 attributes down to four. Unexpectedly, all three deep learning classifiers implemented in WEKA (Dl4jMlp, Multilayer Perceptron, and Voted Perceptron) showed statistically indistinguishable performance. In addition, performance was statistically the same for the full and reduced datasets. For example, the percentage of correctly classified instances ranged from 95.9 % to 97.7 %, with a standard deviation of less than 2.5 %. Two neuron-based clustering algorithms (Self-Organizing Map, SOM, and Learning Vector Quantization, LVQ) showed similar results. The two clusters are not well separated in any of the datasets, but they accurately represent both preassigned classes, with Fowlkes–Mallows index (FMI) values ranging from 0.81 to 0.99. Conclusion. The results indicate that the dimensionality of the Wisconsin Breast Cancer dataset, which is increasingly becoming the "gold standard" for diagnosing malignant versus benign tumors, can be significantly reduced without losing predictive power. The deep learning algorithms in WEKA deliver excellent performance for both supervised and unsupervised learning, on the full and reduced datasets alike.
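The abstract reports cluster quality through the Fowlkes–Mallows index without stating how it is computed. As a reading aid, the standard pair-counting definition, FMI = TP / sqrt((TP + FP)(TP + FN)), can be sketched as below. This is a minimal illustration and not the authors' WEKA workflow; the function name `fowlkes_mallows` and the toy labels are assumptions introduced here.

```python
from itertools import combinations
from math import sqrt


def fowlkes_mallows(labels_true, labels_pred):
    """Fowlkes-Mallows index via pair counting (hypothetical helper).

    TP: pairs in the same class and the same cluster.
    FP: pairs in the same cluster but different classes.
    FN: pairs in the same class but different clusters.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have equal length")
    tp = fp = fn = 0
    for i, j in combinations(range(len(labels_true)), 2):
        same_class = labels_true[i] == labels_true[j]
        same_cluster = labels_pred[i] == labels_pred[j]
        if same_class and same_cluster:
            tp += 1
        elif same_cluster:
            fp += 1
        elif same_class:
            fn += 1
    if tp == 0:
        return 0.0  # no agreeing pairs; also avoids division by zero
    return tp / sqrt((tp + fp) * (tp + fn))


# Toy data (made up for illustration): three malignant, three benign
# cases, with one malignant case placed in the "benign" cluster.
example_true = ["M", "M", "M", "B", "B", "B"]
example_pred = [0, 0, 1, 1, 1, 1]
example_fmi = fowlkes_mallows(example_true, example_pred)
```

The index depends only on which instances share a cluster, not on the cluster labels themselves, so it suits comparing unsupervised output such as SOM or LVQ clusters against preassigned classes. A value of 1 means the clustering reproduces the classes exactly.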
id	doaj-art-7ee604b90c644f749beef15b9dbffb81
institution	Kabale University
issn	1814-4225 2663-2012
spelling	doaj-art-7ee604b90c644f749beef15b9dbffb81 | 2025-01-06T10:47:18Z | eng | DOI 10.32620/reks.2024.4.08 | Gennady Chuiko (Petro Mohyla Black Sea National University, Mykolaiv) | Denys Honcharov (Petro Mohyla Black Sea National University, Mykolaiv)
title	Dimensionality cutback and deep learning algorithms efficacy as to the breast cancer diagnostic dataset

Similar Items