Dimensionality cutback and deep learning algorithms efficacy as to the breast cancer diagnostic dataset

Breast cancer is a significant threat because it is the most frequently diagnosed form of cancer and one of the leading causes of mortality among women. Early diagnosis and timely treatment are crucial for saving lives and reducing treatment costs. Various medical imaging techniques, such as mammogr...

Bibliographic Details
Main Authors: Gennady Chuiko, Denys Honcharov
Format: Article
Language: English
Published: National Aerospace University «Kharkiv Aviation Institute» 2024-11-01
Series: Радіоелектронні і комп'ютерні системи (Radioelectronic and Computer Systems)
Subjects:
Online Access: http://nti.khai.edu/ojs/index.php/reks/article/view/2652
collection DOAJ
description Breast cancer is a significant threat because it is the most frequently diagnosed form of cancer and one of the leading causes of mortality among women. Early diagnosis and timely treatment are crucial for saving lives and reducing treatment costs. Medical imaging techniques such as mammography, computed tomography, histopathology, and ultrasound are the contemporary approaches to detecting and classifying breast cancer. Machine learning practitioners prefer Deep Learning algorithms when analyzing large volumes of medical imaging data. However, the application of deep learning-based diagnostic methods in clinical practice remains limited despite their potential effectiveness. Deep Learning methods are complex and opaque, although their effectiveness can offset these drawbacks. Subject of research. Deep Learning algorithms implemented in the WEKA software and their efficacy on the Wisconsin Breast Cancer dataset. Objective. A significant reduction of the dataset's dimensionality without loss of predictive power. Methods. Computer experiments in the WEKA environment cover preprocessing and supervised and unsupervised Deep Learning on the full and reduced datasets, with estimates of their efficacy. Results. Triple sequential filtering notably reduced the dimensionality of the initial dataset, from 30 attributes down to four. Unexpectedly, all three Deep Learning classifiers implemented in WEKA (Dl4jMlp, Multilayer Perceptron, and Voted Perceptron) showed statistically indistinguishable performance. In addition, the performance was statistically the same for the full and reduced datasets. For example, the percentage of correctly classified instances ranged from 95.9 % to 97.7 %, with a standard deviation of less than 2.5 %. Two neuron-based clustering algorithms (Self-Organizing Map, SOM, and Learning Vector Quantization, LVQ) showed similar results. The two clusters in all datasets are not well separated, but they accurately represent both preassigned classes, with Fowlkes–Mallows index (FMI) values ranging from 0.81 to 0.99. Conclusion. The results indicate that the dimensionality of the Wisconsin Breast Cancer dataset, which is increasingly becoming the "gold standard" for diagnosing Malignant-Benign tumors, can be significantly reduced without losing predictive power. The Deep Learning algorithms in WEKA deliver excellent performance in both supervised and unsupervised learning, whether applied to the full or the reduced dataset.
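The description reports clustering quality as a Fowlkes–Mallows index (FMI). As an illustration only, the following minimal Python sketch computes the index by pair counting, FMI = TP / sqrt((TP + FP)(TP + FN)), where TP counts pairs of instances that share both class and cluster. The function name and the toy labels are assumptions made for this example; they are not taken from the article, which used WEKA rather than Python.

```python
from itertools import combinations
from math import sqrt


def fowlkes_mallows(true_labels, cluster_labels):
    """Pair-counting Fowlkes-Mallows index between two labelings."""
    if len(true_labels) != len(cluster_labels):
        raise ValueError("labelings must have the same length")
    tp = fp = fn = 0
    for i, j in combinations(range(len(true_labels)), 2):
        same_class = true_labels[i] == true_labels[j]
        same_cluster = cluster_labels[i] == cluster_labels[j]
        if same_class and same_cluster:
            tp += 1  # pair grouped together in both labelings
        elif same_cluster:
            fp += 1  # pair shares a cluster but not a class
        elif same_class:
            fn += 1  # pair shares a class but not a cluster
    if tp == 0:
        return 0.0
    return tp / sqrt((tp + fp) * (tp + fn))


# Hypothetical toy labels (M = malignant, B = benign), not data from the article.
classes = ["M", "M", "M", "B", "B", "B"]
clusters = [0, 0, 1, 1, 1, 1]
fmi = fowlkes_mallows(classes, clusters)
```

An index of 1 means the clusters reproduce the preassigned classes exactly, regardless of how the cluster identifiers are numbered; values near 0 mean the two groupings share almost no pairs.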
id doaj-art-7ee604b90c644f749beef15b9dbffb81
institution Kabale University
issn 1814-4225
2663-2012
doi 10.32620/reks.2024.4.08
affiliation Petro Mohyla Black Sea National University, Mykolaiv (both authors)
topic breast cancer
deep learning algorithms
weka
wisconsin breast cancer dataset
diagnosing malignant-benign tumors