Dimensionality cutback and deep learning algorithms efficacy as to the breast cancer diagnostic dataset
Breast cancer is a significant threat because it is the most frequently diagnosed form of cancer and one of the leading causes of mortality among women. Early diagnosis and timely treatment are crucial for saving lives and reducing treatment costs. Various medical imaging techniques, such as mammogr...
Main Authors: | Gennady Chuiko, Denys Honcharov |
---|---|
Format: | Article |
Language: | English |
Published: | National Aerospace University «Kharkiv Aviation Institute», 2024-11-01 |
Series: | Радіоелектронні і комп'ютерні системи |
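Subjects: | breast cancer; deep learning algorithms; weka; wisconsin breast cancer dataset; diagnosing malignant-benign tumors |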
Online Access: | http://nti.khai.edu/ojs/index.php/reks/article/view/2652 |
_version_ | 1841557725360685056 |
---|---|
author | Gennady Chuiko Denys Honcharov |
author_facet | Gennady Chuiko Denys Honcharov |
author_sort | Gennady Chuiko |
collection | DOAJ |
description | Breast cancer is a significant threat because it is the most frequently diagnosed form of cancer and one of the leading causes of mortality among women. Early diagnosis and timely treatment are crucial for saving lives and reducing treatment costs. Medical imaging techniques such as mammography, computed tomography, histopathology, and ultrasound are the contemporary approaches to detecting and classifying breast cancer. Machine learning practitioners prefer Deep Learning algorithms for analyzing large volumes of medical imaging data. However, the application of Deep Learning-based diagnostic methods in clinical practice remains limited despite their potential effectiveness. Deep Learning methods are complex and opaque, although their effectiveness can help offset these drawbacks. Research subject. Deep Learning algorithms implemented in the WEKA software and their efficacy on the Wisconsin Breast Cancer dataset. Objective. A significant reduction of the dataset's dimensionality without loss of predictive power. Methods. Computer experiments in the WEKA environment covering preprocessing and supervised and unsupervised Deep Learning on the full and reduced datasets, with estimates of their efficacy. Results. Triple sequential filtering notably reduced the dimensionality of the initial dataset, from 30 attributes down to four. Unexpectedly, all three Deep Learning classifiers implemented in WEKA (Dl4jMlp, Multilayer Perceptron, and Voted Perceptron) showed statistically indistinguishable performance. In addition, performance was statistically the same on the full and reduced datasets. For example, the percentage of correctly classified instances ranged from 95.9 % to 97.7 %, with a standard deviation of less than 2.5 %. Two neuron-based clustering algorithms (Self-Organizing Map, SOM, and Learning Vector Quantization, LVQ) showed similar results. The two clusters in all datasets are not well separated, but they accurately represent both preassigned classes, with Fowlkes–Mallows index (FMI) values ranging from 0.81 to 0.99. Conclusion. The results indicate that the dimensionality of the Wisconsin Breast Cancer dataset, which is increasingly becoming the "gold standard" for diagnosing malignant versus benign tumors, can be significantly reduced without losing predictive power. The Deep Learning algorithms in WEKA deliver excellent performance in both supervised and unsupervised learning, on both the full and reduced datasets. |
format | Article |
id | doaj-art-7ee604b90c644f749beef15b9dbffb81 |
institution | Kabale University |
issn | 1814-4225 2663-2012 |
language | English |
publishDate | 2024-11-01 |
publisher | National Aerospace University «Kharkiv Aviation Institute» |
record_format | Article |
series | Радіоелектронні і комп'ютерні системи |
spelling | doaj-art-7ee604b90c644f749beef15b9dbffb81; 2025-01-06T10:47:18Z; eng; National Aerospace University «Kharkiv Aviation Institute»; Радіоелектронні і комп'ютерні системи; 1814-4225; 2663-2012; 2024-11-01; 20244919810.32620/reks.2024.4.082357; Dimensionality cutback and deep learning algorithms efficacy as to the breast cancer diagnostic dataset; Gennady Chuiko (Petro Mohyla Black Sea National University, Mykolaiv); Denys Honcharov (Petro Mohyla Black Sea National University, Mykolaiv); http://nti.khai.edu/ojs/index.php/reks/article/view/2652; breast cancer; deep learning algorithms; weka; wisconsin breast cancer dataset; diagnosing malignant-benign tumors |
title | Dimensionality cutback and deep learning algorithms efficacy as to the breast cancer diagnostic dataset |
topic | breast cancer; deep learning algorithms; weka; wisconsin breast cancer dataset; diagnosing malignant-benign tumors |
url | http://nti.khai.edu/ojs/index.php/reks/article/view/2652 |
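
The abstract reports clustering quality as a Fowlkes–Mallows index (FMI) between the clusters found and the preassigned classes. As a rough illustration of what that number measures, the sketch below computes the FMI by counting pairs of samples: FMI = TP / sqrt((TP + FP)(TP + FN)), where TP is the number of pairs that share both a class and a cluster. This is not the authors' WEKA workflow; the function name `fowlkes_mallows` and the toy labels are assumptions made for the example.

```python
from collections import Counter
from math import comb, sqrt


def fowlkes_mallows(labels_true, labels_pred):
    """Pair-counting Fowlkes-Mallows index between classes and clusters."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have the same length")
    # TP: pairs of samples that share both the class and the cluster.
    tp = sum(comb(n, 2) for n in Counter(zip(labels_true, labels_pred)).values())
    # TP + FP: pairs of samples that share a cluster.
    same_cluster = sum(comb(n, 2) for n in Counter(labels_pred).values())
    # TP + FN: pairs of samples that share a class.
    same_class = sum(comb(n, 2) for n in Counter(labels_true).values())
    if same_cluster == 0 or same_class == 0:
        return 0.0
    return tp / sqrt(same_cluster * same_class)


# Hypothetical toy labels, not taken from the Wisconsin dataset:
# "B" = benign, "M" = malignant; one benign sample lands in the wrong cluster.
classes = ["B", "B", "B", "M", "M", "M"]
clusters = [0, 0, 1, 1, 1, 1]
print(round(fowlkes_mallows(classes, clusters), 3))
```

A value of 1.0 means the clusters reproduce the classes exactly (up to renaming of cluster labels), so the 0.81 to 0.99 range quoted in the abstract indicates close but not perfect agreement.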