A comprehensive case study of deep learning on the detection of alpha thalassemia and beta thalassemia using public and private datasets

Abstract This study explores the performance of deep learning models, specifically Convolutional Neural Networks (CNN) and XGBoost, in predicting alpha and beta thalassemia using both public and private datasets. Thalassemia is a genetic disorder that impairs hemoglobin production, leading to anemia...

Bibliographic Details
Main Authors: Muhammad Umar Nasir, Muhammad Tahir Naseem, Taher M. Ghazal, Muhammad Zubair, Oualid Ali, Sagheer Abbas, Munir Ahmad, Khan Muhammad Adnan
Format: Article
Language: English
Published: Nature Portfolio 2025-04-01
Series: Scientific Reports
Subjects: Thalassemia; Deep learning; CNN; XGBoost; Case study; Alpha thalassemia
Online Access: https://doi.org/10.1038/s41598-025-97353-0
author Muhammad Umar Nasir
Muhammad Tahir Naseem
Taher M. Ghazal
Muhammad Zubair
Oualid Ali
Sagheer Abbas
Munir Ahmad
Khan Muhammad Adnan
collection DOAJ
description Abstract This study explores the performance of deep learning models, specifically Convolutional Neural Networks (CNN) and XGBoost, in predicting alpha and beta thalassemia using both public and private datasets. Thalassemia is a genetic disorder that impairs hemoglobin production, leading to anemia and other health complications. Early diagnosis is essential for effective management and prevention of severe health issues. The study applied CNN and XGBoost to two case studies: one for alpha-thalassemia and the other for beta-thalassemia. Public datasets were sourced from medical databases, while private datasets were collected from clinical records, offering a more comprehensive feature set and larger sample sizes. After data preprocessing and splitting, model performance was evaluated. XGBoost achieved 99.34% accuracy on the private dataset for alpha thalassemia, while CNN reached 98.10% accuracy on the private dataset for beta-thalassemia. The superior performance on private datasets was attributed to better data quality and volume. This study highlights the effectiveness of deep learning in medical diagnostics, demonstrating that high-quality data can significantly enhance the predictive capabilities of AI models. By integrating CNN and XGBoost, this approach offers a robust method for detecting thalassemia, potentially improving early diagnosis and reducing disease-related mortality.
format Article
id doaj-art-39c6ecc8878b4d168b15c582df6de0ea
institution OA Journals
issn 2045-2322
language English
publishDate 2025-04-01
publisher Nature Portfolio
record_format Article
series Scientific Reports
spelling doaj-art-39c6ecc8878b4d168b15c582df6de0ea
2025-08-20T02:27:53Z
eng
Nature Portfolio
Scientific Reports
2045-2322
2025-04-01
volume 15, issue 1, pages 1-16
10.1038/s41598-025-97353-0
Muhammad Umar Nasir (School of Computing, IVY CMS)
Muhammad Tahir Naseem (Department of Electronic Engineering, Yeungnam University)
Taher M. Ghazal (Department of Networks and Cybersecurity, Hourani Center for Applied Scientific Research, Al-Ahliyya Amman University)
Muhammad Zubair (Department of Computer Science, Faculty of Computing, Riphah International University)
Oualid Ali (College of Arts & Science, Applied Science University)
Sagheer Abbas (Department of Computer Science, Prince Mohammad Bin Fahd University)
Munir Ahmad (Department of Computer Science, National College of Business Administration and Economics)
Khan Muhammad Adnan (Department of Software, Faculty of Artificial Intelligence and Software, Gachon University)
title A comprehensive case study of deep learning on the detection of alpha thalassemia and beta thalassemia using public and private datasets
topic Thalassemia
Deep learning
CNN
XGBoost
Case study
Alpha thalassemia
url https://doi.org/10.1038/s41598-025-97353-0