A comprehensive case study of deep learning on the detection of alpha thalassemia and beta thalassemia using public and private datasets
Abstract This study explores the performance of deep learning models, specifically Convolutional Neural Networks (CNN) and XGBoost, in predicting alpha and beta thalassemia using both public and private datasets. Thalassemia is a genetic disorder that impairs hemoglobin production, leading to anemia...
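| Main Authors: | Muhammad Umar Nasir, Muhammad Tahir Naseem, Taher M. Ghazal, Muhammad Zubair, Oualid Ali, Sagheer Abbas, Munir Ahmad, Khan Muhammad Adnan |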
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Nature Portfolio, 2025-04-01 |
| Series: | Scientific Reports |
| Subjects: | Thalassemia; Deep learning; CNN; XGBoost; Case study; Alpha thalassemia |
| Online Access: | https://doi.org/10.1038/s41598-025-97353-0 |
| Field | Value |
|---|---|
| author | Muhammad Umar Nasir; Muhammad Tahir Naseem; Taher M. Ghazal; Muhammad Zubair; Oualid Ali; Sagheer Abbas; Munir Ahmad; Khan Muhammad Adnan |
| author_sort | Muhammad Umar Nasir |
| collection | DOAJ |
| description | Abstract This study explores the performance of deep learning models, specifically Convolutional Neural Networks (CNN) and XGBoost, in predicting alpha and beta thalassemia using both public and private datasets. Thalassemia is a genetic disorder that impairs hemoglobin production, leading to anemia and other health complications. Early diagnosis is essential for effective management and prevention of severe health issues. The study applied CNN and XGBoost to two case studies: one for alpha-thalassemia and the other for beta-thalassemia. Public datasets were sourced from medical databases, while private datasets were collected from clinical records, offering a more comprehensive feature set and larger sample sizes. After data preprocessing and splitting, model performance was evaluated. XGBoost achieved 99.34% accuracy on the private dataset for alpha thalassemia, while CNN reached 98.10% accuracy on the private dataset for beta-thalassemia. The superior performance on private datasets was attributed to better data quality and volume. This study highlights the effectiveness of deep learning in medical diagnostics, demonstrating that high-quality data can significantly enhance the predictive capabilities of AI models. By integrating CNN and XGBoost, this approach offers a robust method for detecting thalassemia, potentially improving early diagnosis and reducing disease-related mortality. |
| format | Article |
| id | doaj-art-39c6ecc8878b4d168b15c582df6de0ea |
| institution | OA Journals |
| issn | 2045-2322 |
| language | English |
| publishDate | 2025-04-01 |
| publisher | Nature Portfolio |
| record_format | Article |
| series | Scientific Reports |
| spelling | doaj-art-39c6ecc8878b4d168b15c582df6de0ea; 2025-08-20T02:27:53Z; eng; Nature Portfolio; Scientific Reports; 2045-2322; 2025-04-01; vol. 15, no. 1, pp. 1-16; 10.1038/s41598-025-97353-0; A comprehensive case study of deep learning on the detection of alpha thalassemia and beta thalassemia using public and private datasets; Authors: Muhammad Umar Nasir (0), Muhammad Tahir Naseem (1), Taher M. Ghazal (2), Muhammad Zubair (3), Oualid Ali (4), Sagheer Abbas (5), Munir Ahmad (6), Khan Muhammad Adnan (7); Affiliations, in the order listed: School of Computing, IVY CMS / Department of Electronic Engineering, Yeungnam University / Department of Networks and Cybersecurity, Hourani Center for Applied Scientific Research, Al-Ahliyya Amman University / Department of Computer Science, Faculty of Computing, Riphah International University / College of Arts & Science, Applied Science University / Department of Computer Science, Prince Mohammad Bin Fahd University / Department of Computer Science, National College of Business Administration and Economics / Department of Software, Faculty of Artificial Intelligence and Software, Gachon University; https://doi.org/10.1038/s41598-025-97353-0; Keywords: Thalassemia, Deep learning, CNN, XGBoost, Case study, Alpha thalassemia |
| title | A comprehensive case study of deep learning on the detection of alpha thalassemia and beta thalassemia using public and private datasets |
| title_sort | comprehensive case study of deep learning on the detection of alpha thalassemia and beta thalassemia using public and private datasets |
| topic | Thalassemia; Deep learning; CNN; XGBoost; Case study; Alpha thalassemia |
| url | https://doi.org/10.1038/s41598-025-97353-0 |