Predicting neonatal mortality using ensemble machine learning algorithms in the case of Ethiopian Rural Areas
Abstract. Background: Each year, approximately 2.5 million newborns die globally, with developing countries bearing the brunt of this crisis. Sub-Saharan Africa has the highest neonatal mortality rate, with Ethiopia facing alarmingly high figures, particularly in rural areas where mortality is signif...
| Main Authors: | Melaku Alelign Mengstie, Misganaw Telake Telele |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Springer, 2025-08-01 |
| Series: | Discover Artificial Intelligence |
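| Subjects: | Ensemble algorithms; Healthcare inequity; Maternal health; Neonatal mortality; Rural health disparities; SHAP |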
| Online Access: | https://doi.org/10.1007/s44163-025-00305-w |
| _version_ | 1849226050284814336 |
|---|---|
| author | Melaku Alelign Mengstie; Misganaw Telake Telele |
| author_facet | Melaku Alelign Mengstie; Misganaw Telake Telele |
| author_sort | Melaku Alelign Mengstie |
| collection | DOAJ |
| description | Abstract. Background: Each year, approximately 2.5 million newborns die globally, with developing countries bearing the brunt of this crisis. Sub-Saharan Africa has the highest neonatal mortality rate, with Ethiopia facing alarmingly high figures, particularly in rural areas where mortality is significantly higher due to poor healthcare access and socio-economic challenges. Methods: This study aimed to develop a predictive model for neonatal mortality in rural Ethiopia using secondary data from the Ethiopian Demographic and Health Surveys (2000–2019). The dataset included 29,048 instances and 22 relevant features and was preprocessed to handle missing values and to balance the class distribution using the Synthetic Minority Oversampling Technique (SMOTE). Several ensemble machine-learning algorithms, including Random Forest, Gradient Boosting, Extreme Gradient Boosting, Light Gradient Boosting, and CatBoost, were applied to build the model. Additionally, logistic regression was employed to enhance transparency and interpretability and to serve as a comparison. Model performance was evaluated based on accuracy, precision, recall, F1 score, and the area under the Receiver Operating Characteristic curve (ROC-AUC). Results: Among the algorithms tested, CatBoost (categorical boosting) achieved the highest performance, with 97.5% accuracy, 97.52% precision, 97.5% recall, 97.5% F1 score, and a ROC-AUC of 99.57%. Key risk factors identified include BCG vaccination status, the number of under-five children in the household, recent diarrhea episodes, and iron tablet intake during pregnancy. Community health workers provided valuable feedback on these factors, helping to refine the understanding of their impact on neonatal mortality. Conclusions: This study developed an effective predictive model for neonatal mortality in rural Ethiopia, providing actionable insights for targeted interventions. The model underscores the importance of improving healthcare access, maternal health, and policy reforms, with the potential to reduce neonatal mortality through mobile health apps and policymaker collaboration. |
| format | Article |
| id | doaj-art-315de4b3e56c4c8fb4053feb138c00ea |
| institution | Kabale University |
| issn | 2731-0809 |
| language | English |
| publishDate | 2025-08-01 |
| publisher | Springer |
| record_format | Article |
| series | Discover Artificial Intelligence |
| spelling | doaj-art-315de4b3e56c4c8fb4053feb138c00ea; 2025-08-24T11:40:00Z; eng; Springer; Discover Artificial Intelligence; 2731-0809; 2025-08-01; 51118; 10.1007/s44163-025-00305-w; Predicting neonatal mortality using ensemble machine learning algorithms in the case of Ethiopian Rural Areas; Melaku Alelign Mengstie (0); Misganaw Telake Telele (1); Department of Information Science, College of Informatics, University of Gondar; Department of Information Science, College of Informatics, University of Gondar; Abstract. Background: Each year, approximately 2.5 million newborns die globally, with developing countries bearing the brunt of this crisis. Sub-Saharan Africa has the highest neonatal mortality rate, with Ethiopia facing alarmingly high figures, particularly in rural areas where mortality is significantly higher due to poor healthcare access and socio-economic challenges. Methods: This study aimed to develop a predictive model for neonatal mortality in rural Ethiopia using secondary data from the Ethiopian Demographic and Health Surveys (2000–2019). The dataset included 29,048 instances and 22 relevant features and was preprocessed to handle missing values and to balance the class distribution using the Synthetic Minority Oversampling Technique (SMOTE). Several ensemble machine-learning algorithms, including Random Forest, Gradient Boosting, Extreme Gradient Boosting, Light Gradient Boosting, and CatBoost, were applied to build the model. Additionally, logistic regression was employed to enhance transparency and interpretability and to serve as a comparison. Model performance was evaluated based on accuracy, precision, recall, F1 score, and the area under the Receiver Operating Characteristic curve (ROC-AUC). Results: Among the algorithms tested, CatBoost (categorical boosting) achieved the highest performance, with 97.5% accuracy, 97.52% precision, 97.5% recall, 97.5% F1 score, and a ROC-AUC of 99.57%. Key risk factors identified include BCG vaccination status, the number of under-five children in the household, recent diarrhea episodes, and iron tablet intake during pregnancy. Community health workers provided valuable feedback on these factors, helping to refine the understanding of their impact on neonatal mortality. Conclusions: This study developed an effective predictive model for neonatal mortality in rural Ethiopia, providing actionable insights for targeted interventions. The model underscores the importance of improving healthcare access, maternal health, and policy reforms, with the potential to reduce neonatal mortality through mobile health apps and policymaker collaboration.; https://doi.org/10.1007/s44163-025-00305-w; Ensemble algorithms; Healthcare inequity; Maternal health; Neonatal mortality; Rural health disparities; SHAP |
| spellingShingle | Melaku Alelign Mengstie; Misganaw Telake Telele; Predicting neonatal mortality using ensemble machine learning algorithms in the case of Ethiopian Rural Areas; Discover Artificial Intelligence; Ensemble algorithms; Healthcare inequity; Maternal health; Neonatal mortality; Rural health disparities; SHAP |
| title | Predicting neonatal mortality using ensemble machine learning algorithms in the case of Ethiopian Rural Areas |
| title_sort | predicting neonatal mortality using ensemble machine learning algorithms in the case of ethiopian rural areas |
| topic | Ensemble algorithms; Healthcare inequity; Maternal health; Neonatal mortality; Rural health disparities; SHAP |
| url | https://doi.org/10.1007/s44163-025-00305-w |
| work_keys_str_mv | AT melakualelignmengstie predictingneonatalmortalityusingensemblemachinelearningalgorithmsinthecaseofethiopianruralareas AT misganawtelaketelele predictingneonatalmortalityusingensemblemachinelearningalgorithmsinthecaseofethiopianruralareas |
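
The description field above reports accuracy, precision, recall, and F1 score. As a reading aid, the following minimal Python sketch shows how those four metrics are conventionally derived from a binary confusion matrix. It is not the authors' code: the `ConfusionCounts` class, the `classification_metrics` function, and the example counts are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts (hypothetical values, not study data)."""
    tp: int  # true positives: positive cases predicted positive
    fp: int  # false positives: negative cases predicted positive
    fn: int  # false negatives: positive cases predicted negative
    tn: int  # true negatives: negative cases predicted negative


def classification_metrics(c: ConfusionCounts) -> dict:
    """Return accuracy, precision, recall, and F1 for one confusion matrix.

    Each ratio falls back to 0.0 when its denominator is zero.
    """
    total = c.tp + c.fp + c.fn + c.tn
    accuracy = (c.tp + c.tn) / total if total else 0.0
    precision = c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0
    recall = c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


if __name__ == "__main__":
    # Illustrative counts only.
    counts = ConfusionCounts(tp=90, fp=10, fn=5, tn=95)
    for name, value in classification_metrics(counts).items():
        print(f"{name}: {value:.4f}")
```

On a balanced test set, accuracy and F1 tend to track each other closely, which is consistent with the near-identical figures reported in the abstract. ROC-AUC is omitted here because it requires predicted scores rather than a single confusion matrix.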