Socioeconomic and demographic factors associated with anaemia among women of reproductive age in Zimbabwe: a supervised machine learning approach

Abstract Anaemia affects approximately one-third of women of reproductive age globally, with the highest burden observed in resource-limited countries. Therefore, this study aimed to determine the socioeconomic and demographic factors associated with anaemia and predict anaemia among women in Zimbab...

Bibliographic Details
Main Authors: Garikayi Chemhaka, Elliot Mbunge, Tafadzwa Dzinamarira, Godfrey Musuka, John Batani, Benhildah Muchemwa, Stephen Fashoto, Munyaradzi Mapingure, Rutendo Birri Makota, Ester Petrus
Format: Article
Language: English
Published: Springer 2025-04-01
Series: Discover Public Health
Subjects: Anaemia; Machine learning; Survey; Socioeconomic; Demographic; Logistic regression
Online Access: https://doi.org/10.1186/s12982-025-00524-7
author Garikayi Chemhaka
Elliot Mbunge
Tafadzwa Dzinamarira
Godfrey Musuka
John Batani
Benhildah Muchemwa
Stephen Fashoto
Munyaradzi Mapingure
Rutendo Birri Makota
Ester Petrus
collection DOAJ
description Abstract Anaemia affects approximately one-third of women of reproductive age globally, with the highest burden observed in resource-limited countries. Therefore, this study aimed to determine the socioeconomic and demographic factors associated with anaemia and predict anaemia among women in Zimbabwe. Using nationally representative, cross-sectional data from the 2015 Zimbabwe Demographic and Health Survey (DHS), a dataset from a sample of 5412 women of reproductive age was analyzed. The Chi-square test and multivariate logistic regression were employed to identify independent predictors of anaemia, while Elastic Net was used for feature importance scoring. To address the class imbalance, the Synthetic Minority Oversampling Technique (SMOTE) was applied. The prevalence of anaemia among women in Zimbabwe was 24.1%. Multivariate logistic regression revealed significant associations between anaemia and several factors, including older age (35–49 years) (adjusted Odds Ratio [aOR] = 1.31), marital status (being married) (aOR = 0.72), higher education (aOR = 0.47), middle household wealth (aOR = 1.32), professional occupation (aOR = 1.60), current use of modern contraceptives (aOR = 0.59), and overweight/obesity (aOR = 0.56). The highest burden was observed in Matabeleland South province (aOR = 3.44). Among prediction models, the random forest classifier outperformed K-Nearest Neighbors (KNN) and decision trees, achieving an accuracy of 74%, recall of 78%, F1-score of 75%, precision of 72%, and an Area Under the Curve (AUC) of 81.5%. Targeted interventions focusing on key socioeconomic and demographic characteristics could help reduce anaemia in women of reproductive age. Predictive models can aid healthcare practitioners in identifying at-risk individuals and implementing timely interventions to mitigate the impact of anaemia.
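The figures reported in the abstract can be cross-checked with simple arithmetic. The sketch below is illustrative only and is not the authors' code: it recomputes the F1-score from the reported precision and recall, and converts a few of the reported adjusted odds ratios into percent changes in odds. The helper names `f1_from_precision_recall` and `percent_change_in_odds` are hypothetical and introduced here for illustration.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def percent_change_in_odds(aor: float) -> float:
    """Percent change in odds relative to the reference category.

    An adjusted odds ratio of 1.0 means no change; values below 1.0 give a
    negative result (lower odds), values above 1.0 a positive one.
    """
    return (aor - 1.0) * 100.0


# Random forest metrics as reported in the abstract.
reported_precision = 0.72
reported_recall = 0.78

f1 = f1_from_precision_recall(reported_precision, reported_recall)
print(f"F1 recomputed from precision and recall: {f1:.3f}")  # prints 0.749

# A few adjusted odds ratios as reported in the abstract.
reported_aors = {
    "higher education": 0.47,
    "current use of modern contraceptives": 0.59,
    "Matabeleland South province": 3.44,
}
for factor, aor in reported_aors.items():
    change = percent_change_in_odds(aor)
    print(f"{factor}: aOR = {aor:.2f}, change in odds = {change:+.0f}%")
```

The recomputed F1 of about 0.75 agrees with the 75% stated in the abstract. The percent changes describe odds relative to each variable's reference category, not probabilities, so an aOR of 0.47 reads as roughly 53% lower odds of anaemia rather than a 53% lower risk.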
format Article
id doaj-art-85b6f8b9a4fc406f845cdbe412ae7604
institution OA Journals
issn 3005-0774
language English
publishDate 2025-04-01
publisher Springer
record_format Article
series Discover Public Health
citation Discover Public Health 22(1), 1–17 (2025-04-01)
affiliation Garikayi Chemhaka: Department of Statistics and Demography, Faculty of Social Sciences, University of Eswatini
Elliot Mbunge: Division of Research, Innovation and Engagement, Mangosuthu University of Technology
Tafadzwa Dzinamarira: School of Health Systems and Public Health, University of Pretoria
Godfrey Musuka: International Initiative for Impact Evaluation
John Batani: Faculty of Engineering and Technology, Botho University
Benhildah Muchemwa: Department of Computer Science, Faculty of Science and Engineering, University of Eswatini
Stephen Fashoto: Department of Computer Science, Faculty of Science and Engineering, University of Eswatini
Munyaradzi Mapingure: ICAP in Zimbabwe
Rutendo Birri Makota: Department of Biological Sciences and Ecology, University of Zimbabwe
Ester Petrus: Software Department, Faculty of ICT, International University of Management
title Socioeconomic and demographic factors associated with anaemia among women of reproductive age in Zimbabwe: a supervised machine learning approach
topic Anaemia
Machine learning
Survey
Socioeconomic
Demographic
Logistic regression
url https://doi.org/10.1186/s12982-025-00524-7