An XGBoost Approach to Predictive Modelling of Rift Valley Fever Outbreaks in Kenya Using Climatic Factors

Reports of Rift Valley fever (RVF), a highly climate-sensitive zoonotic disease, have been rather frequent in Kenya. Although multiple empirical analyses have shown that machine learning methods outperform time series models in forecasting time series data, there is limited evidence of their applica...

Bibliographic Details
Main Authors: Damaris Mulwa, Benedicto Kazuzuru, Gerald Misinzo, Benard Bett
Format: Article
Language: English
Published: MDPI AG 2024-10-01
Series: Big Data and Cognitive Computing
Subjects: machine learning models; XGBoost; accuracy; geo-climatic variables; AUC/ROC curves
Online Access: https://www.mdpi.com/2504-2289/8/11/148
author Damaris Mulwa
Benedicto Kazuzuru
Gerald Misinzo
Benard Bett
collection DOAJ
description Reports of Rift Valley fever (RVF), a highly climate-sensitive zoonotic disease, have been frequent in Kenya. Although multiple empirical analyses have shown that machine learning methods outperform time series models in forecasting time series data, there is limited evidence of their application in predicting disease outbreaks in Africa. The recent literature reports several applications of machine learning to support decision-making in healthcare and public health. However, information on the use of the XGBoost model for predicting disease outbreaks is scarce. Among the provinces of Kenya, the incidence of Rift Valley fever was most prominent in the Rift Valley (26.80%) and Eastern (20.60%) regions. This study investigated the correlation between the occurrence of RVF and several geo-climatic variables, including humidity, clay content, elevation, slope, and rainfall. The correlation matrix revealed only a weak linear dependence between these variables and RVF cases, with the highest correlation, a mere 0.02903, observed for rainfall. The XGBoost model was trained using these variables and achieved strong performance, with an AUC of 0.8908, accuracy of 99.74%, precision of 99.75%, and recall of 99.99%. The analysis of feature importance revealed that rainfall was the most significant predictor. These findings align with previous studies demonstrating the significance of weather conditions in RVF outbreaks. The study’s results indicate that incorporating advanced machine learning models that consider several climatic variables can significantly enhance the prediction and management of RVF incidence.
format Article
id doaj-art-adafd2608203440a993e4a027a720cf4
institution OA Journals
issn 2504-2289
language English
publishDate 2024-10-01
publisher MDPI AG
record_format Article
series Big Data and Cognitive Computing
spelling doaj-art-adafd2608203440a993e4a027a720cf4
2025-08-20T02:08:08Z
eng
MDPI AG
Big Data and Cognitive Computing
2504-2289
2024-10-01
Volume 8, Issue 11, Article 148
10.3390/bdcc8110148
An XGBoost Approach to Predictive Modelling of Rift Valley Fever Outbreaks in Kenya Using Climatic Factors
Damaris Mulwa, Department of Mathematics and Statistics, College of Natural and Applied Sciences, Sokoine University of Agriculture, P.O. Box 3038, Morogoro 67152, Tanzania
Benedicto Kazuzuru, Department of Mathematics and Statistics, College of Natural and Applied Sciences, Sokoine University of Agriculture, P.O. Box 3038, Morogoro 67152, Tanzania
Gerald Misinzo, Department of Veterinary Microbiology, Parasitology and Biotechnology, College of Veterinary Medicine and Biomedical Sciences, Sokoine University of Agriculture, P.O. Box 3019, Morogoro 67152, Tanzania
Benard Bett, International Livestock Research Institute, P.O. Box 30709, Nairobi 00100, Kenya
https://www.mdpi.com/2504-2289/8/11/148
machine learning models
XGBoost
accuracy
geo-climatic variables
AUC/ROC curves
title An XGBoost Approach to Predictive Modelling of Rift Valley Fever Outbreaks in Kenya Using Climatic Factors
topic machine learning models
XGBoost
accuracy
geo-climatic variables
AUC/ROC curves
url https://www.mdpi.com/2504-2289/8/11/148