An XGBoost Approach to Predictive Modelling of Rift Valley Fever Outbreaks in Kenya Using Climatic Factors
Reports of Rift Valley fever (RVF), a highly climate-sensitive zoonotic disease, have been rather frequent in Kenya. Although multiple empirical analyses have shown that machine learning methods outperform time series models in forecasting time series data, there is limited evidence of their applica...
| Main Authors: | Damaris Mulwa, Benedicto Kazuzuru, Gerald Misinzo, Benard Bett |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2024-10-01 |
| Series: | Big Data and Cognitive Computing |
| Subjects: | machine learning models, XGBoost, accuracy, geo-climatic variables, AUC/ROC curves |
| Online Access: | https://www.mdpi.com/2504-2289/8/11/148 |
| _version_ | 1850217160190722048 |
|---|---|
| author | Damaris Mulwa; Benedicto Kazuzuru; Gerald Misinzo; Benard Bett |
| author_facet | Damaris Mulwa; Benedicto Kazuzuru; Gerald Misinzo; Benard Bett |
| author_sort | Damaris Mulwa |
| collection | DOAJ |
| description | Reports of Rift Valley fever (RVF), a highly climate-sensitive zoonotic disease, have been frequent in Kenya. Although multiple empirical analyses have shown that machine learning methods outperform time series models in forecasting time series data, there is limited evidence of their application in predicting disease outbreaks in Africa. The recent literature reports several applications of machine learning to support decision-making in healthcare and public health. However, little has been published on the use of the XGBoost model for predicting disease outbreaks. Among the provinces of Kenya, the incidence of RVF was highest in the Rift Valley (26.80%) and Eastern (20.60%) regions. This study investigated the correlation between the occurrence of RVF and several geo-climatic variables, including humidity, clay content, elevation, slope, and rainfall. The correlation matrix revealed only a weak linear dependence between these variables and RVF cases, with the highest correlation, a mere 0.02903, observed for rainfall. The XGBoost model was trained on these variables and achieved an AUC of 0.8908, accuracy of 99.74%, precision of 99.75%, and recall of 99.99%. The analysis of feature importance revealed that rainfall was the most significant predictor. These findings align with previous studies demonstrating the significance of weather conditions in RVF outbreaks. The study’s results indicate that incorporating advanced machine learning models that consider several climatic variables can significantly enhance the prediction and management of RVF incidence. |
| format | Article |
| id | doaj-art-adafd2608203440a993e4a027a720cf4 |
| institution | OA Journals |
| issn | 2504-2289 |
| language | English |
| publishDate | 2024-10-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Big Data and Cognitive Computing |
| spelling | Record: doaj-art-adafd2608203440a993e4a027a720cf4; indexed 2025-08-20T02:08:08Z; language: eng; publisher: MDPI AG; journal: Big Data and Cognitive Computing; ISSN: 2504-2289; date: 2024-10-01; volume 8, issue 11, article 148; DOI: 10.3390/bdcc8110148; title: An XGBoost Approach to Predictive Modelling of Rift Valley Fever Outbreaks in Kenya Using Climatic Factors; authors and affiliations: Damaris Mulwa (Department of Mathematics and Statistics, College of Natural and Applied Sciences, Sokoine University of Agriculture, P.O. Box 3038, Morogoro 67152, Tanzania); Benedicto Kazuzuru (Department of Mathematics and Statistics, College of Natural and Applied Sciences, Sokoine University of Agriculture, P.O. Box 3038, Morogoro 67152, Tanzania); Gerald Misinzo (Department of Veterinary Microbiology, Parasitology and Biotechnology, College of Veterinary Medicine and Biomedical Sciences, Sokoine University of Agriculture, P.O. Box 3019, Morogoro 67152, Tanzania); Benard Bett (International Livestock Research Institute, P.O. Box 30709, Nairobi 00100, Kenya); abstract: as in the description field; URL: https://www.mdpi.com/2504-2289/8/11/148; keywords: machine learning models, XGBoost, accuracy, geo-climatic variables, AUC/ROC curves |
| title | An XGBoost Approach to Predictive Modelling of Rift Valley Fever Outbreaks in Kenya Using Climatic Factors |
| topic | machine learning models; XGBoost; accuracy; geo-climatic variables; AUC/ROC curves |
| url | https://www.mdpi.com/2504-2289/8/11/148 |
| work_keys_str_mv | AT damarismulwa anxgboostapproachtopredictivemodellingofriftvalleyfeveroutbreaksinkenyausingclimaticfactors AT benedictokazuzuru anxgboostapproachtopredictivemodellingofriftvalleyfeveroutbreaksinkenyausingclimaticfactors AT geraldmisinzo anxgboostapproachtopredictivemodellingofriftvalleyfeveroutbreaksinkenyausingclimaticfactors AT benardbett anxgboostapproachtopredictivemodellingofriftvalleyfeveroutbreaksinkenyausingclimaticfactors AT damarismulwa xgboostapproachtopredictivemodellingofriftvalleyfeveroutbreaksinkenyausingclimaticfactors AT benedictokazuzuru xgboostapproachtopredictivemodellingofriftvalleyfeveroutbreaksinkenyausingclimaticfactors AT geraldmisinzo xgboostapproachtopredictivemodellingofriftvalleyfeveroutbreaksinkenyausingclimaticfactors AT benardbett xgboostapproachtopredictivemodellingofriftvalleyfeveroutbreaksinkenyausingclimaticfactors |
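
The description above quotes four performance measures for the classifier: accuracy, precision, recall, and AUC. As a reading aid, the following is a minimal sketch of how these quantities are commonly computed for a binary classifier. It is not the authors' code. The labels, scores, 0.5 decision threshold, and all function names are hypothetical and chosen only for illustration; none of the numbers come from the study.

```python
from typing import List, Tuple


def confusion_counts(y_true: List[int], y_pred: List[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def threshold_metrics(y_true: List[int], y_pred: List[int]) -> Tuple[float, float, float]:
    """Return (accuracy, precision, recall) from hard 0/1 predictions."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


def roc_auc(y_true: List[int], scores: List[float]) -> float:
    """AUC as the probability that a randomly chosen positive example
    receives a higher score than a randomly chosen negative one
    (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical, imbalanced toy data: 8 positives and 2 negatives.
y_true = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.95, 0.7, 0.85, 0.6, 0.75, 0.65, 0.7, 0.3]

# Hard predictions at an assumed threshold of 0.5.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

accuracy, precision, recall = threshold_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)

print(f"accuracy={accuracy:.4f} precision={precision:.4f} "
      f"recall={recall:.4f} auc={auc:.4f}")
```

Accuracy, precision, and recall depend on the chosen threshold, whereas AUC summarises how well the scores rank positives above negatives across all thresholds. On imbalanced data the two kinds of measure can therefore differ noticeably, as they do in this toy example.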