Sorghum yield prediction based on remote sensing and machine learning in conflict affected South Sudan

Abstract Sorghum cultivation plays a pivotal role in addressing food insecurity in South Sudan, but persistent conflict continues to impose challenges on the agriculture sector; understanding the impact of conflict on sorghum yield prediction is therefore important for the country's food security. This res...

Bibliographic Details
Main Authors: John Karongo, Joseph Ivivi Mwaniki, John Ndiritu, Victor Mokaya
Format: Article
Language: English
Published: Nature Portfolio 2025-02-01
Series: Scientific Reports
Subjects: Self-declared sorghum yield; Conflict; Machine learning; Sorghum bicolor; Remote sensing
Online Access: https://doi.org/10.1038/s41598-025-89030-z
author John Karongo
Joseph Ivivi Mwaniki
John Ndiritu
Victor Mokaya
collection DOAJ
description Abstract Sorghum cultivation plays a pivotal role in addressing food insecurity in South Sudan, but persistent conflict continues to impose challenges on the agriculture sector; understanding the impact of conflict on sorghum yield prediction is therefore important for the country's food security. This research integrates several data sources, including sorghum yield from small-scale farmers over four agricultural seasons (2018–2021), climate data, remotely sensed data, and conflict occurrence probability, to predict sorghum yield in South Sudan. We use five Machine Learning (ML) techniques, namely Random Forest (RF), Decision Tree (DT), Extreme Gradient Boosting (XGBoost), Support Vector Machine (SVM) and Artificial Neural Network (ANN), to predict 2021 end-of-season sorghum yield in the conflict-affected Upper Nile and Western Bahr El Gazal states. Correlation analysis revealed high variability in yield across the two states, with an average sorghum yield of 366.03 kg/ha (SD = 292.29 kg/ha), and a strong positive correlation (0.75, p < 0.001) between cultivated land size and sorghum yield. During the training phase, the DT, RF, XGBoost and ANN models showed high accuracy, each with an R² > 70%. DT and XGBoost both had an accuracy close to 80% and lower prediction error. When predicting 2021 sorghum yield, the XGBoost, DT and RF models yielded the best combination of metrics, with good accuracy. Our results reveal that adding conflict occurrence probability data to the models, while complex, had minimal impact on yield predictions. Further analysis revealed that cultivated land size was the most significant predictor for all the models. This paper demonstrates that, despite ongoing conflict, reasonably good end-of-season sorghum yield prediction with relevant food security planning implications can be achieved with ML, but challenges remain in generalizing these results due to limited crop data and regional variability in South Sudan.
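The abstract reports model quality as R² and prediction error without stating the formulas. As a reading aid, the minimal sketch below shows how R² and root mean squared error (RMSE) are conventionally computed for predicted versus observed yields. It assumes the standard definitions; the article may define "accuracy" differently. The function names and the four yield values are hypothetical and are not data from the article.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error, in the same units as the inputs (e.g. kg/ha)."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(observed))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_observed = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    return 1 - ss_res / ss_tot


# Hypothetical plot-level yields in kg/ha, for illustration only.
observed = [120.0, 300.0, 450.0, 800.0]
predicted = [150.0, 280.0, 500.0, 700.0]

print(f"R^2  = {r_squared(observed, predicted):.3f}")
print(f"RMSE = {rmse(observed, predicted):.1f} kg/ha")
```

Under these definitions, an R² reported as "> 70%" corresponds to a value above 0.70, meaning the model explains more than 70% of the variance in observed yield.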
format Article
id doaj-art-2a3e35dc910e457da94938723a5a1d0b
institution Kabale University
issn 2045-2322
language English
publishDate 2025-02-01
publisher Nature Portfolio
record_format Article
series Scientific Reports
spelling doaj-art-2a3e35dc910e457da94938723a5a1d0b
2025-02-09T12:34:39Z
eng
Nature Portfolio
Scientific Reports, ISSN 2045-2322
2025-02-01, vol. 15, no. 1, pp. 1-16
10.1038/s41598-025-89030-z
Sorghum yield prediction based on remote sensing and machine learning in conflict affected South Sudan
John Karongo (International Committee of the Red Cross, ICRC, Regional Delegation)
Joseph Ivivi Mwaniki (School of Mathematics, University of Nairobi)
John Ndiritu (School of Mathematics, University of Nairobi)
Victor Mokaya (CSV Research ltd)
https://doi.org/10.1038/s41598-025-89030-z
Self-declared sorghum yield
Conflict
Machine learning
Sorghum bicolor
Remote sensing
title Sorghum yield prediction based on remote sensing and machine learning in conflict affected South Sudan
topic Self-declared sorghum yield
Conflict
Machine learning
Sorghum bicolor
Remote sensing
url https://doi.org/10.1038/s41598-025-89030-z